Release Notes


Add to my manuals
103 Pages

advertisement

Release Notes | Manualzz

NVIDIA CUDA Toolkit v5.0 Release Notes

N=`expr $N3D + $NVGA - 1`

for i in `seq 0 $N`; do

mknod -m 666 /dev/nvidia$i c 195 $i;

done

mknod -m 666 /dev/nvidiactl c 195 255 else

exit 1 fi

On some Linux releases, due to a GRUB bug in the handling of upper memory and a default

vmalloc

too small on 32-bit systems, it may be necessary to pass this information to the bootloader: vmalloc=256MB, uppermem=524288

Here is an example of GRUB conf: title Red Hat Desktop (2.6.9-42.ELsmp) root (hd0,0) uppermem 524288 kernel /vmlinuz-2.6.9-42.ELsmp ro root=LABEL=/1 rhgb quiet vmalloc=256MB pci=nommconf initrd /initrd-2.6.9-42.ELsmp.img

2.7. New Features

2.7.1. General CUDA

Support compatibility between CUDA driver and CUDA toolkit is as follows:

Any

nvcc

generated PTX code is forward compatible to newer GPU architectures. This means any CUDA binaries that include PTX code will continue to run on newer GPUs and newer CUDA drivers released from

NVIDIA; as the PTX code gets JIT compiled at runtime to the newer GPU architecture.

CUDA drivers are backward compatible with CUDA toolkit. This means systems can be upgraded to newer drivers independent of upgrading to newer toolkit. Apps built using old toolkit will load and run with the newer drivers however if they require PTX JIT compilation to run on a newer GPU architecture

(SM version) then such apps cannot be used with CUDA tools from old toolkit.

Any JIT compiled code implies using the newer compiler and thus a new ABI which requires upgrading to the matching newer toolkit and associated tools.

Any separately compiled NVCC binaries (enabled in 5.0) require that all device objects follow the same ABI, and must target the same GPU architecture (SM version). Any CUDA tools usage on these binaries must match the associated toolkit version of the compiler.

The CUDA 4.2 toolkit for sm_30 implicitly increased a -maxrregcount that was less than 32 to 32. The CUDA 5.0 toolkit does not implicitly increase the -maxrregcount unless it is less than 16 (because the ABI requires at least 16 registers). Note that 32 is the "best minimum" for sm_3x, and the libcublas_device library is compiled for 32 registers.

www.nvidia.com

NVIDIA CUDA Toolkit v5.5

RN-06722-001 _v5.5 | 28

advertisement

Was this manual useful for you? Yes No
Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Related manuals

advertisement

Table of contents