Release Notes

NVIDIA CUDA Toolkit v5.0 Release Notes

N=`expr $N3D + $NVGA - 1`
for i in `seq 0 $N`; do
  mknod -m 666 /dev/nvidia$i c 195 $i
done

mknod -m 666 /dev/nvidiactl c 195 255

else
  exit 1
fi
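
The loop above depends on `N3D` and `NVGA`, which are set earlier in the script and are not shown in this section. As a minimal sketch, assuming they are the counts of "3D controller" and "VGA compatible controller" lines that `lspci` reports for NVIDIA devices, the highest device index could be derived as follows. The function name `nvidia_max_index` is hypothetical.

```shell
# Hypothetical helper: reads lspci-style text on stdin and prints the
# highest /dev/nvidiaN index (device count minus one).
# Assumption: NVIDIA GPUs show up as "3D controller" or
# "VGA compatible controller" lines that mention NVIDIA.
nvidia_max_index() {
    nvdevs=$(grep -i nvidia || true)
    n3d=$(printf '%s\n' "$nvdevs" | grep -c "3D controller" || true)
    nvga=$(printf '%s\n' "$nvdevs" | grep -c "VGA compatible controller" || true)
    echo $((n3d + nvga - 1))
}
```

On a real system this would be called as `lspci | nvidia_max_index`. With no matching devices it prints -1, so a loop over `seq 0 -1` creates no device nodes.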

‣ On some Linux releases, due to a GRUB bug in the handling of upper memory and a default vmalloc that is too small on 32-bit systems, it may be necessary to pass this information to the bootloader: vmalloc=256MB, uppermem=524288

Here is an example of GRUB conf:

title Red Hat Desktop (2.6.9-42.ELsmp)
root (hd0,0)
uppermem 524288
kernel /vmlinuz-2.6.9-42.ELsmp ro root=LABEL=/1 rhgb quiet vmalloc=256MB pci=nommconf
initrd /initrd-2.6.9-42.ELsmp.img
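
After a reboot, one way to confirm that the kernel received the `vmalloc` setting is to inspect the kernel command line. The following is a minimal sketch, assuming a Linux system that exposes `/proc/cmdline`; the function name `cmdline_vmalloc` is hypothetical.

```shell
# Hypothetical helper: prints the value of vmalloc= found in a kernel
# command line string, or nothing if the parameter is absent.
cmdline_vmalloc() {
    for word in $1; do
        case "$word" in
            vmalloc=*) echo "${word#vmalloc=}" ;;
        esac
    done
}

# Usage on a running system (assumes /proc/cmdline is readable):
#   cmdline_vmalloc "$(cat /proc/cmdline)"
```

Note that `uppermem` is a GRUB directive on its own line in the example, not a kernel parameter, so it would not appear in the kernel command line.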

2.7. New Features

2.7.1. General CUDA

‣ Compatibility between the CUDA driver and the CUDA toolkit is as follows:

  ‣ Any nvcc-generated PTX code is forward compatible with newer GPU architectures. This means any CUDA binaries that include PTX code will continue to run on newer GPUs and newer CUDA drivers released by NVIDIA, because the PTX code is JIT-compiled at runtime for the newer GPU architecture.

  ‣ CUDA drivers are backward compatible with the CUDA toolkit. This means systems can be upgraded to newer drivers independently of upgrading to a newer toolkit. Apps built with an old toolkit will load and run with the newer drivers; however, if they require PTX JIT compilation to run on a newer GPU architecture (SM version), then such apps cannot be used with CUDA tools from the old toolkit. Any JIT-compiled code implies using the newer compiler and thus a new ABI, which requires upgrading to the matching newer toolkit and associated tools.

  ‣ Any separately compiled NVCC binaries (enabled in 5.0) require that all device objects follow the same ABI and target the same GPU architecture (SM version). Any use of CUDA tools on these binaries must match the toolkit version of the compiler that produced them.

‣ The CUDA 4.2 toolkit for sm_30 implicitly increased a -maxrregcount that was less than 32 to 32. The CUDA 5.0 toolkit does not implicitly increase the -maxrregcount unless it is less than 16 (because the ABI requires at least 16 registers). Note that 32 is the "best minimum" for sm_3x, and the libcublas_device library is compiled for 32 registers.
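
The change in -maxrregcount handling on sm_30 can be sketched as a small rule. This is an illustration only, not toolkit code: the function name `effective_maxrregcount` is hypothetical, and it assumes that CUDA 5.0 raises values below 16 to exactly 16, which the text implies but does not state.

```shell
# Hypothetical helper: prints the register limit that would effectively
# apply on sm_30 for a requested -maxrregcount, per the rule above.
# Only toolkits "4.2" and "5.0" are modeled.
effective_maxrregcount() {
    toolkit=$1
    requested=$2
    case "$toolkit" in
        4.2) floor=32 ;;   # 4.2 raised values below 32 up to 32
        5.0) floor=16 ;;   # assumption: 5.0 raises values below 16 to 16
        *)   echo "unsupported toolkit: $toolkit" >&2; return 1 ;;
    esac
    if [ "$requested" -lt "$floor" ]; then
        echo "$floor"
    else
        echo "$requested"
    fi
}
```

For example, a build that passed -maxrregcount=24 was effectively given 32 registers under 4.2 but keeps 24 under 5.0. A project that relied on the old behavior may want to request 32 explicitly, for instance with `nvcc -arch=sm_30 -maxrregcount=32`.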

www.nvidia.com

NVIDIA CUDA Toolkit v5.5

RN-06722-001 _v5.5 | 28
