NVIDIA CUDA Toolkit v5.0 Release Notes
N=`expr $N3D + $NVGA - 1`
for i in `seq 0 $N`; do
mknod -m 666 /dev/nvidia$i c 195 $i;
done
mknod -m 666 /dev/nvidiactl c 195 255
else
  exit 1
fi
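The counting logic in the fragment above can be tried without root privileges. The following is a minimal dry-run sketch: it only prints the mknod commands that would be issued for a given number of 3D and VGA controllers. The function name print_nvidia_nodes and its two arguments are hypothetical and are not part of any NVIDIA script.

```shell
#!/bin/sh
# Dry run: print the mknod commands the script fragment above would execute
# for the given controller counts, without touching /dev.
# print_nvidia_nodes is a hypothetical name used only for this illustration.
print_nvidia_nodes() {
    n3d=$1
    nvga=$2
    last=$((n3d + nvga - 1))   # same arithmetic as N=`expr $N3D + $NVGA - 1`
    i=0
    while [ "$i" -le "$last" ]; do
        echo "mknod -m 666 /dev/nvidia$i c 195 $i"
        i=$((i + 1))
    done
    # The control node always uses minor number 255.
    echo "mknod -m 666 /dev/nvidiactl c 195 255"
}

# One 3D controller plus one VGA controller: nvidia0, nvidia1 and nvidiactl.
print_nvidia_nodes 1 1
```

With no controllers at all, the loop body never runs and only the nvidiactl line is printed.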
‣ On some Linux releases, due to a GRUB bug in the handling of upper memory and a default vmalloc too small on 32-bit systems, it may be necessary to pass this information to the bootloader:
    vmalloc=256MB, uppermem=524288
  Here is an example of GRUB conf:
    title Red Hat Desktop (2.6.9-42.ELsmp)
    root (hd0,0)
    uppermem 524288
    kernel /vmlinuz-2.6.9-42.ELsmp ro root=LABEL=/1 rhgb quiet vmalloc=256MB pci=nommconf
    initrd /initrd-2.6.9-42.ELsmp.img
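After editing the GRUB entry and rebooting, one way to confirm that the kernel received the setting is to look for vmalloc= on the kernel command line. The sketch below assumes a POSIX shell; cmdline_vmalloc is a hypothetical helper, not part of GRUB or of any NVIDIA tool. It takes the command line as a string so that it can be tried without rebooting.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the first vmalloc= parameter found
# in a kernel command line string, or return non-zero if it is absent.
cmdline_vmalloc() {
    # $1 is intentionally unquoted so the shell splits it into words.
    for word in $1; do
        case "$word" in
            vmalloc=*)
                printf '%s\n' "${word#vmalloc=}"
                return 0
                ;;
        esac
    done
    return 1
}

# Example using the kernel arguments from the GRUB entry above.
cmdline_vmalloc "ro root=LABEL=/1 rhgb quiet vmalloc=256MB pci=nommconf"
```

On a running Linux system the same function could be fed the live command line, for example cmdline_vmalloc "$(cat /proc/cmdline)".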
2.7. New Features
2.7.1. General CUDA
‣ Compatibility between the CUDA driver and the CUDA toolkit is supported as follows:
  ‣ Any PTX code generated by nvcc is forward compatible with newer GPU architectures. This means that any CUDA binary that includes PTX code will continue to run on newer GPUs and on newer CUDA drivers released by NVIDIA, because the PTX code is JIT compiled at runtime for the newer GPU architecture.
  ‣ CUDA drivers are backward compatible with the CUDA toolkit. This means that systems can be upgraded to newer drivers independently of upgrading to a newer toolkit. Applications built with an old toolkit will load and run with the newer drivers; however, if they require PTX JIT compilation to run on a newer GPU architecture (SM version), they cannot be used with the CUDA tools from the old toolkit. Any JIT-compiled code implies the use of the newer compiler and thus a new ABI, which requires upgrading to the matching newer toolkit and associated tools.
  ‣ Any separately compiled nvcc binaries (separate compilation is enabled in 5.0) require that all device objects follow the same ABI and target the same GPU architecture (SM version). Any CUDA tools used on these binaries must match the toolkit version of the compiler.
‣ The CUDA 4.2 toolkit for sm_30 implicitly increased a -maxrregcount that was less than 32 to 32. The CUDA 5.0 toolkit does not implicitly increase the -maxrregcount unless it is less than 16 (because the ABI requires at least 16 registers). Note that 32 is the "best minimum" for sm_3x, and the libcublas_device library is compiled for 32 registers.
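The change in -maxrregcount handling can be modelled in a few lines of shell. This is a sketch of the rule as described above for sm_30, not of nvcc itself. The function name effective_maxrregcount is hypothetical. It also assumes that CUDA 5.0 raises requests below 16 to exactly 16, which the note implies but does not state outright.

```shell
#!/bin/sh
# Hypothetical model of the note above for sm_30:
#   effective_maxrregcount <toolkit: 4.2|5.0> <requested -maxrregcount>
# Assumption: CUDA 5.0 raises requests below 16 to 16 (the ABI minimum).
effective_maxrregcount() {
    toolkit=$1
    requested=$2
    case "$toolkit" in
        4.2) floor=32 ;;   # 4.2 raised anything below 32 up to 32
        5.0) floor=16 ;;   # 5.0 only enforces the ABI minimum of 16
        *)   echo "unknown toolkit: $toolkit" >&2; return 2 ;;
    esac
    if [ "$requested" -lt "$floor" ]; then
        echo "$floor"
    else
        echo "$requested"
    fi
}

# Compare both toolkits for a few requested register limits.
for req in 8 24 40; do
    echo "maxrregcount=$req -> 4.2: $(effective_maxrregcount 4.2 "$req"), 5.0: $(effective_maxrregcount 5.0 "$req")"
done
```

The middle case is the one whose behaviour changes: under this model a build that passed -maxrregcount=24 was raised to 32 by 4.2 but stays at 24 with 5.0, below the 32 that the note calls the "best minimum" for sm_3x.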
www.nvidia.com
NVIDIA CUDA Toolkit v5.5
RN-06722-001 _v5.5 | 28
Table of contents
- 9 Chapter 1. NVIDIA CUDA Toolkit v5.5 Release Notes
- 9 1.1. Errata
- 9 1.1.1. General CUDA
- 9 1.1.2. CUDA Libraries
- 9 1.1.2.1. CUBLAS
- 10 1.1.2.2. CUFFT
- 13 1.1.3. CUDA Samples
- 14 1.1.4. CUDA Tools
- 15 1.2. Documentation
- 15 1.3. List of Important Files
- 16 1.3.1. Core Files
- 17 1.3.2. Windows lib Files
- 17 1.3.3. Linux lib Files
- 17 1.3.4. Mac OS X lib Files
- 17 1.4. Supported NVIDIA Hardware
- 18 1.5. Supported Operating Systems
- 18 1.5.1. Windows
- 18 1.5.2. Linux
- 19 1.5.3. Mac OS X
- 19 1.6. Installation Notes
- 19 1.6.1. Windows
- 19 1.6.2. Linux
- 20 1.7. Deprecated Features
- 20 1.8. New Features
- 20 1.8.1. General CUDA
- 21 1.8.2. CUDA Libraries
- 21 1.8.2.1. CUBLAS
- 22 1.8.2.2. CUFFT
- 22 1.8.2.3. CURAND
- 22 1.8.2.4. CUSPARSE
- 22 1.8.2.5. Thrust
- 22 1.8.3. CUDA Tools
- 22 1.8.3.1. CUDA Compiler
- 23 1.8.3.2. CUDA-GDB
- 24 1.8.3.3. CUDA-MEMCHECK
- 24 1.8.3.4. CUDA Profiler
- 25 1.8.3.5. Debugger API
- 25 1.8.3.6. Nsight Eclipse Edition
- 26 1.9. Performance Improvements
- 26 1.9.1. CUDA Libraries
- 26 1.9.1.1. CUBLAS
- 26 1.9.1.2. Math
- 26 1.10. Resolved Issues
- 26 1.10.1. General CUDA
- 27 1.10.2. CUDA Libraries
- 27 1.10.2.1. NPP
- 27 1.10.3. CUDA Tools
- 27 1.10.3.1. CUDA-GDB
- 27 1.10.3.2. Debugger API
- 27 1.11. Known Issues
- 27 1.11.1. Linux on ARMv7 Specific Issues
- 28 1.11.2. General CUDA
- 28 1.11.3. CUDA Libraries
- 28 1.11.3.1. NPP
- 28 1.11.4. CUDA Tools
- 28 1.11.4.1. CUDA Compiler
- 28 1.11.4.2. CUDA Profiler
- 29 1.12. Source Code for Open64 and CUDA-GDB
- 29 1.13. More Information
- 30 Chapter 2. NVIDIA CUDA Toolkit v5.0 Release Notes
- 30 2.1. Errata
- 30 2.1.1. Known Issues
- 30 2.1.1.1. General CUDA
- 31 2.1.1.2. CUDA Libraries
- 31 2.1.1.3. CUDA Tools
- 32 2.2. Documentation
- 32 2.3. List of Important Files
- 32 2.3.1. Core Files
- 33 2.3.2. Windows lib Files
- 33 2.3.3. Linux lib Files
- 33 2.3.4. Mac OS X lib Files
- 34 2.4. Supported NVIDIA Hardware
- 34 2.5. Supported Operating Systems
- 34 2.5.1. Windows
- 34 2.5.2. Linux
- 35 2.5.3. Mac OS X
- 35 2.6. Installation Notes
- 35 2.6.1. Windows
- 35 2.6.2. Linux
- 36 2.7. New Features
- 36 2.7.1. General CUDA
- 37 2.7.1.1. Linux
- 38 2.7.2. CUDA Libraries
- 38 2.7.2.1. CUBLAS
- 38 2.7.2.2. CURAND
- 38 2.7.2.3. CUSPARSE
- 39 2.7.2.4. Math
- 40 2.7.2.5. NPP
- 40 2.7.3. CUDA Tools
- 40 2.7.3.1. CUDA Compiler
- 41 2.7.3.2. CUDA-GDB
- 41 2.7.3.3. CUDA-MEMCHECK
- 41 2.7.3.4. NVIDIA Nsight Eclipse Edition
- 41 2.7.3.5. NVIDIA Visual Profiler, Command Line Profiler
- 42 2.8. Performance Improvements
- 42 2.8.1. CUDA Libraries
- 42 2.8.1.1. CUBLAS
- 42 2.8.1.2. CURAND
- 42 2.8.1.3. Math
- 42 2.9. Resolved Issues
- 43 2.9.1. General CUDA
- 43 2.9.2. CUDA Libraries
- 43 2.9.2.1. CURAND
- 43 2.9.2.2. CUSPARSE
- 44 2.9.2.3. NPP
- 44 2.9.2.4. Thrust
- 44 2.9.3. CUDA Tools
- 44 2.9.3.1. CUDA Compiler
- 44 2.9.3.2. CUDA Occupancy Calculator
- 45 2.10. Known Issues
- 45 2.10.1. General CUDA
- 45 2.10.1.1. Linux, Mac OS
- 46 2.10.1.2. Windows
- 46 2.10.2. CUDA Libraries
- 46 2.10.2.1. NPP
- 46 2.10.3. CUDA Tools
- 46 2.10.3.1. CUDA Compiler
- 47 2.10.3.2. NVIDIA Visual Profiler, Command Line Profiler
- 48 2.11. Source Code for Open64 and CUDA-GDB
- 48 2.12. More Information
- 49 Chapter 3. NVIDIA CUDA Toolkit v4.2 Release Notes
- 49 3.1. Errata
- 49 3.1.1. Known Issues
- 49 3.2. Release Highlights
- 50 3.3. Documentation
- 50 3.4. List of Important Files
- 51 3.4.1. Windows lib Files
- 51 3.4.2. Linux lib Files
- 51 3.4.3. Mac OS X lib Files
- 51 3.5. Supported NVIDIA Hardware
- 51 3.6. Supported Operating Systems
- 51 3.6.1. Windows
- 52 3.6.2. Linux
- 53 3.6.3. Mac OS X
- 53 3.7. Installation Notes
- 53 3.7.1. Windows
- 53 3.7.2. Linux
- 54 3.8. New Features
- 54 3.9. Resolved Issues
- 55 3.10. Known Issues
- 55 3.10.1. Windows
- 55 3.10.2. Linux & Mac
- 56 3.10.3. Mac
- 56 3.10.4. Visual Profiler and Command Line Profiler
- 57 3.11. Source Code for Open64 and CUDA-GDB
- 57 3.12. More Information
- 58 Chapter 4. NVIDIA CUDA Toolkit v4.1 Release Notes
- 58 4.1. Release Highlights
- 59 4.2. Documentation
- 59 4.3. List of Important Files
- 60 4.3.1. Windows lib Files
- 60 4.3.2. Linux lib Files
- 60 4.3.3. Mac OS X lib Files
- 60 4.4. Supported NVIDIA Hardware
- 61 4.5. Supported Operating Systems
- 61 4.5.1. Windows
- 61 4.5.2. Linux
- 62 4.5.3. Mac OS X
- 62 4.6. Installation Notes
- 62 4.6.1. Windows
- 62 4.6.2. Linux
- 63 4.7. Upgrading from Previous CUDA Toolkit
- 63 4.7.1. Vista, Server 2008 and Windows 7 Related
- 64 4.7.2. Linux and Mac
- 64 4.7.3. Mac Related
- 64 4.8. CUDA Toolkit Known Issues
- 64 4.8.1. SDK Related
- 65 4.8.2. Visual Profiler and Command Line Profiler
- 67 4.8.3. CUDA-MEMCHECK
- 67 4.9. New Features in CUDA Release
- 67 4.9.1. CUDA Runtime
- 67 4.9.2. Compiler Related
- 68 4.9.3. CUDA Libraries
- 70 4.9.4. CUDA Driver
- 71 4.10. Performance Improvements in CUDA Release
- 72 4.11. Resolved Issues
- 74 4.12. Source Code for Open64 and CUDA-GDB
- 74 4.13. More Information
- 75 4.14. Acknowledgements
- 76 Chapter 5. NVIDIA CUDA Toolkit v4.0 Release Notes
- 76 5.1. Release Highlights
- 77 5.2. Documentation
- 77 5.3. Errata for Windows, Linux, and Mac OS X
- 77 5.3.1. Linux
- 77 5.3.2. Resolved Issues
- 77 5.3.3. Known Issues
- 79 5.3.4. More Information
- 79 5.4. List of Important Files
- 80 5.4.1. Windows lib Files
- 80 5.4.2. Linux lib Files
- 80 5.4.3. Mac OS X lib Files
- 80 5.5. Supported NVIDIA Hardware
- 81 5.6. Supported Operating Systems for Windows, Linux, and Mac OS X
- 81 5.6.1. Windows
- 81 5.6.2. Linux
- 82 5.6.3. Mac OS X
- 82 5.7. Installation Notes
- 82 5.7.1. Windows
- 82 5.7.2. Linux
- 83 5.8. Upgrading from Previous CUDA Toolkit
- 83 5.9. Notes on New Features and Performance Improvements
- 83 5.9.1. CUDA Driver Features
- 87 5.9.2. CUDA Compiler Features
- 88 5.9.3. CUDA Libraries Features
- 91 5.9.4. CUDA Libraries Performance
- 92 5.10. Known Issues
- 94 5.10.1. Vista, Server 2008 and Windows 7 Related
- 94 5.10.2. XP, Vista, Server 2008 and Windows 7 Related
- 95 5.10.3. XP Related
- 95 5.10.4. Linux Only
- 96 5.10.5. Linux and Mac
- 96 5.10.6. Mac Only
- 97 5.11. Resolved Issues
- 98 5.11.1. Mac Related
- 102 5.12. Source Code for Open64 and CUDA-GDB
- 102 5.13. More Information
- 102 5.14. Acknowledgements