NVIDIA today released the CUDA 5 production release, a new version of its parallel computing platform and programming model for accelerating scientific and engineering applications on GPUs.
The new programming features of the CUDA 5 platform make developing GPU-accelerated applications faster and easier than before. These features include dynamic parallelism, GPU-callable libraries, NVIDIA GPUDirect technology support for RDMA (remote direct memory access), and the NVIDIA Nsight Eclipse Edition integrated development environment (IDE).
Key features include:
- Dynamic Parallelism - GPU kernels can launch new kernels directly on the GPU, allowing the GPU to adapt to the data. By minimizing round trips to the CPU, dynamic parallelism greatly simplifies parallel programming, and it enables GPU acceleration of a broader set of popular algorithms, such as those used in adaptive mesh refinement and computational fluid dynamics applications.
- GPU-Callable Libraries - A new CUDA BLAS library allows developers to use dynamic parallelism from their own GPU-callable libraries. Developers can design plug-in APIs that let other developers extend the functionality of their kernels, and can implement callbacks on the GPU to customize the functionality of third-party GPU-callable libraries. A new "object linking" capability brings a familiar workflow to large GPU applications: developers can compile multiple CUDA source files into separate object files, then link them into larger applications and libraries.
- GPUDirect Support for RDMA - GPUDirect technology enables direct communication between GPUs and other PCIe devices, and supports direct memory access between network interface cards and GPU memory. This reduces MPI send/receive latency between GPU nodes in a cluster and improves overall application performance.
- NVIDIA Nsight Eclipse Edition - NVIDIA Nsight Eclipse Edition enables programmers to develop, debug and profile GPU applications within the familiar Eclipse-based IDE on Linux and Mac OS X platforms. An integrated CUDA editor and CUDA samples speed the generation of CUDA code, and automatic code refactoring enables easy porting of CPU loops to CUDA kernels. An integrated expert analysis system provides automated performance analysis and step-by-step guidance to fix performance bottlenecks in the code, while syntax highlighting makes it easy to differentiate GPU code from CPU code.
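The dynamic-parallelism feature described above can be sketched in a few lines of CUDA C. This is a minimal illustration, not production code: the kernel names and grid sizes are arbitrary, and it requires a compute capability 3.5 GPU and compilation with `nvcc -arch=sm_35 -rdc=true`.

```cuda
#include <cstdio>

// Child kernel: launched from the GPU, not the CPU.
__global__ void child(int parent_block)
{
    printf("child of block %d, thread %d\n", parent_block, threadIdx.x);
}

// Parent kernel: each block spawns its own child grid on the device,
// so the amount of work can adapt to the data without a CPU round trip.
__global__ void parent()
{
    if (threadIdx.x == 0) {
        child<<<1, 4>>>(blockIdx.x);
        cudaDeviceSynchronize();  // device-side wait for the child grid
    }
}

int main()
{
    parent<<<2, 32>>>();
    cudaDeviceSynchronize();  // host-side wait for the whole launch tree
    return 0;
}
```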
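The object-linking workflow from the GPU-callable-libraries bullet looks much like a conventional C/C++ build. The file names below are hypothetical; `-dc` compiles device code to relocatable object files so that kernels and device functions defined in one file can be called from another.

```
# Compile each CUDA source file to a separate object file
# (-dc enables separate compilation of device code).
nvcc -arch=sm_35 -dc util.cu main.cu

# Link the object files into the final application.
nvcc -arch=sm_35 util.o main.o -o app
```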
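To show what GPUDirect RDMA means in practice, the following is a hedged sketch assuming a CUDA-aware MPI implementation: a device pointer is passed directly to an MPI call, so data can move from GPU memory to the network adapter without first being staged through host memory. The buffer size `N` and destination rank are illustrative.

```c
#include <mpi.h>
#include <cuda_runtime.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    const int N = 1024;
    float *d_buf;                       // buffer in GPU memory
    cudaMalloc(&d_buf, N * sizeof(float));

    // With a CUDA-aware MPI library, the device pointer goes
    // straight into MPI; GPUDirect RDMA avoids a host-memory copy.
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0)
        MPI_Send(d_buf, N, MPI_FLOAT, 1, 0, MPI_COMM_WORLD);
    else if (rank == 1)
        MPI_Recv(d_buf, N, MPI_FLOAT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);

    cudaFree(d_buf);
    MPI_Finalize();
    return 0;
}
```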
NVIDIA has also launched a free online resource center for CUDA programmers at http://docs.nvidia.com. The site offers the latest information on the CUDA platform and programming model, as well as access to all CUDA developer documentation and technologies, including tools, code samples, libraries, APIs, and tuning and programming guides.