As the AI revolution gains momentum, NVIDIA founder and CEO Jensen Huang took the stage on Tuesday in Beijing to show the latest technology for accelerating its mass adoption.
Huang kicked off the GPU Technology Conference (GTC) China 2017 by declaring that Moore's Law has come to an end as the CPU era gives way to GPU computing. He stressed that his company's GPU-centered ecosystem has won support from China's top five AI (artificial intelligence) players.
Moore's Law reflects Intel co-founder Gordon Moore's observation that the number of transistors in a dense integrated circuit doubles approximately every two years.
Huang argued that the industry has moved beyond Moore's Law, which he called outdated: both GPU computing capability and neural-network performance are now improving faster than the law predicts.
He noted that while CPU transistor counts have grown at an annual pace of about 50%, CPU performance has advanced by only about 10% per year. Designers, he added, can hardly extract more instruction-level parallelism from CPU architectures, and GPUs will therefore soon replace CPUs for these workloads.
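To see why this gap matters, a short illustrative calculation (assuming the quoted 50% and 10% rates compound annually; these are Huang's figures, not independent data) shows how far transistor counts outrun CPU performance over a decade:

```python
# Illustrative compounding of the growth rates Huang quoted.
# 50%/year transistor growth vs 10%/year CPU performance growth.
years = 10
transistor_growth = 1.5 ** years    # ~57.7x more transistors after 10 years
performance_growth = 1.1 ** years   # ~2.6x more CPU performance after 10 years

print(f"Transistors: {transistor_growth:.1f}x")
print(f"Performance: {performance_growth:.1f}x")
```

At these rates, transistor budgets grow more than twenty times faster than single-chip CPU performance over ten years, which is the divergence Huang's argument rests on.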
Intel, on the other hand, recently maintained that Moore's Law will not fail. During its Technology and Manufacturing Day, also held in Beijing on September 19, the company provided updates on its 10nm process.
GPUs for AI applications
Huang said that NVIDIA's GPUs can compensate for the CPU's shortcomings, as their high-intensity computational capability makes them well suited to AI application scenarios.
He also disclosed that Alibaba, Baidu, Tencent, JD.com, and iFLYTEK, now China's top five e-commerce and AI players, have adopted the NVIDIA Volta GPU architecture to support their cloud services, while Huawei, Inspur, and Lenovo have deployed the firm's HGX-based GPU servers.
At the conference, NVIDIA also unveiled TensorRT 3 AI inferencing software, which runs a trained neural network in a production environment. The new software promises to boost the performance and slash the cost of inferencing from the cloud to edge devices, including self-driving cars and robots.
NVIDIA claims that the combination of TensorRT 3 with NVIDIA GPUs delivers "the world's fastest inferencing on the widely used TensorFlow framework for AI-enabled services", such as image and speech recognition, natural language processing, visual search and personalized recommendations. "Coupled with our Tesla V100 GPU accelerators, TensorRT can process as many as 5,700 images a second, versus just 140 using today's CPUs," Huang said.
The speed and efficiency TensorRT 3 offers when paired with NVIDIA GPUs translates into incredible savings, Huang explained. It takes 160 dual-CPU servers - costing $600,000 to $700,000, including networking and power delivery - that consume 65 kilowatts of power, to crank through 45,000 images per second.
By contrast, the same work can be done by a single NVIDIA HGX server equipped with eight Tesla V100 GPUs, consuming just 3 kilowatts of power.
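A quick back-of-the-envelope check (assuming the quoted 140 images/second is per CPU socket, so a dual-CPU server delivers about 280; this assumption is ours, not NVIDIA's) shows that the quoted figures are internally consistent:

```python
# Sanity-checking the CPU-vs-GPU inferencing figures Huang quoted.
cpu_imgs_per_sec = 140                    # per CPU (quoted)
cpu_server = 2 * cpu_imgs_per_sec         # dual-CPU server -> 280 images/sec
cpu_fleet = 160 * cpu_server              # 160 servers -> 44,800 images/sec

gpu_imgs_per_sec = 5700                   # Tesla V100 with TensorRT 3 (quoted)
hgx_server = 8 * gpu_imgs_per_sec         # one 8-GPU HGX server -> 45,600 images/sec

power_ratio = 65 / 3                      # 65 kW CPU fleet vs 3 kW HGX server

print(f"CPU fleet: {cpu_fleet} images/sec")
print(f"HGX server: {hgx_server} images/sec")
print(f"Power ratio: ~{power_ratio:.0f}x")
```

Both configurations land near the 45,000 images/second workload Huang described, while the single GPU server draws roughly one twentieth of the power.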
To further accelerate AI, NVIDIA introduced additional software, including:
DeepStream SDK: The NVIDIA DeepStream SDK delivers real-time, low-latency video analytics at scale. It helps developers integrate video inference capabilities, including INT8 precision and GPU-accelerated transcoding, to support AI-powered services like object classification and scene understanding for up to 30 HD streams in real time on a single Tesla P4 GPU accelerator.
CUDA 9: The latest version of CUDA, NVIDIA's accelerated computing software platform, speeds up HPC and deep learning applications with support for NVIDIA Volta architecture-based GPUs, up to 5x faster libraries, a new programming model for thread management and updates to debugging and profiling tools. CUDA 9 is optimized to deliver maximum performance on Tesla V100 GPU accelerators.
NVIDIA is now teaming up with Huawei, Inspur, and Lenovo to develop Tesla V100-based HGX-1 accelerators dedicated to AI applications.