
You Can't Always Get What You Want! NVIDIA's Most Powerful GPU Ships This Year with 25x Improvement in Cost and Energy Efficiency over Previous Generation

Xue Hua Sun, Mar 24 2024 09:28 AM EST

March 19 — In AI compute hardware, NVIDIA stands undisputed at the top, and the evolution of its GPUs directly shapes the industry's progress.

At GTC 2024, Jensen Huang unveiled the new B200 GPU, which packs 208 billion transistors and delivers 20 petaflops of FP4 compute.

The GB200, meanwhile, pairs two B200 GPUs with a Grace CPU, delivering 30 times the performance on LLM inference workloads while dramatically improving efficiency.

NVIDIA also illustrated the scale of the improvement: training a 1.8-trillion-parameter model previously required 8,000 Hopper GPUs drawing 15 megawatts of power. The same task can now be accomplished with just 2,000 Blackwell GPUs consuming only 4 megawatts.
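As a quick sanity check on those figures, the reduction ratios work out as follows. This is a minimal sketch: the variable names are ours, and it assumes both training runs take roughly the same wall-clock time, so the power ratio approximates the energy ratio.

```python
# Figures quoted from NVIDIA's GTC 2024 comparison for training
# a 1.8-trillion-parameter model.
hopper_gpus, hopper_megawatts = 8000, 15
blackwell_gpus, blackwell_megawatts = 2000, 4

# GPU count shrinks by 4x; power draw shrinks by 3.75x.
gpu_reduction = hopper_gpus / blackwell_gpus
power_reduction = hopper_megawatts / blackwell_megawatts

print(f"GPUs reduced {gpu_reduction:.2f}x, power reduced {power_reduction:.2f}x")
```

Note that the headline 25x figure refers to cost and energy per unit of work on inference at platform scale, not to this raw power ratio for a single training run.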

On a GPT-3 benchmark (175 billion parameters), the GB200 delivers 7 times the inference performance of the H100 and trains 4 times faster. These gains are underpinned by the new-generation NVLink switch, which lets up to 576 GPUs interconnect at 1.8 TB/s of bidirectional bandwidth.

According to Jensen Huang, the GB200 Grace Blackwell system built on the B200 chip is slated to ship later this year. NVIDIA has not yet disclosed pricing for the GB200 or the full platform, but for now even buyers with money in hand may not be able to get one; Chinese vendors, for their part, can only look on.

Huang said, "Blackwell is incredibly exciting. We will be bringing Blackwell to AI companies around the world, and contracts for it are being signed globally. Blackwell will be the most successful product launch in our history."

The Blackwell platform enables building and running real-time generative AI on trillion-parameter large language models (LLMs), at up to 25 times lower cost and energy consumption than its predecessor.