What to Know About NVIDIA’s New Blackwell AI Superchip and Architecture

Published March 21, 2024

Alex McFarland

GB200 Grace Blackwell Superchip (NVIDIA)

NVIDIA, a leader in the AI and GPU market, has announced its latest innovation, the Blackwell B200 GPU, along with its more powerful counterpart, the GB200 superchip, and the other components that make up the Blackwell architecture. The announcement marks a significant leap forward in AI processing capabilities and reinforces NVIDIA’s influential position in a highly competitive industry. It comes at a time when demand for more advanced AI solutions is surging, and NVIDIA is poised to meet that demand head-on.

Blackwell B200: A New Era in AI Processing

At the core of NVIDIA’s latest innovation is the Blackwell B200 GPU, a marvel of engineering that delivers up to 20 petaflops of FP4 processing power and packs 208 billion transistors. The GPU stands as a testament to NVIDIA’s relentless pursuit of technological excellence, setting new standards in AI processing.

Compared to its predecessors, the B200 GPU represents a major leap in both efficiency and performance. NVIDIA’s continued commitment to innovation is evident in the new chip’s ability to handle large-scale AI models more efficiently than before. The gains are not limited to processing speed; they extend to energy consumption, a crucial factor in today’s environmentally conscious market.

The Blackwell B200 is tentatively priced between $30,000 and $40,000. While this price point underscores the chip’s advanced capabilities, it also signals NVIDIA’s confidence in the value these chips bring to the ever-evolving AI sector.

GB200 Superchip: The Power Duo

NVIDIA also introduced the GB200 superchip, which combines two Blackwell B200 GPUs with a Grace CPU. This powerful trio represents a major advance in AI supercomputing. The GB200 is more than the sum of its parts; it is a cohesive unit designed to tackle the most complex and demanding AI tasks.

The GB200 stands out for its performance, particularly in large language model (LLM) inference workloads. NVIDIA reports that the GB200 delivers up to 30 times the performance of the H100, its previous-generation GPU. This jump is a clear indicator of the GB200’s potential to reshape the AI processing landscape.

Beyond its raw performance, the GB200 superchip also sets a new benchmark in energy and cost efficiency. Compared to the H100, NVIDIA says it can reduce cost and energy consumption by up to 25 times. This efficiency is not just a technical achievement; it also aligns with the growing demand for sustainable and cost-effective computing solutions in AI.

Advancements in Connectivity and Networking

The GB200’s second-generation Transformer Engine plays a pivotal role in increasing compute, bandwidth, and supported model size. By representing each neuron with four bits instead of eight, the engine effectively doubles all three. This innovation is key to managing the ever-increasing complexity and scale of AI models, helping NVIDIA stay ahead in the AI race.
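To make the eight-bit versus four-bit arithmetic concrete, here is a minimal Python sketch of how weight storage scales with bits per parameter. The function name and the 70-billion-parameter model are illustrative assumptions, not figures from NVIDIA, and the estimate covers weights only, ignoring activations and any accuracy trade-offs of lower precision.

```python
def weight_memory_gb(num_params: int, bits_per_param: int) -> float:
    """Approximate memory needed to store model weights, in gigabytes (10^9 bytes)."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 1e9


# Hypothetical model size, chosen only for illustration.
params = 70_000_000_000

eight_bit_gb = weight_memory_gb(params, 8)
four_bit_gb = weight_memory_gb(params, 4)

print(f"8-bit weights: {eight_bit_gb:.0f} GB")
print(f"4-bit weights: {four_bit_gb:.0f} GB")
print(f"Reduction factor: {eight_bit_gb / four_bit_gb:.1f}x")
```

Halving the bits per parameter halves the bytes per weight, so the same memory capacity and bandwidth can hold and move roughly twice as many parameters.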

A notable advancement in the GB200 is the enhanced NVLink switch, designed to improve inter-GPU communication significantly. This innovation allows for a higher degree of efficiency and scalability in multi-GPU configurations, addressing one of the key challenges in high-performance computing.

One of the most critical enhancements in the GB200 architecture is the substantial reduction in communication overhead, particularly in multi-GPU setups. This efficiency is crucial in optimizing the performance of large-scale AI models, where inter-chip communication can often be a bottleneck. By minimizing this overhead, NVIDIA ensures that more computational power is directed towards actual processing tasks, making AI operations more streamlined and effective.

GB200 NVL72 (NVIDIA)

Packaging Power: The NVL72 Rack

For companies looking to buy GPUs in large quantities, NVIDIA offers the GB200 NVL72, a liquid-cooled rack that exemplifies state-of-the-art design in high-density computing. The rack houses 36 Grace CPUs and 72 Blackwell GPUs, providing a robust solution for intensive AI processing tasks. Liquid cooling addresses the thermal challenges posed by such high-performance computing environments.
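As a rough back-of-the-envelope check on what such a rack adds up to, the sketch below multiplies the 20-petaflop FP4 figure quoted earlier by the number of GPUs. It takes the rack to hold 72 Blackwell GPUs, as its name indicates, and assumes perfect linear scaling, which real workloads will not reach. The function and constant names are hypothetical.

```python
# Per-GPU peak FP4 figure quoted earlier in this article.
B200_FP4_PETAFLOPS = 20


def aggregate_petaflops(num_gpus: int, per_gpu_petaflops: float = B200_FP4_PETAFLOPS) -> float:
    """Peak aggregate throughput, assuming ideal linear scaling across GPUs."""
    return num_gpus * per_gpu_petaflops


gb200_pf = aggregate_petaflops(2)   # one GB200 superchip: two B200 GPUs
nvl72_pf = aggregate_petaflops(72)  # one NVL72 rack: 72 GPUs (per the product name)

print(f"GB200 superchip: {gb200_pf} petaflops FP4 (peak)")
print(f"NVL72 rack: {nvl72_pf} petaflops FP4, about {nvl72_pf / 1000:.2f} exaflops (peak)")
```

These are theoretical peaks for a single low-precision format; sustained throughput depends on the model, the precision actually used, and communication between GPUs.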

A key attribute of the NVL72 rack is its capability to support extremely large AI models, crucial for advanced applications in areas like natural language processing and computer vision. This ability to accommodate and efficiently run colossal AI models positions the NVL72 as a critical infrastructure component in the realm of cutting-edge AI research and development.

NVIDIA’s NVL72 rack is set to be integrated into the cloud services of major technology corporations, including Amazon, Google, Microsoft, and Oracle. This integration signifies a major step in making high-end AI processing power more accessible to a broader range of users and applications, thereby democratizing access to advanced AI capabilities.

Beyond AI Processing into AI Vehicles and Robotics

NVIDIA is extending its technological prowess beyond traditional computing realms into the sectors of AI-enabled vehicles and humanoid robotics.

Project GR00T and Jetson Thor stand at the forefront of NVIDIA’s venture into robotics. Project GR00T aims to provide a foundational model for humanoid robots, enabling them to understand natural language and emulate human movements. Paired with Jetson Thor, a system-on-a-chip designed specifically for robotics, these initiatives mark NVIDIA’s ambition to create autonomous machines capable of performing a wide range of tasks with minimal human intervention.

Another intriguing development is NVIDIA’s new quantum computing simulation service. While not connected to an actual quantum computer, the service uses NVIDIA’s AI chips to simulate quantum computing environments. It offers researchers a platform to test and develop quantum computing solutions without the need for costly and scarce quantum computing resources. Looking ahead, NVIDIA plans to provide access to third-party quantum computers, marking its foray into one of the most advanced fields in computing.

NVIDIA Continues to Reshape the AI Landscape

NVIDIA’s introduction of the Blackwell B200 GPU and GB200 superchip marks yet another transformative moment in the field of artificial intelligence. These advancements are not mere incremental updates; they represent a significant leap in AI processing capabilities. The Blackwell B200, with its unparalleled processing power and efficiency, sets a new benchmark in the industry. The GB200 superchip further elevates this standard by offering unprecedented performance, particularly in large-scale AI models and inference workloads.

The broader implications of these developments extend far beyond NVIDIA’s portfolio. They signal a shift in the technological capabilities available for AI development, opening new avenues for innovation across various sectors. By significantly enhancing processing power while also focusing on energy efficiency and scalability, NVIDIA’s Blackwell series lays the groundwork for more sophisticated, sustainable, and accessible AI applications.

This leap forward by NVIDIA is likely to accelerate advancements in AI, driving the industry towards more complex, real-world applications, including AI-enabled vehicles, advanced robotics, and even explorations into quantum computing simulations. The impact of these innovations will be felt across the technology landscape, challenging existing paradigms and paving the way for a future where AI’s potential is limited only by the imagination.