AILAB Blog: Blackwell

4.12.2024

Intel's Gaudi 3 Goes After Nvidia's Crown: A Deep Dive into the AI Chip Showdown

The battle for AI supremacy is heating up, and the latest battleground is the AI accelerator chip. At its Vision 2024 event, Intel unveiled the much-anticipated Gaudi 3, a significant upgrade to its AI chip line promising to challenge Nvidia's dominance. Let's delve deeper into the details of Gaudi 3 and see how it stacks up against the competition.

Gaudi 3 Architecture: Doubling Down on Performance

Gaudi 3 takes a significant leap from its predecessor, Gaudi 2. Instead of a single chip, it boasts a dual-chip design connected by a high-bandwidth link. Each chip features a central cache of 48 megabytes surrounded by a dedicated AI processing unit. This unit comprises four matrix multiplication engines and 32 programmable tensor processor cores. The entire package is integrated with high-speed memory connections and capped with media processing and networking capabilities.

This architecture translates to double the AI processing power of Gaudi 2. Additionally, Gaudi 3 leverages 8-bit floating-point arithmetic, a key element in training the transformer models behind large language models (LLMs). For computations using the BFloat16 format, Gaudi 3 offers a fourfold performance boost.
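
To make the BFloat16 format mentioned above more concrete, here is a minimal Python sketch of how a 32-bit float maps onto it. BFloat16 keeps the sign bit, all 8 exponent bits, and the top 7 mantissa bits of a float32. The helper names are illustrative and not part of any Intel or Nvidia software, and the sketch truncates the low bits for simplicity, whereas real hardware and libraries typically round instead.

```python
import struct


def float32_to_bfloat16_bits(x: float) -> int:
    """Return the 16-bit bfloat16 pattern for x (truncation, not rounding)."""
    # Reinterpret x as a big-endian float32 and read it back as an unsigned int.
    (bits32,) = struct.unpack(">I", struct.pack(">f", x))
    # Keep the upper 16 bits: 1 sign bit, 8 exponent bits, 7 mantissa bits.
    return bits32 >> 16


def bfloat16_bits_to_float(bits16: int) -> float:
    """Expand a 16-bit bfloat16 pattern back into a Python float."""
    # Pad the dropped mantissa bits with zeros to rebuild a float32 pattern.
    (value,) = struct.unpack(">f", struct.pack(">I", (bits16 & 0xFFFF) << 16))
    return value


def to_bfloat16(x: float) -> float:
    """Round-trip x through bfloat16 to see what precision survives."""
    return bfloat16_bits_to_float(float32_to_bfloat16_bits(x))


if __name__ == "__main__":
    for sample in (1.0, 3.14159, 1000.123):
        print(sample, "->", to_bfloat16(sample))
```

The range of representable values stays the same as float32 because the exponent is untouched; only the precision drops, which is the trade that makes the format attractive for training.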

Gaudi 3 vs. Nvidia H100: A Tale of LLMs and Efficiency

One of Gaudi 3's biggest strengths lies in its performance with large language models. Intel claims a 40% faster training time for the massive GPT-3 175B LLM compared to Nvidia's H100 chip. This advantage extends to smaller LLM versions like the 7-billion and 8-billion parameter Llama2 models.
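
A claim like "40% faster training time" can be read two ways, and the difference matters when estimating how long a run would take. The short Python sketch below works through both readings. The function names are made up for illustration, and which reading Intel intends is not stated here, so treat both results as possibilities rather than a confirmed figure.

```python
def time_ratio_if_speedup(pct_faster: float) -> float:
    """Reading A: throughput is pct_faster percent higher.

    Training time relative to the baseline is then 1 / (1 + p).
    """
    return 1.0 / (1.0 + pct_faster / 100.0)


def time_ratio_if_time_cut(pct_faster: float) -> float:
    """Reading B: wall-clock time is pct_faster percent shorter.

    Training time relative to the baseline is then 1 - p.
    """
    return 1.0 - pct_faster / 100.0


if __name__ == "__main__":
    claim = 40.0  # the "40% faster" figure quoted above
    print(f"Reading A: {time_ratio_if_speedup(claim):.3f} of baseline time")
    print(f"Reading B: {time_ratio_if_time_cut(claim):.3f} of baseline time")
```

Under reading A a run takes about 71% of the baseline time; under reading B it takes 60%. Vendor benchmarks rarely spell out which one they mean, so it is worth checking the fine print before comparing numbers.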

For inference tasks, the competition gets closer. Gaudi 3 delivers between 95% and 170% of the H100's performance for specific Llama versions. However, for the Falcon 180B model, Gaudi 3 shines with a staggering fourfold advantage.

But where Gaudi 3 truly separates itself is in power efficiency. Intel claims significant improvements, reaching up to 230% better than H100 for specific LLM workloads. This translates to substantial cost savings on data center electricity bills – a crucial factor for large-scale AI deployments.
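
To see how an efficiency figure turns into an electricity bill, here is a rough Python sketch. It assumes the efficiency figure is a performance-per-watt ratio relative to the baseline chip (so 2.3 means 2.3 times the work per unit of energy), which is one reading of the claim above. The baseline energy use and the electricity price in the example are hypothetical placeholders, not measured data.

```python
def relative_energy(perf_per_watt_ratio: float) -> float:
    """Energy needed for a fixed workload, relative to the baseline chip."""
    if perf_per_watt_ratio <= 0:
        raise ValueError("perf_per_watt_ratio must be positive")
    return 1.0 / perf_per_watt_ratio


def annual_savings(baseline_kwh_per_year: float,
                   price_per_kwh: float,
                   perf_per_watt_ratio: float) -> float:
    """Yearly electricity cost avoided for the same amount of work."""
    baseline_cost = baseline_kwh_per_year * price_per_kwh
    return baseline_cost * (1.0 - relative_energy(perf_per_watt_ratio))


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to illustrate the arithmetic.
    baseline_kwh = 1_000_000   # assumed yearly energy use on the baseline chip
    price = 0.10               # assumed price per kWh
    ratio = 2.3                # assumed reading of the quoted efficiency figure
    print(f"Relative energy: {relative_energy(ratio):.3f}")
    print(f"Annual savings:  {annual_savings(baseline_kwh, price, ratio):,.2f}")
```

The sketch ignores cooling, idle power, and utilization, all of which affect a real data center bill, so it is a first-order estimate at best.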

The Memory Question: Gaudi 3 vs. The Competition

One area where the picture gets murkier is memory. Both Gaudi 3 and Nvidia chips utilize high-bandwidth memory (HBM). However, Gaudi 3 relies on the slightly older HBM2e version, while Nvidia utilizes the newer HBM3 or HBM3e options in some models. While HBM2e might be more cost-effective, it could potentially impact performance in bandwidth-intensive tasks.

The memory capacity also varies. Gaudi 3 boasts more HBM than H100 but falls short compared to Nvidia's upcoming Blackwell B200, H200, and AMD's MI300. This is an aspect to consider depending on the specific AI workload requirements.

Process Technology: Closing the Gap

For generations, Intel's Gaudi chips have lagged behind Nvidia in terms of process technology. This meant comparing Gaudi to a chip built on a more advanced "rung" of Moore's Law. Fortunately, Gaudi 3 utilizes the TSMC N5 (5-nanometer) process, finally matching the current generation of Nvidia chips like H100 and H200.

While Nvidia is expected to move to the N4P process for the upcoming Blackwell, that process still falls within the same 5-nm family as the one used for Gaudi 3. This signifies that Intel is steadily closing the gap in manufacturing technology.

The Future of AI Chips: Gaudi vs. Blackwell

The battle between Gaudi and Nvidia continues. While Gaudi 3 offers compelling advantages in power efficiency, LLM performance, and potentially competitive pricing, the true test will come with the release of Nvidia's Blackwell. Its exact capabilities and how it stacks up against Gaudi 3 remain to be seen.

One intriguing factor is the future of Gaudi technology. The next generation, codenamed Falcon Shores, is expected to remain on TSMC's technology for now. However, Intel plans to introduce its own 18A process technology next year, potentially giving future Gaudi chips a significant edge.

Conclusion: Gaudi 3 - A Viable Contender in the AI Chip Race

Intel's Gaudi 3 marks a significant step forward for the company's AI chip ambitions. With its focus on LLM performance, power efficiency, and potentially competitive

3.18.2024

Revolutionizing AI: Nvidia's Leap with Hopper and Blackwell Chips

In an electrifying presentation at the GTC keynote in San Jose, Nvidia's CEO Jensen Huang unveiled a series of groundbreaking advancements in AI technology that promise to redefine the landscape of computing. The spotlight shone brightly on Nvidia's latest AI-infused chips, particularly the Hopper and Blackwell platforms, marking a significant leap forward in the company's pursuit of computational excellence.

Hopper: A Game Changer

The Hopper chip, with its 80 billion transistors, has already made its mark. Its design and capabilities have set new benchmarks for what we can expect from GPUs, transcending traditional boundaries and expectations. The chip's architecture, named after the pioneering computer scientist Grace Hopper, embodies Nvidia's commitment to innovation and excellence in the field of computing.

Introducing Blackwell: The Next Evolution

Blackwell, which Nvidia presents as the name of a platform rather than just a chip, represents the future of the company's GPU technology. This isn't merely an iteration of past designs; it's a revolutionary step forward. Featuring a dual-die design, Blackwell allows for 10 terabytes per second of data flow between the dies, effectively making them operate as a single, colossal chip. This breakthrough addresses critical challenges like memory locality and cache issues, paving the way for more efficient and powerful computing solutions.
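
For a sense of scale, the quoted 10 terabytes per second can be turned into transfer times with simple arithmetic. The Python sketch below assumes decimal units (1 TB = 10^12 bytes) and that the link sustains its peak rate, which real transfers generally do not. The function name is illustrative only.

```python
DIE_LINK_BYTES_PER_SECOND = 10e12  # the 10 TB/s figure quoted above, decimal units


def transfer_time_seconds(num_bytes: float,
                          bandwidth_bytes_per_s: float = DIE_LINK_BYTES_PER_SECOND) -> float:
    """Idealized time to move num_bytes across a link at peak bandwidth."""
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return num_bytes / bandwidth_bytes_per_s


if __name__ == "__main__":
    one_gigabyte = 1e9
    micros = transfer_time_seconds(one_gigabyte) * 1e6
    print(f"1 GB across the die-to-die link: about {micros:.0f} microseconds")
```

At that rate a gigabyte crosses between the dies in roughly 100 microseconds under these idealized assumptions, which helps explain why the two dies can be treated as one chip.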

Seamless Integration and Scalability

One of the most compelling aspects of Blackwell is its seamless integration with existing systems. It is form, fit, and function compatible with Hopper, meaning that installations worldwide can easily upgrade to Blackwell without the need for significant infrastructure changes. This compatibility ensures an efficient transition and leverages the global presence of Hopper installations, promising rapid adoption and scalability.

Pushing Boundaries with the NVLink Switch

Nvidia didn't stop at Blackwell. The announcement of the NVLink Switch chip, with its 50 billion transistors, showcased Nvidia's ambition to push the boundaries of what's possible. This chip enables full-speed communication between GPUs, facilitating the creation of systems that operate with unprecedented efficiency and power.

Partnerships and Ecosystems

The keynote also highlighted Nvidia's collaborative efforts with industry giants like AWS, Google, Oracle, and Microsoft, all gearing up to integrate Blackwell into their operations. These partnerships underscore the widespread impact and potential applications of Nvidia's new technologies across various sectors, from cloud computing to healthcare.

A New Era for Generative AI

Central to Nvidia's announcements was the emphasis on generative AI. The new processors are designed to accelerate and enhance generative AI applications, from content token generation with the FP4 format to the creation of sophisticated AI models. Nvidia's AI Foundry initiative further solidifies this focus, aiming to provide comprehensive solutions for AI development and deployment.

Project GR00T and the Future of Robotics

Among the futuristic innovations presented was Project GR00T, a foundation model for humanoid robots. This initiative underscores Nvidia's vision for a future where robots can learn from human demonstrations and assist with everyday tasks, powered by the new Jetson Thor robotics chips.

Conclusion: A Future Defined by AI

Nvidia's announcements at the GTC keynote are more than just a showcase of new products; they represent a bold vision for the future of computing. With the Hopper and Blackwell chips, along with the NVLink Switch and initiatives like AI Foundry, Nvidia is not just keeping pace with the advancements in AI; it's setting the pace. As these technologies begin to permeate various industries, the potential for transformative change is immense, promising a future where AI is not just a tool but a fundamental aspect of our digital lives.