
Intel Unveils Gaudi 3 AI Accelerator: 4x Performance Boost, Outperforms NVIDIA H100

Shang Fang Wen Q Fri, Apr 12 2024 09:19 AM EST

On April 10th, Intel hosted the Intel Vision 2024 Industry Innovation Summit for customers and partners. The company made several significant announcements: the new Gaudi 3 AI accelerator, the launch of the new Xeon 6 brand, and a comprehensive solution stack spanning open, scalable systems, next-generation products, and a series of strategic partnerships.

According to industry forecasts, the global semiconductor market is expected to reach $1 trillion by 2030, with AI as a major driving force. Yet as of 2023, only 10% of enterprises had successfully commercialized their generative AI (AIGC) projects.

Intel's latest solutions are poised to help enterprises address the challenges of deploying AI projects and accelerate the commercialization of generative AI applications.

Intel's Gaudi 2, introduced in May 2022, made its official debut in China in July 2023, offering strong deep learning performance, efficiency, and cost-effectiveness.

Fabricated on TSMC's 7nm process, Gaudi 2 integrates 24 programmable Tensor Processor Cores (TPCs), 48MB of SRAM cache, 24 100G Ethernet interfaces (RoCE v2 RDMA), 96GB of HBM2E high-bandwidth memory (with a total bandwidth of 2.4TB/s), and multimedia engines, and supports PCIe 4.0 x16. With a maximum power consumption of 800W, it meets the intense computational demands of large language models and generative AI models.

The next-generation Gaudi 3, tailored for AI training and inference, moves to TSMC's 5nm process and delivers twice the FP8 AI compute, four times the BF16 AI compute, double the network bandwidth, and 1.5 times the memory bandwidth of Gaudi 2.
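The generational multipliers quoted above can be combined with Gaudi 2's published figures to estimate the corresponding Gaudi 3 numbers. A small sketch (the derived value is only what the stated multiplier implies, not an official Gaudi 3 specification):

```python
# Gaudi 2 figure stated in the article.
gaudi2_hbm_bandwidth_tbps = 2.4   # total HBM2E bandwidth, TB/s

# Generational multipliers Intel quotes for Gaudi 3 over Gaudi 2.
mem_bandwidth_multiplier = 1.5
network_bandwidth_multiplier = 2.0

# Implied Gaudi 3 memory bandwidth (derived, not an official spec).
gaudi3_hbm_bandwidth_tbps = gaudi2_hbm_bandwidth_tbps * mem_bandwidth_multiplier
print(f"Implied Gaudi 3 HBM bandwidth: {gaudi3_hbm_bandwidth_tbps:.1f} TB/s")
# → Implied Gaudi 3 HBM bandwidth: 3.6 TB/s
```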

Compared with the NVIDIA H100, Intel claims a 50% lead in inference performance on popular LLMs and 40% faster training.

Gaudi 3 is expected to significantly reduce training time for Llama 2 models with 7 billion and 13 billion parameters and for the 175-billion-parameter GPT-3 model.

In addition, for large language models such as Llama with 7 billion or 70 billion parameters and Falcon with 180 billion parameters, Gaudi 3 exhibits exceptional inference throughput and efficiency.

Gaudi 3 is offered in several form factors, including OAM-compatible mezzanine cards, universal baseboards, and PCIe add-in cards, catering to diverse application needs.

With open, community-based software and industry-standard Ethernet networking, Gaudi 3 can seamlessly scale from single-node deployments to clusters, superclusters, and hyperscale clusters comprising thousands of nodes, supporting large-scale inference, fine-tuning, and training tasks.

Combining high performance, cost-effectiveness, energy efficiency, and rapid deployment, Gaudi 3 AI accelerators address the key challenges enterprises face in deploying AI applications: complexity, cost, fragmentation, data reliability, and compliance.

Gaudi 3 is set to ship to OEM vendors, including Dell, HPE, Lenovo, and Supermicro, in the second quarter of 2024.

Currently, industry clients and partners of Intel Gaudi accelerators include NAVER, Bosch, IBM, Ola/Krutrim, NielsenIQ, Seekr, IFF, CtrlS Group, Bharti Airtel, Landing AI, Roboflow, Infosys, and more.


In addition, Intel has announced partnerships with Anyscale, DataStax, Domino, Hugging Face, KX Systems, MariaDB, MinIO, Qdrant, Red Hat, Redis, SAP, SAS, VMware, Yellowbrick, Zilliz, and others to jointly create an open platform aimed at helping enterprises drive AI innovation.

The initiative aims to build an open, multi-vendor generative AI (AIGC) system that leverages Retrieval-Augmented Generation (RAG) technology to deliver best-in-class ease of deployment, performance, and value.

In the initial phase, Intel will use Xeon processors and Gaudi accelerators to deliver a reference implementation of the AIGC pipeline, publish a technical conceptual framework, and continue to expand the infrastructure capabilities of the Intel Tiber developer cloud platform.
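Retrieval-Augmented Generation, the technique at the core of this pipeline, pairs a retriever that pulls relevant documents from a knowledge store with a generator that conditions its answer on them. Below is a minimal, self-contained sketch of the retrieval-and-prompt-assembly stage only, using toy word-overlap scoring in pure Python; Intel's actual reference implementation is not described in detail here, and all names and documents are illustrative:

```python
def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by word overlap with the query (toy retriever).

    A production RAG system would use embedding similarity against a
    vector store instead of this keyword-overlap heuristic.
    """
    q_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query: str, documents: list[str]) -> str:
    """Assemble a RAG prompt: retrieved context first, then the question."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

# Illustrative document store drawn from facts in this article.
docs = [
    "Gaudi 3 doubles network bandwidth over Gaudi 2.",
    "Xeon 6 is Intel's new server CPU brand.",
    "Gaudi 3 ships to OEMs in Q2 2024.",
]
print(build_prompt("What bandwidth does Gaudi 3 offer?", docs))
```

The assembled prompt would then be sent to a large language model, which grounds its answer in the retrieved context rather than relying solely on its training data.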