The Secret Behind AMD Zen5's 40% Surge in Performance: Exclusive Upgrade to AVX-512 Instruction Set

Shang Fang Wen Q Wed, Apr 10 2024 09:22 AM EST

April 8th - Recent reports suggest that the single-core theoretical performance of the AMD Zen5 architecture could jump by more than 40% over Zen4, a claim that seems almost unbelievable. According to the latest information from Moore's Law Is Dead (MLID), however, the secret behind this leap likely lies in the AVX-512 instruction set.

Once an Intel-exclusive trump card, AVX-512 support arrived on AMD with the Zen4 architecture, covering both consumer-grade Ryzen and data-center-grade EPYC processors. Ironically, because Intel has adopted a hybrid design that mixes large and small cores, upcoming generations such as Arrow Lake and Lunar Lake are highly unlikely to support AVX-512 (or Hyper-Threading), handing AMD an exclusive advantage.

Zen4 executes AVX-512 instructions by combining two 256-bit floating-point units (FPUs), which offers some flexibility and lower power consumption but falls short of peak performance.
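As a rough illustration of that split, the Python sketch below models a 512-bit vector add carried out as two 256-bit halves. The lane counts follow from 32-bit floats (16 lanes in 512 bits, 8 in 256 bits); the function names and the pass counter are hypothetical, and the model is a simplification that says nothing about real cycle counts or how the hardware actually schedules the work.

```python
# Toy model: a 512-bit vector add performed on 256-bit hardware.
# All names are illustrative; this is not AMD's actual implementation.

LANES_512 = 512 // 32   # 16 float32 lanes in a 512-bit register
LANES_256 = 256 // 32   # 8 float32 lanes in a 256-bit datapath


def add_256(a, b):
    """One pass through a 256-bit unit: add 8 lanes element-wise."""
    assert len(a) == len(b) == LANES_256
    return [x + y for x, y in zip(a, b)]


def add_512_split(a, b):
    """Run a 16-lane add as two 8-lane halves; return (result, passes)."""
    assert len(a) == len(b) == LANES_512
    low = add_256(a[:LANES_256], b[:LANES_256])
    high = add_256(a[LANES_256:], b[LANES_256:])
    return low + high, 2  # two 256-bit passes were needed


a = [float(i) for i in range(LANES_512)]
b = [1.0] * LANES_512
result, passes = add_512_split(a, b)
print(result[:4], passes)
```

In this model every 512-bit operation costs two 256-bit passes, which is the sense in which such a design trades peak throughput for flexibility.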

The upcoming Zen5 architecture will introduce a 512-bit FPU that can execute AVX-512 instructions directly, improving performance. It will also handle instructions such as VNNI efficiently, bolstering AI capabilities.
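To make the AI angle concrete, here is a minimal Python sketch of what a VNNI dot-product instruction computes. It assumes the behavior of the non-saturating byte variant (VPDPBUSD): in each 32-bit lane, four unsigned 8-bit values are multiplied by four signed 8-bit values, and the sum of the products is added to a signed 32-bit accumulator. The article does not say which VNNI instructions Zen5 speeds up, and the helper names below are hypothetical.

```python
# Scalar model of a VNNI byte dot-product step (VPDPBUSD-style).
# Helper names are illustrative, not part of any real library.


def wrap_int32(value):
    """Wrap an integer to signed 32-bit, as a non-saturating add would."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def vnni_lane(acc, unsigned_bytes, signed_bytes):
    """One 32-bit lane: acc += sum of four u8 * s8 products."""
    assert len(unsigned_bytes) == len(signed_bytes) == 4
    assert all(0 <= u <= 255 for u in unsigned_bytes)
    assert all(-128 <= s <= 127 for s in signed_bytes)
    dot = sum(u * s for u, s in zip(unsigned_bytes, signed_bytes))
    return wrap_int32(acc + dot)


def vnni_512(acc, a, b):
    """Apply the lane operation across 16 lanes (one 512-bit register)."""
    assert len(acc) == 16 and len(a) == len(b) == 64
    return [
        vnni_lane(acc[i], a[4 * i:4 * i + 4], b[4 * i:4 * i + 4])
        for i in range(16)
    ]


# 1*1 + 2*(-1) + 3*2 + 4*(-2) = -3, added to an accumulator of 10
print(vnni_lane(10, [1, 2, 3, 4], [1, -1, 2, -2]))  # 7
```

A single 512-bit instruction of this kind folds 64 multiplies and their additions into one operation, which is why it matters for quantized (8-bit) neural-network inference.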

To support these advancements, the Zen5 architecture will be upgraded in several areas to keep the FPU supplied with data and instructions. For instance, the L1 DTLB will be enlarged and the L1 data cache will grow from 32KB to 48KB. The load-store queues will also be widened, and FPU MADD latency will drop by one clock cycle, among other enhancements.

Furthermore, the integer execution pipeline in the Zen5 architecture will expand from 8 to 10 stages. The L2 cache, however, will remain unchanged at 1MB per core.