The Rise of CPUs in AI: Evolution of Recommendation Systems

cici Wed, Apr 24 2024 07:49 PM EST

In today's era of abundant information and data, artificial intelligence is shaping human life in many ways. AI recommendation systems, one of the defining technologies of the Internet age, help people sift through vast amounts of information, enabling more precise resource allocation and bringing order to chaos.

Recommendation Systems: Key Engine of Digital Marketing

  • When you visit a tourist city and know nothing about the local cuisine, the AI recommendation system in a local lifestyle app on your smartphone will list the local specialties for you.
  • When you shop online and are unsure about the latest fashion trends, the AI recommendation system on an e-commerce platform will suggest season-appropriate clothes tailored to your taste.

For businesses, AI recommendation systems are even more critical. In the vast world of commerce, they act as the link that connects millions of businesses with the users worldwide who need their products most.

Secretary-General Zhong Junhao of the Shanghai Artificial Intelligence Industry Association pointed out, "With the rapid development of AI technology and advancements in software and hardware, we are entering a new era where AI technology is widely applied and profoundly affecting various industries. Behind many commercial scenarios such as e-commerce and personalized advertising lies the reliance on AI recommendation systems, which have become one of the most mature applications of AI technology in the business domain."

Digital marketing is one major scenario utilizing AI recommendation systems. As a leading commercial digital marketing platform in China, Alibaba's Alimama provides end-to-end marketing solutions for enterprises using AI recommendation systems.

Alimama's mission is to "make marketing easier." Every year, its self-developed digital marketing platform solves marketing and ad placement problems for businesses and brands across industries. Through precisely targeted marketing, it helps businesses promote their products to the consumers who need them most, enabling enterprises to realize their inherent value.

To achieve more precise matching between a vast number of products and consumers, Alimama's AI recommendation algorithms and models have been continuously upgraded and iterated over the past few years. Meanwhile, the increasing complexity of AI models has also raised the demand for hardware computing power.

"Online shopping is already very common. In fact, every time a user browses products online, it involves billions of floating-point operations," explained Liu Zhengyu, a software engineer at Alimama.
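To give a rough sense of how a single page view can reach that scale, the sketch below counts the floating-point operations in a small fully connected ranking network and multiplies by the number of candidate products scored per request. The layer sizes and the candidate count are hypothetical values chosen only for illustration; they are not Alimama's figures, and real recommendation models include other components such as embedding lookups.

```python
def dense_layer_flops(inputs, outputs):
    """FLOPs for one dense layer on one example: one multiply and one add per weight."""
    return 2 * inputs * outputs


def mlp_flops(layer_sizes):
    """Total FLOPs for one forward pass through a stack of dense layers."""
    return sum(
        dense_layer_flops(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])
    )


# Hypothetical model shape and candidate count, for illustration only.
layer_sizes = [1024, 512, 256, 1]
candidates_per_request = 1000

per_candidate = mlp_flops(layer_sizes)
per_request = per_candidate * candidates_per_request

print(f"FLOPs per candidate: {per_candidate:,}")
print(f"FLOPs per request:   {per_request:,}")
```

Even with these modest assumed sizes, scoring a thousand candidates lands above one billion operations per request, which is consistent with the order of magnitude Liu describes.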

The result is a widening gap between the growing demand for computing power and its limited supply, which has become a major pain point for enterprises like Alimama whose digital marketing business is built on AI technology.

The most direct way to address this pain point is to increase AI computing power and optimize AI algorithms. Of the two, greater computing power is the more important foundation for continuously upgrading and iterating digital marketing services.

Challenge and Opportunity: The Best Fit Is the Best

The emergence of large models has driven up demand for GPUs, and almost overnight GPUs came to seem like the only source of computing power for AI.

However, in practical commercial applications such as digital marketing, GPUs are not the only choice, and they may not even be the best one.

In fact, the strong demand for computing power in AI technology mainly comes from AI training and AI inference. To transform AI technology into productivity for various industries and help them improve production efficiency, AI inference is the most critical aspect.

Zhong Junhao also elaborated, "In a year when large models are deeply penetrating industries, more and more industries are paying attention to AI inference. How to maximize the performance of CPUs, accelerate AI inference, and drive real-world deployment in industry has become a key issue."

Some hardware manufacturers have already made significant optimizations for traditional AI applications such as recommendation systems, speech recognition, image recognition, and gene sequencing on CPUs. Especially when executing AI inference tasks, optimized large models can already achieve efficient execution on CPUs.

Li Yadong, General Manager of the Intel Data Center and Artificial Intelligence Group's Xeon Eco-Empowerment Division (China), pointed out, "When the model is large and involves computing across heterogeneous platforms, using CPUs can be faster and more efficient."

In December 2023, Intel officially launched the fifth-generation Intel® Xeon® Scalable Processor in China, with improvements across frequency, power consumption, last-level cache (LLC), memory bandwidth, and latency.

Most importantly, it features Intel® Advanced Matrix Extensions (Intel® AMX), which are optimized for matrix multiplication, the most common operation in deep learning models, and support common data types such as BF16 (training and inference) and INT8 (inference).
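For readers unfamiliar with these data types, the following is a minimal Python sketch of what they mean numerically. It does not use or emulate Intel® AMX itself: `to_bf16`, `quantize_int8`, and `int8_dot` are hypothetical helper names, and the symmetric quantization scheme with a fixed scale is an assumption chosen for simplicity. BF16 keeps the top 16 bits of a 32-bit float, while INT8 inference multiplies 8-bit integers, accumulates in a wider integer, and rescales the result.

```python
import struct


def to_bf16(x):
    """Round a finite value to BF16 precision (top 16 bits of its float32 encoding)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Round to nearest even on the 16 low bits that BF16 discards.
    bits += 0x7FFF + ((bits >> 16) & 1)
    bits &= 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def quantize_int8(values, scale):
    """Map floats to signed 8-bit integers using a symmetric scale (assumed scheme)."""
    return [max(-128, min(127, round(v / scale))) for v in values]


def int8_dot(a_q, b_q, scale_a, scale_b):
    """Multiply INT8 values, accumulate in a wide integer, then rescale to float."""
    acc = sum(x * y for x, y in zip(a_q, b_q))
    return acc * scale_a * scale_b


scale = 0.01  # hypothetical quantization scale
a = [0.5, -1.0, 0.25]
b = [1.0, 0.5, -0.2]
a_q = quantize_int8(a, scale)
b_q = quantize_int8(b, scale)

print(to_bf16(3.14159))                  # BF16 keeps roughly 3 significant digits
print(a_q, b_q)                          # the 8-bit integer representations
print(int8_dot(a_q, b_q, scale, scale))  # close to the exact float dot product
```

The trade-off in both cases is the same: less precision per value in exchange for moving and multiplying more values per cycle, which is why these formats suit matrix-heavy inference workloads.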

Intel® AMX is built into every CPU core, close to system memory, which reduces data transfer latency, increases data transfer bandwidth, and makes the technology simpler to use in practice.

In fact, as AI recommendation systems confront these hardware challenges, CPUs have become a core source of computing power for meeting AI inference requirements.

According to Liu Zhengyu, "Alimama chose the fifth-generation Intel® Xeon® Scalable Processor as the computing platform. After optimization using Intel® AMX and AVX-512, the performance of the advertisement recommendation model has significantly improved compared to the fourth-generation Intel® Xeon® Scalable Processor, achieving a 1.52x increase in throughput while meeting SLA requirements."
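As a back-of-the-envelope illustration of what a 1.52x throughput gain can mean for capacity planning, the sketch below estimates how many servers a fixed peak load would need before and after the upgrade. Only the 1.52x figure comes from Liu's statement; the per-server throughput and peak load are hypothetical numbers, and the calculation assumes throughput scales linearly with server count.

```python
import math


def servers_needed(peak_qps, qps_per_server):
    """Smallest whole number of servers that can absorb the peak load."""
    return math.ceil(peak_qps / qps_per_server)


SPEEDUP = 1.52           # throughput gain quoted in the article
baseline_qps = 1000.0    # hypothetical per-server throughput on the older platform
peak_qps = 100_000.0     # hypothetical peak request rate

old_servers = servers_needed(peak_qps, baseline_qps)
new_servers = servers_needed(peak_qps, baseline_qps * SPEEDUP)
saving = 1 - 1 / SPEEDUP  # fraction of capacity no longer required

print(old_servers, new_servers, f"{saving:.1%}")
```

Under these assumptions, the same load fits on roughly a third fewer servers, independent of the specific baseline chosen.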

Based on this, Alimama continuously improves computing power and optimizes algorithms, making the entire marketing chain smoother, more intelligent, and more efficient. In addition to hardware innovations, Intel is also making continuous efforts in software to ensure that existing AI frameworks and applications fully leverage the hardware potential.

Intel not only consistently contributes to mainstream open-source frameworks like PyTorch and TensorFlow but also provides various optimization plugins for CPU platforms, such as IPEX (Intel® Extension for PyTorch) and ITEX (Intel® Extension for TensorFlow), along with optimization tools like xFT (xFasterTransformer) and the OpenVINO™ toolkit.

"The most suitable is the best. What we need now is not infinite computing power but super warriors with sufficient computing power," Liu Zhengyu added.

CPU platforms are widely deployed and easy to obtain, which makes them convenient to build on and optimize. They can handle both general-purpose computing and inference acceleration without introducing the complexities of heterogeneous computing. The result is efficient application performance, faster deployment, and stronger cost competitiveness.

In digital marketing applications such as Alibaba's Alimama, the CPU is indispensable, whether for compute-intensive AI operations like matrix multiplication or memory-intensive ones like data querying.

Even in scenarios where CPUs and GPUs work together, how much of a GPU coprocessor's computing power can actually be used depends heavily on CPU processing speed.

Accelerating AI deployment with CPUs holds great promise.

If 2023 was the year of breakthroughs in large model technology, then 2024 is the pivotal year for integrating large models deeply into industrial applications. Whether for large models or traditional AI technologies, achieving "fast, good, and economical" results is crucial for successful implementation.

CPUs ensure the stable operation of the entire system, efficient communication and collaboration among components, and ultimately facilitate smooth task execution.

In addition to popular AI inference and training tasks, an AI pipeline also includes stages like data preprocessing and post-processing, which may require the general-purpose processing capabilities of CPUs. In these stages, the versatility and flexibility of CPUs enable them to adapt to various computing scenarios and diverse application needs.

The 5th Generation Intel® Xeon® Scalable processors take these requirements into account with several built-in accelerators: Intel® Data Streaming Accelerator (Intel® DSA) for accelerating data storage and transmission; Intel® In-Memory Analytics Accelerator (Intel® IAA) for accelerating databases and data analytics; and Intel® QuickAssist Technology (Intel® QAT) for accelerating data compression as well as symmetric and asymmetric encryption and decryption. Together, these improve CPU efficiency and overall system performance.

Li Yadong also pointed out, "From the perspective of long-term enterprise development, the performance of CPUs in stability and security is trustworthy, which is crucial for protecting enterprise data and customer privacy. The built-in Intel® SGX and TDX in the 5th Generation Intel® Xeon® Scalable processors can provide enterprises with stronger and easier-to-use application isolation capabilities and isolation and confidentiality at the virtual machine level, offering a simpler path for existing applications to migrate to a trusted execution environment."

In the future, the Intel data center product portfolio is expected to cover general computing and AI acceleration, achieving AI "end-to-end" acceleration from data preprocessing to model training and optimization, deployment, and inference.

CPUs are not only old partners but also new variables. With the continuous improvement of performance in various aspects of the new generation of CPUs, CPUs are becoming the heart that continuously provides momentum for the intelligent transformation of thousands of industries.

As Zhong Junhao said, "The continuously innovating and evolving CPU becomes the best gift that generation after generation of scientists leave to the new era in the cycle of new technology."