
Jensen Huang's Earnings Call Transcript: Customer Demand for GPUs Is So Strong, the Pressure on Us Is Enormous

Sun, May 26 2024 07:28 AM EST

On May 23, U.S. local time, Nvidia released results for the first quarter of fiscal year 2025, which ended April 28, 2024. First-quarter revenue reached $26 billion, up 262% year-on-year and above the analyst consensus of $24.65 billion; net profit was $14.81 billion, up 628% year-on-year; and earnings per share were $5.98, above the analyst consensus of $5.59. With both revenue and profit beating market expectations, Nvidia's stock surged after the report, crossing the $1,000 mark for the first time.

Following the release, Nvidia CEO Jensen Huang, Executive Vice President and CFO Colette Kress, and other executives joined the earnings conference call to walk through the report's key points and answer analysts' questions.

Jensen Huang's Commentary on the Financial Report

The entire industry is undergoing a significant transformation. Before we move on to the Q&A session, I'd like to discuss why this transformation matters. The next industrial revolution has begun.

Many companies and countries are partnering with Nvidia to shift the trillion-dollar installed base of traditional data centers to accelerated computing and to build a new type of data center, the AI factory, to produce a new commodity: artificial intelligence.

AI will bring revolutionary efficiency improvements to almost every industry, helping businesses increase revenue while enhancing cost-effectiveness and energy efficiency. Cloud service providers are at the forefront of generative AI. Leveraging Nvidia's advanced technology, these cloud providers accelerate workload processing, save costs, and reduce power consumption. The tokens generated by Nvidia's Hopper platform bring revenue to their AI services, while Nvidia's cloud instances attract tenants from our vast developer ecosystem.

Our data center business is growing strongly on the back of rapidly rising demand for generative AI training and inference on the Hopper platform. Training continues to scale as models learn to handle multimodal content such as text, speech, images, video, and 3D, and learn to reason and plan.

Our inference workloads are growing significantly. With the development of generative AI, inference now means generating tokens rapidly and at massive scale, which has become extremely complex. Generative AI is driving a full-stack transformation of the computing platform from the ground up, fundamentally changing every interaction we have with computers. We are moving from today's information-retrieval model of computing to one that generates answers and skills. AI will increasingly understand context and our true intent, with stronger capabilities in knowledge, reasoning, planning, and task execution.
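
The shift Huang describes, from one-shot retrieval to iterative token generation, is easy to make concrete. The sketch below is illustrative only: the toy next_token function stands in for a full transformer forward pass (an assumption, not NVIDIA code), and the point it shows is that every output token costs another complete pass over the model, which is why generative inference multiplies compute demand.

```python
# Minimal sketch of autoregressive generation. The toy next_token() stands
# in for a full model forward pass; real inference runs one pass per token.

def next_token(context: list[str]) -> str:
    """Placeholder for a transformer forward pass over the whole context."""
    vocab = ["the", "cat", "sat", "<eos>"]
    return vocab[len(context) % len(vocab)]

def generate(prompt: list[str], max_new_tokens: int = 16) -> list[str]:
    context = list(prompt)
    for _ in range(max_new_tokens):
        token = next_token(context)   # one full model pass per new token
        if token == "<eos>":          # stop when the model signals completion
            break
        context.append(token)
    return context

print(generate(["hello"]))
```

A retrieval lookup is a single pass; a thousand-token answer is a thousand passes, which is the scale problem behind the inference growth described above.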

We are fundamentally reforming how computers work and function, shifting from general-purpose CPUs to GPU-accelerated computing, from instruction-driven software to models that understand intent, from simple information retrieval to executing complex skills. On an industrial level, we are transitioning from traditional software production to token generation, the manufacturing of digital intelligence.

Token generation will continue to drive the long-term buildout of AI factories. Beyond cloud service providers, generative AI has expanded to consumer internet companies, enterprises, sovereign AI, and the automotive and healthcare sectors, creating multiple multibillion-dollar vertical markets.

The Blackwell platform is in full production, laying the foundation for trillion-parameter-scale generative AI. The combination of the Grace CPU, Blackwell GPUs, NVLink, Quantum (InfiniBand) and Spectrum (Ethernet) switches, and high-speed interconnects, complemented by our rich software and partner ecosystem, lets us offer customers an unprecedented, end-to-end AI factory solution.

Spectrum-X opens a new market for us, bringing large-scale AI to data centers that run only Ethernet. NVIDIA NIM is our new software offering: backed by our broad partner ecosystem, it runs enterprise-optimized generative AI everywhere from the cloud to on-premises data centers to RTX AI PCs. From Blackwell to Spectrum-X to NIM, we are ready for the next wave of growth.
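
As a rough illustration of what running NIM "from the cloud to on-premises" looks like in practice: NIM microservices expose an OpenAI-compatible HTTP endpoint, so a locally deployed container can be queried as sketched below. This is a minimal sketch under stated assumptions; the host, port, and model name are placeholders, not details from the call.

```python
# Hedged sketch: querying a locally deployed NIM microservice through its
# OpenAI-compatible chat endpoint. Host, port, and model name are
# illustrative placeholders, not values from the transcript.
import requests

resp = requests.post(
    "http://localhost:8000/v1/chat/completions",   # assumed local deployment
    json={
        "model": "meta/llama3-8b-instruct",         # example model name
        "messages": [{"role": "user", "content": "Summarize this quarter."}],
        "max_tokens": 128,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```

The same request shape works against a cloud-hosted endpoint by swapping the base URL, which is the portability point the paragraph above is making.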

Below are excerpts from the analyst Q&A session:

Bernstein analyst Stacy Rasgon: I'd like to dig deeper into Blackwell. Now that it's in full production, does that mean the product has moved past the sampling stage? If so, what does that imply for shipping and delivery timing? And when Blackwell actually reaches customers, what does it mean for them?

Jensen Huang: We will be shipping. In fact, we have been in production for some time. Production shipments will begin in the second quarter and ramp in the third quarter, and customers should have data centers stood up in the fourth quarter.

Rasgon: Will Blackwell generate revenue this year?

Jensen Huang: Yes. We will see significant Blackwell revenue this year.

UBS analyst Timothy Arcuri: I'd like to compare how Blackwell deployments will differ from Hopper's, given the system-level characteristics and the strong demand for GB200. How will deployment differ from Hopper? I ask because liquid cooling has never been used at this scale before, and there are engineering challenges both at the node level and inside the data center. Will that complexity stretch out the transition? How do you see this playing out?

Jensen Huang: Yes. Blackwell comes in many configurations. It is a platform, not just a GPU. The platform supports air cooling and liquid cooling, x86 and Grace, InfiniBand, and now Spectrum-X, as well as the very large NVLink domain I showed at GTC. Some customers will therefore transition gradually on their existing data center infrastructure, which is already populated with Hopper: they can move easily from H100 to H200 and then to B100. The Blackwell systems were designed to be backward compatible, with power and mechanical requirements carefully considered.

The software stack that runs on Hopper will also run well on Blackwell. We have also been priming the entire ecosystem to get it ready for liquid cooling. We have held extensive discussions across the Blackwell ecosystem, including cloud service providers, data centers, ODMs, system makers, our supply chain, and the cooling-technology and data center supply chains. None of them will be surprised by Blackwell's arrival or by the capabilities we aim to deliver with Grace Blackwell 200 (GB200).

Bank of America Securities analyst Vivek Arya: Thank you for taking my question, Jensen. I'd like to understand how you ensure high utilization of your products and prevent premature purchasing or hoarding driven by supply constraints, competition, or other factors. What mechanisms in your system should reassure us that revenue will keep pace with very strong shipment growth?

Jensen Huang: This is a crucial point, and I will answer your question directly. Global demand for data center GPUs right now is incredible. We are working every day to keep up with it. Applications like ChatGPT and GPT-4 are moving toward multimodal processing, and the work being done by Gemini, Anthropic, and all the cloud service providers (CSPs) is consuming every available GPU in the market. On top of that, there are roughly 15,000 to 20,000 generative AI startups, spanning multimedia, digital characters, design tools, and productivity applications, including companies in digital biology and autonomous-driving video training, all expanding and increasing their demand for GPUs. We are essentially racing against time. Customers are putting immense pressure on us to deliver and deploy systems as soon as possible.

On top of that there is sovereign AI, where countries want to train regional models on their own data, the natural resource of each country. Deploying those systems is under significant pressure as well. Demand is very high and far exceeds our ability to supply.

Looking ahead, we are fundamentally changing how computers work. This is a major platform shift, comparable to past platform transitions, though I believe time will show it to be more profound than any before it. Modern computers are no longer merely instruction-driven; they are moving toward understanding user intent. They grasp not only how we interact with them but also what we need and mean, and they can reason iteratively to formulate and execute plans. Every aspect of the computer is changing, from simple information retrieval to generating contextually intelligent answers. This will reshape the global computing stack, even the PC platform. This is only the beginning. We will keep exploring in the lab and working with startups, large enterprises, and developers worldwide to drive this transformation, and the impact will be extraordinary.

Morgan Stanley analyst Joseph Moore: I understand how strong demand is, as you described; both H200 and Blackwell are seeing significant demand. So as customers transition off Hopper and H100, what market response do you anticipate? Will people wait for the new products, expecting outstanding performance, or do you believe demand for H100 alone is enough to sustain growth?

Jensen Huang: We saw Hopper demand continue to grow through the quarter. Even as we now transition to H200 and Blackwell, we expect demand to outstrip supply for some time. Everyone wants their infrastructure online as soon as possible so they can start saving money and making money.

Goldman Sachs analyst Toshiya Hari: I'd like to ask about competition. Many of your cloud customers have announced new or updated internal silicon programs in parallel with their work with you. To what extent do you view them as competitors in the medium to long term? Do you see them mainly addressing internal workloads, or could their role be broader?

Jensen Huang: Our uniqueness shows in several ways. First, NVIDIA's accelerated computing architecture lets customers handle every step of their workflow, from processing unstructured data in preparation for training, to structured data processing and SQL-style data-frame processing, to training and inference. As I said earlier, inference has fundamentally changed; it is now generative. It is not just recognizing the cat, which is hard enough in itself, but generating every pixel of the cat. The generation process is a completely new processing architecture, and that is one reason TensorRT-LLM is so popular: on the same chips, our architecture delivers three times the performance. That speaks to the depth and strength of our architecture and software. So from computer vision to image processing, from computer graphics to every form of computation, you can use NVIDIA's technology. With computing costs and energy costs rising worldwide and general-purpose computing hitting a wall, accelerated computing is truly the sustainable way forward; it is the key to saving compute cost and energy. The versatility of our platform therefore yields the lowest total cost of ownership (TCO) for our customers' data centers.

Secondly, we are present on every cloud platform. Therefore, for developers looking for a development platform, choosing NVIDIA is always an excellent choice. Whether on-premises or in the cloud, with computers of any size and shape, we are virtually everywhere. This is our second advantage.

The third advantage is that we build AI factories. People increasingly realize that AI is not just a chip problem. Everything starts with excellent chips, of course, and we build a great many chips for our AI factories, but AI is a systems problem. In fact, AI today is not a single large language model; it is a complex system of multiple large language models working together. NVIDIA builds that system, optimizing all of our chips to work together as one system, with software that operates and optimizes across the whole system.

Looking at it numerically: if you have a $5 billion infrastructure and you double its performance, which we routinely do, its value also rises to $10 billion; the cost of all the chips comes nowhere near that gain. That is why performance is so valuable today, and why the highest performance also means the lowest cost. Maintaining the infrastructure around those chips is very expensive: it takes enormous funds to build and operate data centers, including manpower, electricity, real estate, and more. So the highest performance also delivers the lowest total cost of ownership (TCO).
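
Huang's arithmetic is easy to sanity-check with a back-of-the-envelope calculation. The figures below are the ones he cites, plus an assumed chip-cost share that is purely illustrative and not from the call.

```python
# Back-of-the-envelope version of the argument above: doubling the effective
# performance of a $5B infrastructure roughly doubles its value, and the gain
# dwarfs the cost of the chips. The chip-cost share is an assumed figure.
infrastructure_value = 5e9        # $5B infrastructure, figure cited on the call
performance_multiplier = 2.0      # "double its performance"
assumed_chip_cost = 1.5e9         # illustrative assumption, not from the call

value_created = infrastructure_value * (performance_multiplier - 1.0)
print(f"Value created: ${value_created / 1e9:.1f}B; "
      f"assumed chip cost: ${assumed_chip_cost / 1e9:.1f}B")
```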

TD Cowen analyst Matt Ramsay: I have spent my entire career in the data center industry, and I have never seen a company launch new platforms as quickly as NVIDIA, or with such leaps in performance: training performance up 5x and inference performance up 30x is a remarkable achievement. But it creates an interesting challenge: the prior-generation products your customers spent billions of dollars on may suddenly look uncompetitive next to the new ones, depreciating far faster than expected. How do you view that? As you migrate a huge installed base to new generations like Blackwell, software compatibility is clearly not a problem, but a large share of the installed products will perform far below the new generation. I am very curious to hear what changes you have observed through this process.

Jensen Huang: Thank you for the question; I am glad to share my views. I would like to emphasize three points.

First, the view is very different when you are 5% of the way through building the infrastructure than when you are 95% of the way through. Because we are only about 5% in today, you have to build fast. When Blackwell ships, it will be a huge leap, and after that we will keep introducing new Blackwell products on an annual cadence. We want customers to see our roadmap clearly: their build-outs are just beginning, and they must keep going. There will be a steady stream of new chips, so they should keep building and ride performance up the curve. That is the wise move: they need to start earning and saving immediately, and time is critical to them.

Let me give an example of why time matters so much, and why rapidly standing up data centers and shortening training time is so critical. The next company to reach a new plateau gets to announce a breakthrough AI, while the one after it announces something only 0.3% better. So the question is: do you want to be the company that keeps delivering breakthroughs, or the one that is 0.3% ahead? That is why competition is so fierce in every technology race. With many companies competing in this field, a technology lead is decisive; customers need to believe in that lead and be willing to build on your platform long term, knowing the platform will only get better. Leadership matters, and training time matters: finishing training three months earlier means starting three months earlier, and all of that is crucial.

That is why we are deploying the Hopper systems so aggressively even now, as the next platform approaches. Your first point is a good one: this is how we move and progress so fast. We have the full technology stack. We actually build entire data centers ourselves, so we can monitor, measure, and optimize everything. We know where the bottlenecks are; we are not guessing, and we are not just putting up pretty slides. We do want our slides to look good, but we deliver systems that perform at scale, and we know how they perform at scale because we built them here. One almost miraculous thing we do is build the entire AI infrastructure here, then disaggregate it and integrate it into customers' data centers however they choose. We know how it will perform, where the bottlenecks are, where we need to work with them to optimize, and where we need to help them improve their infrastructure for best performance. That deep understanding at the scale of the whole data center is fundamentally what sets us apart today. We build every chip from the ground up, and we know exactly how the whole system operates, so we know precisely how it will perform and how to get the most out of every generation.

Evercore ISI analyst Mark Lipacis: You have said before that general-purpose computing ecosystems tend to dominate each computing era, because they can adapt to different workloads and achieve higher utilization as compute demand shifts. That seems to be the motivation behind building the CUDA-based general-purpose GPU ecosystem for accelerated computing. But now that the main workloads driving demand are neural network training and inference, on the surface that looks like a small number of workloads, and some would argue they are better suited to custom solutions. So the key question is: are general-purpose computing frameworks now at greater risk, or do they have enough flexibility and pace of development to keep the historical advantage of general frameworks even on these specific workloads?

Jensen Huang: NVIDIA's accelerated computing is versatile, but I would not call it general-purpose computing. For example, we are not good at running a spreadsheet; the control loops of operating system code are fine for general-purpose computing but not a great fit for accelerated computing. So while I call our platform versatile, that does not mean it suits every workload. We can accelerate applications across many fields that, despite deep differences, share common traits: they are parallelizable and highly threaded. For instance, 5% of the code may account for 99% of the runtime; that is the essence of accelerated computing. The versatility of our platform and our whole-system design are what have allowed so many startups to grow rapidly over the past decade. Their architectures may be fragile, but our system offers stable support as new techniques emerge, whether generative AI or diffusion models. Especially for large language models that must carry on extended dialogue and understand context, Grace's memory capability becomes crucial. So as AI advances, we stress the need to design not for a single model but to provide systems that serve the whole field broadly. We hold to a first principle of software: software keeps evolving, becoming more refined and more capable. And we believe these models will scale up by another million times in the coming years. Our platform's versatility is central to that; if we were too specialized, we would just be building FPGAs or ASICs, which fall far short of a complete computing solution.
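
The "5% of the code, 99% of the runtime" observation is essentially Amdahl's law, and a minimal sketch makes the leverage explicit. The 50x speedup factor below is an assumed value for illustration, not a number from the call.

```python
# Amdahl's law applied to the example above: if the parallelizable 5% of the
# code accounts for 99% of the runtime, accelerating it dominates the result.
def overall_speedup(hot_fraction: float, hot_speedup: float) -> float:
    """hot_fraction: share of total runtime that can be accelerated."""
    return 1.0 / ((1.0 - hot_fraction) + hot_fraction / hot_speedup)

# Accelerating 99% of the runtime by an assumed 50x yields ~34x overall.
print(f"{overall_speedup(0.99, 50.0):.1f}x")
```

The un-accelerated 1% of runtime caps the gain, which is why the claim is about the shape of the workload, not just raw GPU speed.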

Jefferies analyst Blayne Curtis: I am interested in the H20 product tailored for the Chinese market. Given the current supply constraints, how do you balance demand for it against the supply of other Hopper products? And could you elaborate on the outlook for the second half of the year, including any impact on sales and gross margins?

Jensen Huang: On the H20 and the allocation of supply across the different Hopper products, I may not have fully understood your question, but I want to emphasize that we respect every customer and strive to serve them as well as we can. Our business in China has declined from where it was, mainly because of technology export restrictions and intensified competition in the Chinese market, but rest assured that we will keep doing everything we can to serve customers there. As for supply, my earlier comments apply to the market as a whole, particularly the availability of H200 and Blackwell toward the end of the year; demand for both is very strong.

Raymond James analyst Srini Pajjuri: I would like to hear more about the GB200 systems you mentioned earlier. There seems to be significant demand for them. Historically, NVIDIA has sold a great many HGX boards and GPUs, while the systems business has been relatively small. Why do you foresee such strong demand for full systems now? Is it purely total cost of ownership (TCO), or are there other factors, such as architectural advantages?

Jensen Huang: The way we sell GB200 is in fact the same way we have always disaggregated our products: we break the system into all of its sensible components and integrate them with computer makers. This year, 100 different Blackwell computer-system configurations will come to market, which is unprecedented; Hopper at its peak had half that, and it started with far fewer. You will see liquid-cooled versions, air-cooled versions, x86 versions, Grace versions, and so on, and our partners are offering all of these systems. Nothing about the model really changes. Of course, the Blackwell platform greatly expands our offering. Integrating CPUs, much denser compute, and liquid cooling saves data centers a great deal of money on power provisioning and improves energy efficiency, so it is a better solution. It is also broader: we supply more of the data center's components, and everyone wins. Data centers get far higher-performance networks, from the network switches to the network itself. We now have NICs and Ethernet as well, so we can bring NVIDIA AI at large scale to customers who only know how to operate Ethernet, because that is their ecosystem. So Blackwell is broader, and we deliver much more of the platform to customers in this generation.

Truist Securities analyst William Stein: High-performance x86 CPUs for data centers are already available, yet your Arm-based Grace CPU seems to offer real advantages that make it worth delivering to customers, whether in cost-effectiveness, power, or the technical synergy between Grace and Hopper or Blackwell. Could similar dynamics emerge on the client side? There are already good solutions there, with Intel and AMD shipping excellent x86 products, but NVIDIA may have unique advantages in emerging AI workloads that others would find hard to match.

Jensen Huang: You have raised some very good points. Our collaboration with our x86 partners has indeed been excellent across many applications, and together we have built many outstanding systems. But Grace lets us do things that current system configurations cannot. The memory system between Grace and Hopper is coherent and tightly connected; viewing them as two separate chips is almost misleading, because they behave like one superchip. The link between them runs at terabytes per second of bandwidth, which is remarkable. Grace also uses LPDDR, the first data-center-grade low-power memory, so we save a significant amount of power on every node. And because we now architect the entire system, we can build one with a very large NVLink domain, which is crucial for inference on next-generation large language models.
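
For a sense of scale behind the "superchip" framing: NVIDIA's public specifications put the coherent Grace-to-GPU link (NVLink-C2C) at roughly 900 GB/s, versus roughly 64 GB/s per direction for a PCIe Gen5 x16 slot. The quick comparison below uses those ballpark public figures; treat them as approximations rather than numbers from the call.

```python
# Rough bandwidth comparison behind the coherent "superchip" design.
# Both figures are approximate public specs, not from the earnings call.
nvlink_c2c_gb_s = 900.0   # NVLink-C2C total bandwidth, approx.
pcie5_x16_gb_s = 64.0     # PCIe Gen5 x16 bandwidth per direction, approx.

ratio = nvlink_c2c_gb_s / pcie5_x16_gb_s
print(f"NVLink-C2C is ~{ratio:.0f}x a PCIe Gen5 x16 link")
```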

So you see GB200 with an NVLink domain spanning 72 GPUs, which is like connecting 72 Blackwells into one massive GPU. Achieving that requires Grace and Blackwell to be tightly integrated. There are architectural reasons, software-programming reasons, and system-level reasons why this construction is necessary, and when we see similar opportunities, we will explore them. As you saw at yesterday's Microsoft event, CEO Satya Nadella announced the next-generation PC, the Copilot+ PC, which runs very well on the RTX GPUs we ship in laptops and also supports Arm well. That opens the door to system innovation even in PCs.

Cantor Fitzgerald analyst C.J. Muse: This is more of a long-term question. I know Blackwell hasn't even launched yet, but investors always look ahead. With competition in GPUs and custom ASICs intensifying, how do you think about NVIDIA's pace of innovation over the next decade? Over the past decade, innovations like CUDA, precision formats, Grace, and high-speed connectivity have been impressive. What challenges does NVIDIA need to address in the next 10 years? And, perhaps more importantly, what are you willing to share with us today?

Jensen Huang: Looking ahead, I can proudly tell you that after Blackwell there will be another brand-new chip. We are on a one-year cadence, so you can also expect new networking technology from us at a rapid pace. We recently launched Spectrum-X for Ethernet, but our Ethernet roadmap goes well beyond that, and it is an exciting pipeline. We have a strong partner ecosystem: Dell has announced it is taking Spectrum-X to market, and our customers and partners will keep introducing new products built on NVIDIA's AI factory architecture. For companies seeking ultimate performance, we offer the InfiniBand computing fabric, a network that has matured over many years into something excellent. And for the installed Ethernet base, Spectrum-X brings stronger computing capability to that network.

We are fully committed to all three paths: the NVLink computing fabric for a single computing domain, the InfiniBand computing fabric, and the Ethernet networking fabric. We will advance all three at an astonishing pace. You will soon see new switches, new NICs, new capabilities, and new software stacks running on these devices, along with a stream of new chips: CPUs, GPUs, NICs, switches, and more.

What's most exciting is that all these products will support CUDA and be compatible with our entire software stack. This means that if you invest in our software stack today, you never have to worry about it becoming outdated or falling behind because it will continuously evolve to be faster and more powerful. If you choose to adopt our architecture today, as it gradually enters more cloud and data centers, you will seamlessly continue running your operations.

I believe Nvidia's innovations will continually enhance our capabilities and reduce total cost of ownership (TCO). We are confident that with Nvidia's architecture we will lead this new era of computing and usher in this new industrial revolution. We are no longer just producing software; we are mass-producing artificial intelligence tokens.