Jensen Huang has frequently used Nvidia’s annual developer conference in California as a platform to unveil new processors that have underpinned the recent surge in artificial intelligence activity. On Monday he is again expected to highlight the company’s flagship graphics processing units, or GPUs, which have been central to the build-out of powerful data centers running AI models.
Beyond the GPU announcements, investors and industry observers are likely to press Huang on Nvidia’s strategy for inference, the part of AI that enables bots and models to perform tasks on behalf of humans. Inference workloads often run on different kinds of chips than the training tasks for which Nvidia’s GPUs are best known, and the shift in customer attention from training to inference has sharpened competitive pressures.
Nvidia’s processors remain critical components in the data centers that support AI models, allowing the company to capture a share of the technology sector’s heavy capital spending on these facilities. At the same time, Nvidia faces competition from rival chipmakers such as Advanced Micro Devices and Intel, as well as from large technology firms including Alphabet’s Google, which are developing their own AI-optimized processors.
Compounding the rivalry is the potential for major Nvidia customers to internalize more of their compute stack. Companies like OpenAI and Meta Platforms have indicated they could produce their own inference processors, a development that would reduce their reliance on third-party chips. Because inference workloads may be better served by different architectures than those used for training, the rise of inference represents a material challenge for companies that have historically focused on GPUs.
Expectations around Monday’s event are shaped in part by Nvidia’s recent moves to broaden its footprint in inference. Analysts at Vital Knowledge wrote that the main deliverable many anticipate is a new inference-oriented chip incorporating intellectual property from Groq, a startup specializing in faster and cheaper inference work that Nvidia acquired for $17 billion in December. Huang has said he will show how Groq’s technology can be connected to Nvidia’s CUDA software platform.
Market commentary has highlighted the performance claims around Groq’s approach. Analysts at Mizuho noted that Groq has demonstrated roughly 100 times lower latency at approximately 20% of the cost for AI inference compared with Nvidia’s traditional GPUs. Those results have drawn attention to how Nvidia plans to integrate Groq’s capabilities into its broader product mix.
In parallel with the Groq deal, Nvidia has invested roughly $2 billion in Lumentum and Coherent, two manufacturers of lasers. These optical components use beams of light to shuttle data rapidly between chips and thus could accelerate communication among processors inside data centers. While such lasers have the potential to improve interconnect speeds, they are not yet produced at volumes comparable to Nvidia’s mass-market processors.
Bank of America Securities analysts expect Nvidia to expand its AI product portfolio at the conference and said they will be watching for any commentary about supply-side pressures. Specifically, they flagged interest in updates regarding an AI-driven memory chip shortage and the possible effects of the ongoing conflict in Iran on supply chains, energy costs, and the worldwide construction of data centers.
The coming remarks from Nvidia’s CEO will therefore be scrutinized for details on how the company intends to deploy its substantial profits to pursue opportunities in inference, integrate recently acquired technologies, and mitigate risks to the supply and build-out of AI infrastructure. Given Nvidia’s status as the world’s largest publicly traded firm by market capitalization, the company’s approach to inference could influence investment decisions across hardware makers, cloud providers, and the operators building AI data centers.
Context and expectations
- Huang is expected to speak about GPUs and to address inference-focused strategy on Monday.
- Nvidia acquired Groq for $17 billion in December and plans to demonstrate Groq technology connected to CUDA.
- Nvidia has invested about $2 billion in laser manufacturers Lumentum and Coherent to improve chip-to-chip communications.
What analysts are watching
- Vital Knowledge sees an unveiling of a new inference chip incorporating Groq IP.
- Mizuho highlighted claims that Groq can deliver far lower latency at a fraction of the cost of traditional GPUs for inference.
- BofA wants updates on potential memory chip shortages and geopolitical impacts on supply chains, energy costs, and data center expansion.