Jensen Huang has frequently used Nvidia’s annual developer conference in California as a platform to unveil new processors that have underpinned the recent surge in artificial intelligence activity. On Monday he is again expected to highlight the company’s flagship graphics processing units, or GPUs, which have been central to the build-out of powerful data centers running AI models.
Beyond the GPU announcements, investors and industry observers are likely to press Huang on Nvidia’s strategy for inference, the part of AI that enables bots and models to perform tasks on behalf of humans. Inference workloads often run on different kinds of chips than the training tasks for which Nvidia’s GPUs are best known, and the shift in customer attention from training to inference has sharpened competitive pressures.
Nvidia’s processors remain critical components in the data centers that support AI models, allowing the company to capture a share of the technology sector’s heavy capital spending on these facilities. At the same time, Nvidia faces competition from rival chipmakers such as Advanced Micro Devices and Intel, as well as from large technology firms including Alphabet’s Google, which are developing their own AI-optimized processors.
Compounding the rivalry is the potential for major Nvidia customers to internalize more of their compute stack. Companies like OpenAI and Meta Platforms have indicated they could produce their own inference processors, a development that would reduce their reliance on third-party chips. Because inference workloads may be better served by different architectures than those used for training, the rise of inference represents a material challenge for companies that have historically focused on GPUs.
Expectations around Monday’s event are shaped in part by Nvidia’s recent moves to broaden its footprint in inference. Analysts at Vital Knowledge wrote that the main deliverable many anticipate is a new inference-oriented chip incorporating intellectual property from Groq, a startup specializing in faster and cheaper inference work that Nvidia acquired for $17 billion in December. Huang has said he will show how Groq’s technology can be connected to Nvidia’s CUDA software platform.
Market commentary has highlighted the performance claims around Groq’s approach. Analysts at Mizuho noted that Groq has demonstrated roughly 100 times lower latency at approximately 20% of the cost for AI inference compared with Nvidia’s traditional GPUs. Those results have drawn attention to how Nvidia plans to integrate Groq’s capabilities into its broader product mix.
In parallel with the Groq deal, Nvidia has invested roughly $2 billion in Lumentum and Coherent, two manufacturers of lasers. These optical components use beams of light to shuttle data rapidly between chips and thus could accelerate communication among processors inside data centers. While such lasers have the potential to improve interconnect speeds, they are not yet produced at volumes comparable to Nvidia’s mass-market processors.
Bank of America Securities analysts expect Nvidia to expand its AI product portfolio at the conference and said they will be watching for any commentary about supply-side pressures. Specifically, they flagged interest in updates regarding an AI-driven memory chip shortage and the possible effects of the ongoing conflict in Iran on supply chains, energy costs, and the worldwide construction of data centers.
The coming remarks from Nvidia’s CEO will therefore be scrutinized for details on how the company intends to deploy its substantial profits to pursue opportunities in inference, integrate recently acquired technologies, and mitigate risks to the supply and build-out of AI infrastructure. Given Nvidia’s status as the world’s largest publicly traded firm by market capitalization, the company’s approach to inference could influence investment decisions across hardware makers, cloud providers, and the operators building AI data centers.
Context and expectations
- Huang is expected to speak about GPUs and to address inference-focused strategy on Monday.
- Nvidia acquired Groq for $17 billion in December and plans to demonstrate Groq technology connected to CUDA.
- Nvidia has invested about $2 billion in laser manufacturers Lumentum and Coherent to improve chip-to-chip communications.
What analysts are watching
- Vital Knowledge sees an unveiling of a new inference chip incorporating Groq IP.
- Mizuho highlighted claims that Groq can deliver far lower latency at a fraction of the cost of traditional GPUs for inference.
- BofA wants updates on potential memory chip shortages and geopolitical impacts on supply chains, energy costs, and data center expansion.