SAN FRANCISCO, March 13 - Amazon.com and Cerebras Systems said on Friday they have agreed to integrate their respective AI chips into a joint service that aims to accelerate inference workloads used by chatbots, coding assistants and other generative AI applications.
Under the arrangement, Cerebras accelerators will be deployed inside Amazon Web Services (AWS) data centers and connected to Amazon’s custom Trainium3 processors via Amazon’s own networking technology. The two firms said the pairing is intended to split inference into two sequential steps: a "prefill" phase that processes a user’s tokenized request through the model in a single parallel pass, and a "decode" phase that generates the model’s output one token at a time. Amazon’s Trainium3 chips will be assigned prefill tasks while Cerebras hardware will undertake decoding, a structure Cerebras CEO Andrew Feldman described as a "divide and conquer strategy."
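The two-phase split described above can be illustrated with a minimal sketch. This is not AWS or Cerebras code, and the function names and toy arithmetic are hypothetical stand-ins; the point is only the shape of disaggregated inference, where prefill handles the whole prompt in one pass and decode then produces output tokens sequentially from the resulting cache:

```python
# Illustrative sketch only (hypothetical names, toy math): disaggregated
# inference, with "prefill" on one accelerator and "decode" on another.

def prefill(prompt_tokens):
    """Process the full prompt in one pass, producing a cache of state.
    (In the AWS-Cerebras service, this phase is assigned to Trainium3.)"""
    cache = []
    total = 0
    for t in prompt_tokens:
        total += t          # cumulative sum stands in for attention state
        cache.append(total)
    return cache

def decode(cache, max_new_tokens):
    """Generate output tokens one at a time, extending the cache.
    (In the AWS-Cerebras service, this phase runs on Cerebras hardware.)"""
    out = []
    state = cache[-1]
    for _ in range(max_new_tokens):
        state = (state * 31 + 7) % 1000  # stand-in for a next-token step
        out.append(state)
        cache.append(state)              # decode grows the cache per token
    return out

prompt = [3, 1, 4, 1, 5]
kv_cache = prefill(prompt)       # one large, parallel-friendly pass
generated = decode(kv_cache, 4)  # many small, sequential steps
print(len(kv_cache), generated)
```

The asymmetry is why the split can pay off: prefill is compute-heavy and parallel, while decode is a long chain of small sequential steps, so each phase can be routed to hardware suited to that profile.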
Cerebras, which is valued at $23.1 billion, has developed an AI architecture that the company says differs materially from the high-bandwidth memory-dependent designs used in some market-leading chips. Earlier this year, Cerebras signed a $10 billion deal to supply chips to OpenAI, the developer of ChatGPT.
Neither company disclosed the financial size of the AWS-Cerebras agreement. Feldman emphasized accessibility, saying every customer - from independent developers to major banks - is on AWS and that the partnership will make access to Cerebras technology "easy as a click."
Amazon framed the collaboration as a way to route each phase of inference to the chip best suited to it. The company said its Trainium3 program is on the verge of running production workloads and expressed confidence that Trainium3 - and a future Trainium4 - will lead in price-performance versus merchant graphics processing units (GPUs).
The AWS service built with Cerebras is scheduled to come online in the second half of this year. Amazon acknowledged it could not yet draw a detailed comparison between its forthcoming service and a rival pairing that analysts expect Nvidia to disclose next week, which is anticipated to involve combining Nvidia’s GPUs with accelerators from Groq, a startup that Nvidia reportedly acquired for $17 billion in late December.
In a statement, Amazon noted the timeline for the Nvidia-Groq pairing remains unclear while Trainium3 is only months away from production workloads. The company reiterated its expectation that its chips will offer strong price-performance metrics relative to merchant GPUs.
The partnership focuses specifically on inference - the live execution of models to respond to user inputs - rather than model training. By assigning prefill operations to Trainium3 and decode operations to Cerebras hardware, the two companies are positioning the offering to optimize how cost and compute are allocated across the stages of serving AI-driven applications.
Both sides declined to provide additional commercial details about the terms of the agreement.