SAN FRANCISCO, March 13 - Amazon.com and Cerebras Systems said on Friday they have agreed to integrate their respective AI chips into a joint service that aims to accelerate inference workloads used by chatbots, coding assistants and other generative AI applications.
Under the arrangement, Cerebras accelerators will be deployed inside Amazon Web Services (AWS) data centers and connected to Amazon’s custom Trainium3 processors via Amazon’s own networking technology. The two firms said the pairing is intended to split inference into two sequential steps: a "prefill" phase that processes a user’s tokenized request through the model in a single parallel pass, and a "decode" phase that generates the model’s output one token at a time. Amazon’s Trainium3 chips will be assigned prefill tasks while Cerebras hardware will undertake decoding, a structure Cerebras CEO Andrew Feldman described as a "divide and conquer strategy."
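The two-phase split described above can be illustrated with a minimal sketch. This is not AWS or Cerebras code, and the function names and toy arithmetic are hypothetical stand-ins; the point is only the shape of disaggregated inference, where prefill handles the whole prompt in one pass and decode then produces output tokens sequentially from the resulting cache:

```python
# Illustrative sketch only (hypothetical names, toy math): disaggregated
# inference, with "prefill" on one accelerator and "decode" on another.

def prefill(prompt_tokens):
    """Process the full prompt in one pass, producing a cache of state.
    (In the AWS-Cerebras service, this phase is assigned to Trainium3.)"""
    cache = []
    total = 0
    for t in prompt_tokens:
        total += t          # cumulative sum stands in for attention state
        cache.append(total)
    return cache

def decode(cache, max_new_tokens):
    """Generate output tokens one at a time, extending the cache.
    (In the AWS-Cerebras service, this phase runs on Cerebras hardware.)"""
    out = []
    state = cache[-1]
    for _ in range(max_new_tokens):
        state = (state * 31 + 7) % 1000  # stand-in for a next-token step
        out.append(state)
        cache.append(state)              # decode grows the cache per token
    return out

prompt = [3, 1, 4, 1, 5]
kv_cache = prefill(prompt)       # one large, parallel-friendly pass
generated = decode(kv_cache, 4)  # many small, sequential steps
print(len(kv_cache), generated)
```

The asymmetry is why the split can pay off: prefill is compute-heavy and parallel, while decode is a long chain of small sequential steps, so each phase can be routed to hardware suited to that profile.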
Cerebras, which is valued at $23.1 billion, has developed an AI architecture that the company says differs materially from the high-bandwidth memory-dependent designs used in some market-leading chips. Earlier this year, Cerebras signed a $10 billion deal to supply chips to OpenAI, the developer of ChatGPT.
Neither company disclosed the financial size of the AWS-Cerebras agreement. Feldman emphasized accessibility, saying every customer - from independent developers to major banks - is on AWS and that the partnership will make access to Cerebras technology "easy as a click."
Amazon framed the collaboration as a way to route each phase of inference to the chip best suited to it. The company said its Trainium3 program is on the verge of running production workloads and expressed confidence that Trainium3 - and a future Trainium4 - will lead in price-performance versus merchant graphics processing units (GPUs).
The AWS service built with Cerebras is scheduled to come online in the second half of this year. Amazon acknowledged it could not yet draw a detailed comparison between its forthcoming service and a rival pairing that analysts expect Nvidia to disclose next week, which is anticipated to involve combining Nvidia’s GPUs with accelerators from Groq, a startup that Nvidia reportedly acquired for $17 billion in late December.
In a statement, Amazon noted the timeline for the Nvidia-Groq pairing remains unclear while Trainium3 is only months away from production workloads. The company reiterated its expectation that its chips will offer strong price-performance metrics relative to merchant GPUs.
The partnership focuses specifically on inference - the live execution of models to respond to user inputs - rather than model training. By assigning prefill operations to Trainium3 and decode operations to Cerebras hardware, the two companies are positioning the offering to optimize how cost and compute are allocated across the stages of serving AI-driven applications.
Both sides declined to provide additional commercial details about the terms of the agreement.