
AWS is expanding its cloud infrastructure by adding one million GPUs this calendar year, bringing its total footprint to three million units to meet surging generative AI demand. The expansion builds on a 15-year partnership with NVIDIA and includes the upcoming deployment of Rubin-generation systems.

Beyond third-party hardware, AWS is prioritizing its custom silicon, Trainium 3, which it claims offers a 30-40% price-performance advantage over alternatives and is slated for a million-chip deployment to support major AI labs such as OpenAI and Anthropic.

To address production-scale challenges such as data pipelines and cost control, AWS has partnered with Cerebras to implement disaggregated prefill and decoding architectures. These infrastructure investments aim to lower the dollar-per-token cost of inference and broaden the selection of hardware platforms, so that capacity availability remains a non-issue for scaling startups and established AI researchers alike.
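The dollar-per-token framing above can be sketched numerically: for hourly-billed compute, cost per token is just the hourly rate divided by hourly throughput. The rates, throughput figures, and instance comparison below are hypothetical illustrations, not actual AWS or NVIDIA pricing.

```python
# Illustrative sketch of dollar-per-token inference cost.
# All numbers are hypothetical placeholders, not real pricing.

def cost_per_million_tokens(hourly_rate_usd: float, tokens_per_second: float) -> float:
    """Cost to generate one million tokens on an instance billed hourly."""
    tokens_per_hour = tokens_per_second * 3600
    return hourly_rate_usd / tokens_per_hour * 1_000_000

# Hypothetical comparison: a GPU instance vs. a custom-silicon instance
# priced 35% lower at equal throughput (midpoint of the quoted
# 30-40% price-performance advantage).
gpu_cost = cost_per_million_tokens(hourly_rate_usd=40.0, tokens_per_second=2500)
custom_cost = cost_per_million_tokens(hourly_rate_usd=40.0 * 0.65, tokens_per_second=2500)

print(f"GPU instance:   ${gpu_cost:.2f} per 1M tokens")
print(f"Custom silicon: ${custom_cost:.2f} per 1M tokens")
```

Under these made-up figures, the cheaper instance drops the cost per million tokens proportionally to its price discount, which is why price-performance, rather than raw chip speed, is the metric the article emphasizes.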