Amazon deploys 1 million homegrown chips, previews the next generation

November 12, 2025


A glance at the incredible revenue and profits of Nvidia's data center business makes it clear why the world's largest consumers of computing - hyperscale data center operators, cloud service providers, and now the largest model builders - need to change their performance curves to improve their own profitability.

Amazon's Trainium AI

Amazon's Trainium AI accelerator appears to be used for both AI training and inference within the company's SageMaker and Bedrock AI technology stacks, despite what its name implies. This seems to suggest that AWS is shelving its related Inferentia series of inference accelerators in the GenAI era. (Perhaps they should just call it AInium?)

In a conference call with Wall Street analysts to discuss the financial results of Amazon and its Amazon Web Services cloud, a central data center theme was the strong momentum of Trainium2 and the imminent arrival of the Trainium3 accelerator (previewed at re:Invent 2024 last December), developed in collaboration with model builder and close partner Anthropic.

We previewed the Trainium2 chip in December 2023 and now need an update on the actual specifications. We don't know much about Trainium3, only that it's manufactured using TSMC's 3nm process, boasts twice the performance of the existing Trainium2 chip, and offers 40% better energy efficiency (which we speculate means higher floating-point operations per watt).
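Taken together, those two figures also imply something about Trainium3's power draw. Here is a back-of-the-envelope sketch of that inference, with Trainium2 normalized to 1.0 on both axes; the absolute numbers are not published, so these are placeholder ratios, not specifications:

```python
# Back-of-the-envelope check: what do "2X the performance" and "40%
# better energy efficiency" together imply about Trainium3's power draw?
# Trainium2 is normalized to 1.0; no absolute specs are published.

trn2_perf = 1.0                    # Trainium2 throughput (normalized)
trn2_power = 1.0                   # Trainium2 power draw (normalized)
trn2_eff = trn2_perf / trn2_power  # flops per watt (normalized)

trn3_perf = 2.0 * trn2_perf        # "twice the performance"
trn3_eff = 1.4 * trn2_eff          # "40% better energy efficiency"

trn3_power = trn3_perf / trn3_eff
print(f"Implied Trainium3 power draw: {trn3_power:.2f}x Trainium2")  # ~1.43x
```

In other words, if both claims hold, each Trainium3 chip would draw roughly 43% more power than a Trainium2 while doing twice the work.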

Comparison with other cloud service providers

Like other cloud service providers, Amazon is striving for a balance: on the one hand, leveraging its homegrown accelerators to boost profits and support its AI platform services; on the other, providing massive GPU computing power from Nvidia (and sometimes AMD) to users who want to build their own platforms in the cloud. So far, only Google with its TPU and AWS with Trainium have widely deployed self-developed AI training accelerators. Microsoft is still developing its Maia chip, and the training version of Meta Platforms' MTIA accelerator is not yet complete. (China's hyperscale data center operators and cloud service providers are also, to varying degrees, developing their own CPUs and XPUs, or collaborating with third parties such as Huawei's HiSilicon, to reduce their dependence on Nvidia GPUs.)

Andy Jassy, the current CEO of Amazon and former CEO of AWS for over a decade, stated that Trainium2 capacity is fully booked and currently represents a multi-billion-dollar business, with revenue up 2.5X compared to the second quarter.

Jassy stated that a small number of large customers are using the majority of the Trainium2 capacity on its cloud platform, claiming that Trainium2 offers 30% to 40% better value for AI workloads compared to other solutions. Demand for Trainium2 instances on AWS is high as customers seek greater cost-effectiveness when deploying AI applications in production. Jassy added, "Most of the token usage in Amazon Bedrock is already running on Trainium," which we take to mean that most of the context tokens processed and most of the output tokens generated on Bedrock are computed on Trainium2 (and sometimes Trainium1 or Inferentia2 as well).

Jassy also stated that Anthropic is using the "Project Rainier" supercluster, which the company announced in December 2024, to train its latest Claude 4.x generation models. At the time, AWS and Anthropic said that Project Rainier would have "hundreds of thousands" of Trainium2 chips, with five times the performance of the GPU cluster Anthropic used to train its Claude 3 generation models.

Rainier is more powerful than people imagined

Rainier has proven more powerful than anticipated; according to Jassy, the company has 500,000 Trainium2 chips and plans to expand to 1 million Trainium2 chips by the end of this year.

Regarding Trainium3, Jassy stated that a preview version will be released before the end of the year (meaning we can expect more details at re:Invent 2025 in December), and, as he said, "Larger-scale deployments will arrive in early 2026." He added that AWS has many "large and medium-sized customers who are very interested in Trainium3." That interest is understandable if Trainium3 instances on AWS can offer four times the total capacity of the Trainium2 UltraClusters with twice the capacity per chip. Companies like Anthropic can then chain instances together into even larger clusters, much as OpenAI has done on Microsoft Azure, where it reached cluster sizes far beyond what other customers could rent.
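For what it's worth, those two ratios together pin down a third: the chip count per cluster roughly doubles. A trivial sketch of that arithmetic (ratios only; AWS has not published absolute counts):

```python
# If a Trainium3 UltraCluster delivers 4X the aggregate capacity of a
# Trainium2 UltraCluster, and each Trainium3 chip delivers 2X the
# capacity of a Trainium2 chip, the chip count per cluster must double.
cluster_ratio = 4.0   # Trainium3 vs Trainium2 UltraCluster capacity
chip_ratio = 2.0      # Trainium3 vs Trainium2 per-chip capacity
chips_per_cluster_ratio = cluster_ratio / chip_ratio
print(f"Chips per cluster: {chips_per_cluster_ratio:.0f}x")  # 2x
```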

"So of course we have to deliver the chips," Jesse joked, referring to Trainium 3. "We have to deliver in volume, and we have to deliver quickly. We also have to continue to improve the software ecosystem, which is constantly evolving. With more success stories like Project Rainier, and the work Anthropic has done on Trainium 2, Trainium's credibility is also growing. I think customers are very bullish on it. So am I."

Another interesting point Jassy made on the call with Wall Street concerned the data center capacity AWS is adding. He stated that "in the past year" (we believe he meant the trailing twelve months, a metric Amazon uses frequently), AWS has added 3.8 gigawatts of data center capacity, and it will add another gigawatt in the fourth quarter. Jassy did not give a specific figure for AWS's total data center capacity, but he indicated that it will double by the end of 2027, and that total capacity has already doubled since the end of 2022.

"So we're adding a considerable amount of capacity today," Jesse explained. "For the industry as a whole, the bottleneck is likely in electricity. I think at some point, the bottleneck might shift to chips, but we are significantly increasing capacity. And our current rate of capacity growth allows us to translate that into revenue."

Given this, if AWS had 4 GW of total data center capacity at the end of 2022, it will have roughly 10 GW by the end of 2025, and doubling again would put total capacity at around 20 GW two years later. For AI data centers, Nvidia-based infrastructure costs approximately $50 billion per gigawatt, while homegrown accelerators like Trainium cost approximately $37 billion per gigawatt. Assuming GPUs and Trainium each account for half of the new capacity, the additional 10 GW translates to approximately $435 billion in data center spending across 2026 and 2027. That sounds incredible.
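The arithmetic behind that figure, using the cost-per-gigawatt assumptions above (our estimates, not AWS disclosures):

```python
# Reproduce the $435 billion estimate: 10 GW of new capacity added
# across 2026 and 2027, split evenly between Nvidia GPUs and Trainium.
# The cost-per-gigawatt figures are our assumptions, not AWS numbers.
added_gw = 20 - 10     # 10 GW (end of 2025) -> 20 GW (end of 2027)
nvidia_cost = 50       # $B per GW, Nvidia-based AI infrastructure
trainium_cost = 37     # $B per GW, homegrown accelerators like Trainium
spend = (added_gw / 2) * nvidia_cost + (added_gw / 2) * trainium_cost
print(f"Estimated 2026-2027 spend: ${spend:.0f} billion")  # $435 billion
```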

To be consistent with roughly 40% annual growth in gigawatt capacity through 2026 and 2027 - and assuming AWS spends $106.7 billion on IT equipment in 2025, the vast majority of its projected $125 billion capital expenditure for the year and almost all of it on AI infrastructure - AWS's capacity would have been 1.95 GW at the end of 2022, reaching 5.9 GW by the end of 2025 and 11.8 GW by the end of 2027. That translates to $256.7 billion in IT spending across 2026 and 2027 combined. This sounds more reasonable, but it also means that while megawatt-scale capacity was the norm for large data centers over the past decade or two, it is now insignificant in the GenAI era.
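The same blended cost per gigawatt reproduces this lower figure (again, using our cost assumptions rather than AWS numbers):

```python
# Check the second scenario: 5.9 GW at the end of 2025 doubling to
# 11.8 GW at the end of 2027, priced at the blended cost per gigawatt
# (the average of the $50B GPU and $37B Trainium assumptions above).
end_2025_gw = 5.9
end_2027_gw = 11.8            # capacity doubles by the end of 2027
blended_cost = (50 + 37) / 2  # $43.5B per GW, even GPU/Trainium split
spend = (end_2027_gw - end_2025_gw) * blended_cost
print(f"Estimated 2026-2027 spend: ${spend:.1f} billion")  # ~$256.7 billion
```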

Source: compiled from The Next Platform
