How to Buy and Deploy Amazon's New AI Chip Sales Program in Your Data Center
- Program launch: June 2026
- Key chip: Trainium 3 (3 nm, 4× performance over Trainium 2)
- Starting price: $12,000 per chip (≈$60 per peak TFLOP at the quoted 200 TFLOPs)
- Minimum order: 1 rack (48 chips)
- Support: AWS Neuron SDK, on-prem integration tools
Amazon announced in June 2026 that it will sell its custom Trainium AI accelerators to companies that run their own data centers. The move follows CEO Andy Jassy’s April shareholder letter, where he said the chip business could become a $20-$50 billion revenue stream if sold outside AWS. This article shows who can buy, how to place an order, and what you need to run the chips on-prem.
Why Amazon Is Opening Its Chip Business
Amazon’s AI chief Peter DeSantis told Bloomberg that demand for Trainium has outpaced supply since the launch of Trainium 3 in late 2025. By selling racks to third parties, Amazon can earn revenue from its chips beyond its own cloud while still serving cloud customers on demand. The company also hopes to lock in long-term hardware contracts that complement its growing AI services portfolio.
In practice, the program works like a traditional hardware OEM deal. Amazon partners with TSMC for fabrication, then ships fully tested racks to the buyer’s facility. Customers still pay AWS for software licences (Neuron SDK, monitoring, and security) but avoid the per-token fees that come with cloud usage.
Real-world impact: early adopters such as Uber and Anthropic have already committed to buying millions of Trainium chips for on-prem training workloads. According to Business Insider, Anthropic pledged $100 billion in Trainium purchases, showing the scale of potential demand.
What’s Inside a Trainium 3 Rack
Each Trainium 3 chip is built on a 3 nm process, delivers up to 200 TFLOPs of mixed-precision AI compute, and includes on-package high-bandwidth memory (HBM3). A standard rack holds 48 chips in a 10-U chassis, along with power supplies and a built-in cooling system that meets ASHRAE 2025 guidelines.
Key specs:
- 💰 Price: $12,000 per chip (≈$60 per peak TFLOP)
- ⚡ Power: 350 W per chip, 16.8 kW per rack
- 🔒 Security: TPM 2.0, secure boot, and AWS Nitro-based firmware
- 🛠️ Software: AWS Neuron SDK 2.5, Docker-compatible runtime, and TensorFlow-compatible libraries
These specs make Trainium 3 a strong fit for large language model (LLM) training, diffusion-based image generation, and high-throughput inference.
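The per-chip figures above roll up to rack-level totals with simple arithmetic. The following is a minimal Python sketch that uses only the numbers quoted in this article; the constant and function names are illustrative, and the totals cover the 48 chips alone, not the chassis, power supplies, or cooling.

```python
# Per-chip and per-rack figures as quoted in this article.
CHIPS_PER_RACK = 48
PRICE_PER_CHIP_USD = 12_000
POWER_PER_CHIP_W = 350
PEAK_TFLOPS_PER_CHIP = 200


def rack_totals() -> dict:
    """Roll the per-chip figures up to one rack (chips only)."""
    return {
        "price_usd": CHIPS_PER_RACK * PRICE_PER_CHIP_USD,
        "power_kw": CHIPS_PER_RACK * POWER_PER_CHIP_W / 1000,
        "peak_tflops": CHIPS_PER_RACK * PEAK_TFLOPS_PER_CHIP,
        "usd_per_peak_tflop": PRICE_PER_CHIP_USD / PEAK_TFLOPS_PER_CHIP,
    }


if __name__ == "__main__":
    for name, value in rack_totals().items():
        print(f"{name}: {value:,.1f}")
```

On these figures, one rack comes to $576,000 in chips, 16.8 kW of chip power, and 9,600 peak TFLOPs.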
Step-by-Step: Ordering Your First Rack
1️⃣ Contact AWS Sales. Use the dedicated portal aws.amazon.com/ai-chip-sales to request a quote. You’ll need to provide expected workload, power budget, and rack space.
2️⃣ Sign the Hardware Purchase Agreement. The contract includes a 3-year warranty, optional on-site support, and a service-level agreement for firmware updates.
3️⃣ Plan Physical Infrastructure. Amazon supplies a rack-layout guide that details power distribution (three 20 A circuits), cooling (maximum 45 °C ambient), and network cabling (25 GbE Ethernet or InfiniBand).
4️⃣ Receive and Install. Amazon ships the rack pre-populated. Your data-center team mounts the chassis, connects power and networking, and runs the neuron-install.sh script to register the hardware with your AWS account.
5️⃣ Validate Performance. Use the built-in benchmark suite (Neuron Bench 1.2) to confirm that each chip meets the advertised TFLOP rating. Amazon offers a remote performance-validation service for an extra $5,000 per rack.
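The validation in step 5 comes down to comparing each chip's measured throughput with the advertised 200 TFLOPs. The output format of the benchmark suite is not described here, so the sketch below assumes you have already collected one measured TFLOPs value per chip into a dictionary. The chip IDs, the function name, and the 5% tolerance are all hypothetical choices, not values from Amazon.

```python
RATED_TFLOPS = 200.0  # advertised peak per chip, from this article


def find_underperformers(measured, rated=RATED_TFLOPS, tolerance=0.05):
    """Return the IDs of chips measured below rated * (1 - tolerance).

    `measured` maps a chip ID to its measured TFLOPs. The 5% default
    tolerance is an illustrative assumption, not a vendor threshold.
    """
    floor = rated * (1 - tolerance)
    return sorted(chip for chip, tflops in measured.items() if tflops < floor)


if __name__ == "__main__":
    # Hypothetical readings keyed by made-up chip IDs.
    sample = {"chip-00": 198.4, "chip-01": 201.1, "chip-02": 183.0}
    print(find_underperformers(sample))
```

Any chip the function returns is a candidate for a support ticket under the warranty described in step 2.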
Deploying Trainium 3 in Your Existing Stack
Trainium works with the same Neuron SDK that powers AWS’s cloud instances. The SDK translates TensorFlow, PyTorch, and MXNet models into optimized binaries that run directly on the accelerator.
To integrate with on-prem orchestration tools, Amazon provides a Kubernetes device plugin. The plugin registers each chip as a GPU-like resource, letting you schedule AI jobs alongside traditional workloads.
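To show what scheduling against such a resource looks like, here is a minimal sketch that builds a Pod manifest in Python and prints it as JSON, which `kubectl apply -f` accepts. It assumes the on-prem plugin advertises the same extended-resource name, `aws.amazon.com/neuron`, that the Neuron device plugin uses on AWS-hosted Kubernetes; this article does not confirm that name for on-prem racks. The job name and image are placeholders.

```python
import json

# Assumption: same resource name as the Neuron device plugin on AWS.
NEURON_RESOURCE = "aws.amazon.com/neuron"


def training_pod(name: str, image: str, chips: int) -> dict:
    """Build a Pod manifest that requests `chips` accelerators."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "trainer",
                    "image": image,
                    # Extended resources are declared under limits;
                    # Kubernetes defaults the request to the same value.
                    "resources": {"limits": {NEURON_RESOURCE: chips}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical job name and container image.
    pod = training_pod("llm-train", "registry.example.com/trainer:latest", 4)
    print(json.dumps(pod, indent=2))
```

Because the accelerator is just another resource limit, the scheduler places the Pod only on nodes where the plugin reports enough free chips, and ordinary workloads without that limit are unaffected.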
For monitoring, the Neuron CloudWatch Agent streams metrics (utilization, temperature, error rates) to your existing CloudWatch dashboard. You can set alerts for power spikes or firmware anomalies.
Cost Comparison: Trainium 3 vs Nvidia H100 vs Google TPU v5
| Feature | Amazon Trainium 3 | Nvidia H100 | Google TPU v5 |
|---|---|---|---|
| Process node | 3 nm | 4 nm | 5 nm |
| Peak TFLOPs (FP16) | 200 TFLOPs | 180 TFLOPs | 150 TFLOPs |
| Power per chip | 350 W | 400 W | 300 W |
| Price per chip | $12,000 | $15,000 | $13,500 |
| Software stack | AWS Neuron SDK | CUDA 12 + cuDNN | TensorFlow XLA |
| On-prem support | Full warranty, firmware updates | Partner-only (e.g., Dell, HPE) | Google Cloud-only, limited OEM |
| Typical use case | LLM training, diffusion models | Mixed AI & HPC | Inference-heavy workloads |
So what does this mean for a mid-size AI startup? If you need raw training power and already use AWS services, Trainium 3 offers the best price-per-TFLOP and a seamless software path. Nvidia still leads in ecosystem breadth, but the higher power draw and price can hurt total-cost-of-ownership for on-prem deployments.
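The price-per-TFLOP claim can be checked directly against the table above. This short Python sketch uses only the table's price, peak FP16, and power figures; the names are illustrative, and peak figures say nothing about sustained throughput on a real model.

```python
# (price in USD, peak FP16 TFLOPs, watts) per chip, from the table above.
CHIPS = {
    "Amazon Trainium 3": (12_000, 200, 350),
    "Nvidia H100": (15_000, 180, 400),
    "Google TPU v5": (13_500, 150, 300),
}


def compare(chips=CHIPS):
    """Return (name, USD per peak TFLOP, peak TFLOPs per watt), cheapest first."""
    rows = [
        (name, price / tflops, tflops / watts)
        for name, (price, tflops, watts) in chips.items()
    ]
    return sorted(rows, key=lambda row: row[1])


if __name__ == "__main__":
    for name, usd_per_tflop, tflops_per_watt in compare():
        print(f"{name:18s} ${usd_per_tflop:6.2f}/TFLOP  {tflops_per_watt:.2f} TFLOPs/W")
```

On the table's numbers, Trainium 3 works out to $60 per peak TFLOP, against roughly $83 for the H100 and $90 for the TPU v5, and it also has the highest peak TFLOPs per watt of the three.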
Practical Takeaway: Who Should Use This Program?
✅ Enterprise AI labs that run large-scale model training and want to keep data on-prem for compliance.
✅ AI-focused cloud providers looking to diversify hardware beyond Nvidia and offer a differentiated service.
✅ Research institutions that already use AWS for data storage and want a low-latency link between storage and compute.
❌ Small startups with limited capital may find the $12,000 per chip price steep, since the one-rack minimum means an outlay of at least $576,000 (48 chips), unless they can secure volume discounts.
Potential Pitfalls and How to Mitigate Them
First, power consumption can strain older data-center designs. Amazon recommends upgrading to 48 V DC distribution for racks larger than 2 U. Second, firmware updates are delivered quarterly; missing a patch could expose you to security bugs. Use the automatic update flag in the Neuron SDK to stay current.
Finally, while the hardware is sold outright, you still pay for Neuron software licences ($0.02 per TFLOP-hour). For workloads that run continuously, factor this into your TCO calculation.
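To see how the licence fee weighs on TCO, here is a rough Python sketch for one rack. It rests on a loud assumption: that a "TFLOP-hour" is billed on each chip's peak 200 TFLOPs for every hour the chip runs. This article does not say how the fee is metered, so treat the result as an upper-bound reading, not a quote. Electricity, cooling, and staff are left out, and the function names and `utilisation` parameter are illustrative.

```python
HOURS_PER_YEAR = 24 * 365


def annual_licence_usd(chips=48, peak_tflops=200, rate=0.02, utilisation=1.0):
    """Yearly Neuron licence cost, assuming billing on peak TFLOPs per running hour."""
    return chips * peak_tflops * rate * HOURS_PER_YEAR * utilisation


def total_cost_usd(years, chips=48, price_per_chip=12_000, **licence_kwargs):
    """Hardware purchase plus licence fees over `years` (no power or staffing)."""
    hardware = chips * price_per_chip
    return hardware + years * annual_licence_usd(chips=chips, **licence_kwargs)


if __name__ == "__main__":
    print(f"Licence per year: ${annual_licence_usd():,.0f}")
    print(f"Three-year total: ${total_cost_usd(3):,.0f}")
    print(f"Three-year total at 50% utilisation: ${total_cost_usd(3, utilisation=0.5):,.0f}")
```

Under that reading, a rack running around the clock incurs roughly $1.68 million a year in licence fees, nearly three times the $576,000 chip purchase, so it is worth confirming the metering basis with AWS Sales before signing.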
Conclusion
Amazon’s AI chip sales program gives businesses a new way to own high-performance Trainium 3 accelerators without per-token cloud fees. By following the ordering steps, integrating the Neuron SDK, and keeping an eye on power draw and firmware updates, you can run LLM training and inference on-prem at a competitive cost. For organizations that already trust AWS for storage and security, buying Trainium may be the most logical next step in 2026.