Google's Ironwood TPU Powers Advanced AI Models

Ironwood: The Engine Behind Google’s Next AI Breakthroughs

Google has unveiled Ironwood, its seventh-generation Tensor Processing Unit (TPU), a custom chip intended to advance the company’s artificial intelligence capabilities. The new architecture is built to meet the demands of Google’s most advanced Gemini models, and in particular to run simulated reasoning, which Google refers to as “thinking.”

Google argues that its advanced AI models and its specialized infrastructure reinforce each other. Ironwood reflects that view: it is meant to raise inference speeds and extend the context windows available to advanced models. Google describes Ironwood as its most advanced and scalable TPU to date, and as the foundation for AI systems that can autonomously gather data and produce results on a user’s behalf. That proactive, user-focused approach is what Google calls “agentic AI,” and the company positions Ironwood as the hardware driving this era of inference.

Performance Unleashed: Ironwood’s Impressive Specs

Ironwood delivers substantially higher throughput than earlier Google TPUs. Google plans to deploy the chips at large scale, with up to 9,216 liquid-cooled Ironwood chips operating together as a single cluster. An enhanced Inter-Chip Interconnect (ICI) links these arrays, providing high-bandwidth, low-latency communication across the system.

Both Google’s internal teams and cloud developers will have access to this processing capacity. Ironwood will be available in two configurations: a 256-chip server for smaller workloads and a 9,216-chip cluster for the most demanding AI workloads.

A full Ironwood pod delivers 42.5 exaflops of inference compute. Each chip has a peak throughput of 4,614 TFLOPS, which Google describes as a significant advance over previous generations. Each chip also carries 192GB of memory, six times as much as Trillium, and memory bandwidth has risen 4.5 times to 7.2 TBps (terabytes per second).
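The pod-level figure follows from the per-chip numbers, and the arithmetic is easy to check. The sketch below is only illustrative: it uses the figures quoted above, assumes the pod total is a simple sum of per-chip peaks (perfect scaling, which real workloads will not reach), and uses variable names invented for this example.

```python
# Figures quoted in the article; all names here are illustrative.
CHIPS_PER_FULL_POD = 9_216
CHIPS_PER_SMALL_CONFIG = 256
PEAK_TFLOPS_PER_CHIP = 4_614
MEMORY_GB_PER_CHIP = 192
MEMORY_GAIN_OVER_TRILLIUM = 6      # the "sixfold" memory increase

TFLOPS_PER_EXAFLOP = 1_000_000     # 1 exaflop = 10^18 FLOPS = 10^6 TFLOPS


def aggregate_exaflops(chips: int, tflops_per_chip: float) -> float:
    """Sum of per-chip peaks, assuming perfect scaling (an upper bound)."""
    return chips * tflops_per_chip / TFLOPS_PER_EXAFLOP


full_pod = aggregate_exaflops(CHIPS_PER_FULL_POD, PEAK_TFLOPS_PER_CHIP)
small_config = aggregate_exaflops(CHIPS_PER_SMALL_CONFIG, PEAK_TFLOPS_PER_CHIP)

# Total memory across a full pod, in decimal terabytes (1 TB = 1,000 GB).
full_pod_memory_tb = CHIPS_PER_FULL_POD * MEMORY_GB_PER_CHIP / 1_000

# Per-chip memory that the sixfold claim implies for Trillium.
implied_trillium_memory_gb = MEMORY_GB_PER_CHIP / MEMORY_GAIN_OVER_TRILLIUM

print(f"Full pod (9,216 chips): {full_pod:.1f} exaflops")
print(f"256-chip configuration: {small_config:.2f} exaflops")
print(f"Full pod memory:        {full_pod_memory_tb:,.0f} TB")
print(f"Implied Trillium memory per chip: {implied_trillium_memory_gb:.0f} GB")
```

Multiplying 9,216 chips by 4,614 TFLOPS gives roughly 42.5 exaflops, which matches the pod figure Google quotes, so the headline number is a peak aggregate rather than a measured result.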

Contextualizing the Power: Ironwood’s Place in the AI Landscape

Comparing AI chip performance is difficult because vendors measure it in different ways. Google uses FP8 precision as the benchmark for Ironwood. Its claim that Ironwood “pods” deliver 24 times the performance of comparable segments of leading supercomputers should be read cautiously, because many supercomputers lack native FP8 hardware support.

Google’s direct performance comparisons did not include TPU v6 (Trillium). The company does say that Ironwood delivers twice the performance per watt of v6. Google describes Ironwood as the successor to TPU v5p, whereas Trillium followed the lower-performance TPU v5e. At FP8 precision, Trillium peaked at about 918 TFLOPS.
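Although Google did not publish a direct Ironwood-versus-Trillium throughput comparison, the two peak figures in this article allow a rough one. The sketch below assumes that Ironwood’s 4,614 TFLOPS figure is an FP8 number (consistent with Google using FP8 as its benchmark precision) and treats Trillium’s 918 TFLOPS as approximate. The names are illustrative, and the result is a ratio of peak ratings, not a benchmark.

```python
# Peak per-chip figures quoted in the article; names are illustrative.
IRONWOOD_PEAK_TFLOPS = 4_614        # assumed here to be an FP8 figure
TRILLIUM_PEAK_FP8_TFLOPS = 918      # approximate, per the article
CHIPS_PER_FULL_POD = 9_216

# Ratio of peak ratings: how many Trillium chips match one Ironwood chip.
per_chip_ratio = IRONWOOD_PEAK_TFLOPS / TRILLIUM_PEAK_FP8_TFLOPS

# Trillium chips needed to equal a full Ironwood pod's peak rating,
# ignoring interconnect, memory, and scaling effects.
trillium_equivalent_chips = CHIPS_PER_FULL_POD * per_chip_ratio

print(f"Per-chip peak ratio: {per_chip_ratio:.2f}x")
print(f"Trillium chips per Ironwood pod (peak only): {trillium_equivalent_chips:,.0f}")
```

On these numbers, one Ironwood chip has about five times Trillium’s peak FP8 rating, which sits alongside, but is not the same claim as, the twofold performance-per-watt improvement Google cites.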

The Road Ahead: Ironwood and the Future of AI

Despite the complexities of benchmarking, Ironwood marks an important advance for Google’s AI systems. It promises greater speed and efficiency, building on the infrastructure that enabled rapid progress in models such as Gemini 2.5, which runs on older TPUs.

Google expects Ironwood’s improved inference performance and efficiency to enable further AI advances over the next year. By supplying the compute needed for advanced models and agentic functionality, Ironwood is central to Google’s “age of inference” strategy, which aims to make AI a proactive, essential part of digital life.