Google Cloud Next’s top 4 news nuggets you need to know

  • Google Cloud unveils Ironwood TPU, a 7th-gen AI accelerator chip that scales to 42.5 exaflops per cluster and doubles the power efficiency of its predecessor
  • Gemini 2.5 Flash debuts as a new workhorse AI model
  • Cloud WAN opens Google’s 2-million-mile global fiber network to customers

GOOGLE CLOUD NEXT, LAS VEGAS – Surprise! Google Cloud is going hard on AI, connectivity and interoperability this year at its annual Cloud Next event. But the focus has shifted from centralized training to distributed AI and inference.

As usual, there was a flood of headlines coming out of the event, but here are four biggies you really need to know about. 

Ironwood TPU

Even though it was expected, Google still wowed the crowd when it unveiled its seventh-generation Tensor Processing Unit (TPU), dubbed Ironwood. The AI accelerator chip can be deployed in clusters as large as 9,216 chips and support computation loads of up to 42.5 exaflops. That’s about 25 times the 1.7 exaflops supported by today’s largest supercomputer, as Amin Vahdat, Google Cloud’s VP and GM for ML Systems and Cloud AI, noted on a press briefing call.

All that power is designed to deliver the performance needed as AI moves into the age of inference.

“The industry is now entering a new chapter defined by inference, its quality and efficiency,” Vahdat said. “It’s no longer about the data put into the model but what the model can do with the data after it’s been trained.”

Beyond pure horsepower, Vahdat noted that Ironwood is 2x more power efficient than Google’s sixth-generation TPU (Trillium) and nearly 30x more power efficient than the company’s first TPU.
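For readers who want to sanity-check those numbers, here is a quick back-of-the-envelope sketch. It takes the quoted figures at face value (the briefing did not say whether the supercomputer comparison uses the same numeric precision), and the variable names are ours, not Google's.

```python
# Figures as quoted in the press briefing.
POD_CHIPS = 9_216
POD_EXAFLOPS = 42.5
SUPERCOMPUTER_EXAFLOPS = 1.7

# Implied per-chip throughput: 1 exaflop = 1,000 petaflops.
per_chip_petaflops = POD_EXAFLOPS * 1_000 / POD_CHIPS

# How many times larger the full cluster is than the quoted supercomputer.
pod_vs_supercomputer = POD_EXAFLOPS / SUPERCOMPUTER_EXAFLOPS

# Efficiency claims: 2x over Trillium and nearly 30x over the first TPU,
# which implies Trillium is roughly 15x the first TPU.
IRONWOOD_VS_TRILLIUM = 2.0
IRONWOOD_VS_FIRST_TPU = 30.0
trillium_vs_first_tpu = IRONWOOD_VS_FIRST_TPU / IRONWOOD_VS_TRILLIUM

print(f"Per chip: about {per_chip_petaflops:.1f} petaflops")
print(f"Cluster vs. supercomputer: about {pod_vs_supercomputer:.0f}x")
print(f"Trillium vs. first TPU: about {trillium_vs_first_tpu:.0f}x")
```

In other words, the gap to the supercomputer is a large constant multiple rather than anything exponential, and each chip works out to a few petaflops.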

Gemini 2.5 Flash

Google Cloud continued with its theme of efficient AI with the release of Gemini 2.5 Flash. Flash is a streamlined version of its Gemini 2.5 model, designed to be more affordable for everyday work such as generating summaries of documents or news in real time, basic coding tasks and function calling, Vahdat said.

It’s basically the workhorse version of Google Cloud’s top-tier Gemini 2.5 Pro model.

Flash “automatically adjusts processing time ('thinking budget') based on query complexity, enabling faster answers for simple requests,” Google Cloud’s Jason Gelman wrote in a blog post. “You also gain granular control over this budget, allowing explicit tuning of the speed, accuracy, and cost balance for your specific needs. This flexibility is key to optimizing Flash performance in high-volume, cost-sensitive applications.”
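To make the "granular control" idea concrete, here is a minimal sketch of how an application might decide what budget to request for different kinds of work. Everything in it is hypothetical: the task names, the budget values (in abstract units) and the `plan_request` helper are our own illustration, and the code does not call any Google API.

```python
from dataclasses import dataclass

# Hypothetical per-task budget caps. The numbers are illustrative only.
BUDGET_BY_TASK = {
    "summarize": 0,        # simple, high-volume work: answer as fast as possible
    "function_call": 256,
    "coding": 2048,        # harder work: allow more reasoning
}
DEFAULT_BUDGET = 512


@dataclass(frozen=True)
class RequestPlan:
    task: str
    thinking_budget: int


def plan_request(task: str, latency_sensitive: bool = False) -> RequestPlan:
    """Pick a budget cap for a task, halving it when latency matters most."""
    budget = BUDGET_BY_TASK.get(task, DEFAULT_BUDGET)
    if latency_sensitive:
        budget //= 2
    return RequestPlan(task=task, thinking_budget=budget)


print(plan_request("summarize"))
print(plan_request("coding", latency_sensitive=True))
```

The point of the sketch is the trade-off, not the numbers: cheap, frequent requests get little or no budget, while harder ones are allowed to spend more.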

Both Ironwood and Gemini 2.5 Flash provide Google with strategic sales pitches around cost and efficiency – two factors likely to be squarely in focus for companies dealing with the fallout of the new U.S. tariffs.

As New Street Research’s Dan Salmon pointed out in a note to investors ahead of the event, companies might be “incrementally reluctant to sign contracts in a downturn,” but Google may be able to move enterprises away from its competitors “with promises of efficiency and net cost savings.”

Oh, and one more little Gemini tidbit: Google announced its family of Gemini models will now be available on Google Distributed Cloud, bringing the power of AI on-prem. That’s important for companies like telcos and healthcare organizations, which have strict data residency and privacy requirements.

Cloud WAN

In an I-can't-believe-they-haven't-done-this-before move, Google Cloud announced it is opening its 2 million-mile fiber network to customers. Productized as "Cloud WAN," the new offering is meant to support the connectivity needs of global clients as they adopt AI.

"Our customers can now tap into the same planet-scale network that powers Google's globally available services, including Gmail, YouTube and Search," Vahdat said.

The fully managed service will provide 40% better network performance than the public internet and more than 200 points of presence across the globe, Angelo Libertucci, Google Cloud's Global Industry Lead for Telecom, added on a call with press.

But telcos aren't being cut out of the picture. The service comes with a Verified Peering Provider program that allows customers to tap local ISPs for additional availability. And starting later this year, Lumen will team with Google Cloud to provide last-mile handoff between Cloud WAN and data centers, offices, warehouses and airports.

ADK & A2A

OK, so we're ending this with a two-fer. First up was the announcement of Google Cloud's new ADK – no, not an application development kit but an agent development kit. The idea is to make it easier for developers to build, test and run interoperable AI agents.

"With ADK, customers can easily build a multi-agent system in under 100 lines of code, precisely steer agent behavior with creative reasoning and strict guardrails, work with a preferred model ... and deploy at enterprise scale with Vertex AI," Vahdat said.

To help facilitate that interoperability, Google Cloud announced its Agent2Agent (A2A) protocol. A2A will allow agents across the enterprise ecosystem to connect with one another, irrespective of which vendor or framework they’re built on. Google Cloud is also supporting the Model Context Protocol.
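As a rough illustration of what vendor-neutral agent interoperability means, the toy sketch below lets two stand-in agents from hypothetical vendors accept work through one shared message shape. It is not the A2A wire format and it does not use ADK; the `AgentRegistry` class, the skill names and the agents are all assumptions made up for this example.

```python
from typing import Callable, Dict, List

# Tasks and results are plain dicts, so agents built on different
# frameworks only need to agree on the message shape.
Handler = Callable[[dict], dict]


class AgentRegistry:
    """Toy directory that routes a task to whichever agent offers a skill."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def register(self, skill: str, handler: Handler) -> None:
        self._handlers[skill] = handler

    def skills(self) -> List[str]:
        return sorted(self._handlers)

    def send(self, skill: str, payload: dict) -> dict:
        handler = self._handlers.get(skill)
        if handler is None:
            return {"status": "rejected", "reason": f"no agent offers {skill!r}"}
        return {"status": "completed", "output": handler(payload)}


# Stand-in agents from two hypothetical vendors.
def vendor_a_trip_planner(payload: dict) -> dict:
    return {"itinerary": f"{payload['origin']} -> {payload['destination']}"}


def vendor_b_issue_monitor(payload: dict) -> dict:
    open_tickets = [t for t in payload["tickets"] if t["open"]]
    return {"open_count": len(open_tickets)}


registry = AgentRegistry()
registry.register("plan_trip", vendor_a_trip_planner)
registry.register("monitor_issues", vendor_b_issue_monitor)

result = registry.send("plan_trip", {"origin": "SFO", "destination": "LAS"})
print(result)
```

The caller never needs to know which vendor built the agent behind a skill, which is the property a shared protocol is meant to provide.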

ADK also comes with a companion Agent Garden, which features examples and sample agents, as well as an AI Agent Marketplace with prebuilt options from the likes of SAP, Oracle, NetApp and many others. 

"2025 will be a transition year where Generative AI shifts from answering single questions to solving complex problems through agentic systems," Vahdat added. "These agents will be able to carry out a range of tasks, from planning trips to monitoring customer issues and managing complex workflows."

