Networking is the ‘missing middle’ that’s slowing down AI

  • Networking is the overlooked factor for AI data centers, crucial for moving large data loads
  • AI workloads strain existing networks, slowing training and wasting expensive GPU resources
  • Download our free report to find out how optimizing networks for AI can boost performance by up to 30% and support growing data demands

What are the constraints on AI data centers? GPUs, of course. Electrical power. Water for cooling.

You forgot something — but don’t feel bad. Most people do. AI also requires networking to move around the massive data needed for training and inference. Today’s networks are straining under the burden of AI data loads, which are getting bigger and bigger.

Networking is the forgotten “missing middle” for AI, said Joel Moses, distinguished engineer and CTO for systems and platforms at F5, which specializes in application security, performance and multi-cloud management.

“Cloud providers have done a brilliant job of making the network an afterthought, whereas the people who adopt this technology are faced with having to think about the network again,” Moses said.

He added, “The technology is moving so fast that networking seems like a solved problem. But networking is actually the thing that’s potentially keeping your training speed lower.”

F5 sponsored the latest free Fierce Network Research report and webinar: “AI and the Network: Optimizing Network Design and Operations to Meet AI Demands.” Download the report and watch the webinar at the link.

Network constraints cause bottlenecks that leave graphics processing units (GPUs) running under capacity. Those GPUs are astronomically priced, far too expensive to under-utilize: Nvidia’s flagship H100 chips sell for roughly $25,000 or more each, and Meta alone will acquire 350,000 H100s, valued at more than $10 billion.

AI applications require enormous amounts of data, and that volume is growing fast. JLL projects overall data center storage capacity will grow from 10.1 zettabytes (ZB) in 2023 to 21 ZB in 2027, a compound annual growth rate of 18.5%.

To explore how networks are adapting to carry these extra data loads, we talked with leaders in AI networking from F5, as well as from network operators Lumen and AT&T, and from Vultr, an up-and-coming cloud provider.

Read our report to learn about:

  • What AI strain on network infrastructure means for operators
  • How F5, Lumen, AT&T and Vultr are optimizing their networks for AI
  • Massive fiber buildouts supporting AI workloads
  • How cloud-native software impacts network performance and AI operations
  • Strategies to overcome geographic data transfer challenges for AI

Also, learn how optimizing networks can improve AI performance by up to a whopping 30% — and maybe contribute to saving the planet from climate change.

And once you’ve read the report, you’ll know why I’m concluding this article with a list of the greatest driving songs: