Arista Networks (NYSE: ANET), a leading provider of cloud and AI networking solutions, today announced the Arista Etherlink™ AI platforms, designed to deliver optimal network performance for the most demanding AI workloads, including training and inferencing.
Powered by new AI-optimized Arista EOS features, the Arista Etherlink AI portfolio supports AI clusters ranging from thousands to hundreds of thousands of XPUs with highly efficient single-tier and 2-tier network topologies. These topologies deliver superior application performance compared to more complex multi-tier networks, while offering advanced monitoring capabilities, including flow-level visibility.
“The network is core to successful job completion outcomes in AI clusters,” said Alan Weckel, Founder and Technology Analyst for 650 Group. “The Arista Etherlink AI platforms offer customers the ability to have a single 800G end-to-end technology platform across front-end, training, inference, and storage networks. Customers benefit from leveraging the same well-proven Ethernet tooling, security, and expertise they have relied on for decades while easily scaling up for any AI application.”
Arista’s Etherlink AI Platforms
The 7060X6 AI Leaf switch family employs Broadcom Tomahawk 5® silicon, with a capacity of 51.2 Tbps and support for 64 800G or 128 400G Ethernet ports.
The 7800R4 AI Spine is the 4th generation of Arista’s flagship 7800 modular systems. It implements the latest Broadcom Jericho3-AI processors with an AI-optimized packet pipeline and offers non-blocking throughput with the proven virtual output queuing architecture. The 7800R4-AI supports up to 460 Tbps in a single chassis, which corresponds to 576 800G or 1152 400G Ethernet ports.
The 7700R4 AI Distributed Etherlink Switch (DES) supports the largest AI clusters, offering customers massively parallel distributed scheduling and congestion-free traffic spraying based on the Jericho3-AI architecture. The 7700 represents the first in a new series of ultra-scalable, intelligent distributed systems that can deliver the highest consistent throughput for very large AI clusters.
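The capacity figures quoted above for the 7060X6 and 7800R4 follow from simple port arithmetic: port count multiplied by port speed. The short Python sketch below reproduces that calculation as a sanity check. The function name `aggregate_tbps` and the `QUOTED_CONFIGS` table are illustrative names only, and the calculation counts one direction of traffic at nominal line rate, which is an assumption about how the headline figures are derived.

```python
# Minimal sketch: check that the quoted port counts match the quoted capacities.
# Names here (aggregate_tbps, QUOTED_CONFIGS) are illustrative, not Arista tooling.

def aggregate_tbps(ports: int, speed_gbps: int) -> float:
    """Return aggregate one-direction bandwidth in Tbps for a port configuration."""
    return ports * speed_gbps / 1000

# (platform, port count, port speed in Gbps), taken from the figures in the text.
QUOTED_CONFIGS = [
    ("7060X6", 64, 800),
    ("7060X6", 128, 400),
    ("7800R4", 576, 800),
    ("7800R4", 1152, 400),
]

for platform, ports, speed in QUOTED_CONFIGS:
    total = aggregate_tbps(ports, speed)
    print(f"{platform}: {ports} x {speed}G = {total} Tbps")
```

Under these assumptions, both 7060X6 configurations come to 51.2 Tbps and both 7800R4 configurations come to 460.8 Tbps, consistent with the "up to 460 Tbps" figure quoted for a single chassis.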
A single-tier network topology with Etherlink platforms can support over 10,000 XPUs. With a 2-tier network, Etherlink can support more than 100,000 XPUs. Minimizing the number of network tiers is essential for optimizing AI application performance, reducing the number of optical transceivers, lowering cost and improving reliability.
All Etherlink switches support the emerging Ultra Ethernet Consortium (UEC) standards, which are expected to provide additional performance benefits when UEC NICs become available in the near future.
“Broadcom is a firm believer in the versatility, performance, and robustness of Ethernet, which makes it the technology of choice for AI workloads,” said Ram Velaga, senior vice president and general manager, Core Switching Group, Broadcom. “By leveraging industry-leading Ethernet chips such as Tomahawk 5 and Jericho3-AI, Arista provides the ideal accelerator-agnostic solution for AI clusters of any shape or size, outperforming proprietary technologies and providing flexible options for fixed, modular, and distributed switching platforms.”
Arista EOS Smart AI Suite
The rich features of Arista EOS and CloudVision complement these new networking-for-AI platforms. The software suite's AI-for-networking, security, segmentation, visibility, and telemetry features bring AI-grade robustness and protection to high-value AI clusters and workloads. For example, the Arista EOS Smart AI suite now integrates with SmartNIC providers to deliver advanced RDMA-aware load balancing and QoS. Arista AI Analyzer, powered by Arista AVA™, automates configuration and improves visibility and intelligent performance analysis of AI workloads.
“Arista’s competitive advantage consistently comes down to our rich operating system and broad product portfolio to address AI networks of all sizes,” said Hugh Holbrook, Chief Development Officer, Arista Networks. “Innovative AI-optimized EOS features enable faster deployment, reduce configuration issues and deliver flow-level performance analysis, and improve AI job completion times for any size AI cluster.”