The Clos topology has a 70-year history. It was first introduced by the American engineer Charles Clos, during his tenure at Bell Labs, in a 1953 paper titled "A Study of Non-Blocking Switching Networks." The paper laid the foundations of what became known simply as “Clos”, now a cornerstone of network design that is widely implemented in data centers.
However, when it comes to connecting AI clusters, there are two options, Clos and chassis, and they differ in notable ways. Being forced to choose between them always means a compromise. What AI workloads running in a data center really need is a solution that combines both.
What is Clos?
In a data center built to run AI applications, a Clos connects the endpoint devices: the servers or, more precisely, the network interface cards (NICs) installed in them. The entry point into the Clos crossbar is the top-of-rack (ToR) switch to which these servers are physically connected. Each ToR is in turn connected to several aggregation switches, called spines, fabrics, or tier-2 switches. Importantly, spines connect only to ToR switches, and each spine connects to every ToR switch in the network.
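The wiring rule above can be sketched in a few lines of Python. This is only an illustration: the function name `build_leaf_spine` and the device names are hypothetical, and a single link per ToR-spine pair is assumed.

```python
from itertools import product

def build_leaf_spine(num_tors: int, num_spines: int) -> set[tuple[str, str]]:
    """Return the links of a two-tier Clos: every ToR connects to every spine."""
    tors = [f"tor{i}" for i in range(num_tors)]
    spines = [f"spine{j}" for j in range(num_spines)]
    # Links exist only between tiers: no ToR-to-ToR and no spine-to-spine links.
    return set(product(tors, spines))

links = build_leaf_spine(num_tors=4, num_spines=2)
print(len(links))  # 4 ToRs x 2 spines = 8 links
```

Because every ToR reaches every spine, traffic between any two ToRs has as many equal-cost paths as there are spines.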
In an Ethernet Clos, the entry ToR switch must decide, for every flow, which spine will carry it. The decision is made by hashing algorithms that are supposed to spread flows across the spines as close to randomly as possible.
InfiniBand is another technology that implements the same topology, enhanced with additional information from an external “brain” (the subnet manager, in InfiniBand terms) that has full visibility of the entire crossbar.
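As a rough illustration of per-flow hashing, the sketch below maps a flow's 5-tuple to a spine index. It is a simplification under stated assumptions: `pick_spine` is a hypothetical helper, SHA-256 stands in for whatever hash a real switch uses, and the flows are made-up examples.

```python
import hashlib
from collections import Counter

def pick_spine(flow: tuple, num_spines: int) -> int:
    """Map a flow's 5-tuple to a spine index; every packet of a flow gets the same answer."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()  # assumption: stand-in for the switch's hash
    return int.from_bytes(digest[:4], "big") % num_spines

# Hypothetical flows: (src_ip, dst_ip, protocol, src_port, dst_port)
flows = [("10.0.0.1", "10.0.1.1", "udp", 49152 + i, 4791) for i in range(8)]
load = Counter(pick_spine(f, num_spines=4) for f in flows)
print(dict(load))  # flows per spine
```

With only a handful of flows, the split across spines is often uneven even when the hash itself is well behaved, which matters when a few large flows dominate the traffic.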
How is a chassis built?
A chassis is a systematic assembly of components, such as line cards and fabric cards, housed in a single enclosure. These components function collectively as one device, creating a robust and efficient network infrastructure. In essence, the concept of a “chassis” can be described as a “Clos in a box.”
Network topologies are only as good as their operational ease
Discussions of network topologies often come down to a comparison between the Clos and chassis models. Each has distinct attributes that make it the preferable choice in certain scenarios.
A chassis is easy to handle. It is very efficient when operating at its optimal working point, with a single control plane and a single IP address. You can choose the size that fits your deployment and just fill it with line cards. There is only one vendor to point a finger at when anything goes wrong, and the chassis provides the most predictable behavior thanks to built-in internal mechanisms, which are crucial for AI workloads.
The Ethernet Clos topology excels in the areas where the chassis model is less efficient. It is composed of small and typically lower-cost switches. It is very easy to grow and to change. Scaling is based on scale-out rather than scale-up. When building your network, you can also work with multiple vendors, which gives you better control of the supply chain and, eventually, the price.
The problem is that every element in the Clos is also an element in the network. For large deployments, you could end up with hundreds of devices to manage and a network between them that also needs to be managed, monitored, protected, troubleshot, and fixed.
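A back-of-the-envelope count shows how quickly the device total grows. The sketch below assumes a non-blocking two-tier Clos built from identical switches, with half of each ToR's ports facing servers and one uplink to each spine. The function name and the chosen numbers are illustrative only.

```python
import math

def two_tier_device_count(endpoints: int, radix: int) -> dict[str, int]:
    """Count switches in a non-blocking two-tier Clos built from radix-port switches."""
    down = radix // 2              # ToR ports facing servers
    spines = radix - down          # one uplink from each ToR to each spine
    max_endpoints = radix * down   # each spine port serves exactly one ToR
    if endpoints > max_endpoints:
        raise ValueError("too many endpoints for two tiers; a third tier is needed")
    tors = math.ceil(endpoints / down)
    return {"tors": tors, "spines": spines, "total": tors + spines}

print(two_tier_device_count(endpoints=2048, radix=64))
# {'tors': 64, 'spines': 32, 'total': 96}
```

Under these assumptions, 2,048 endpoints already require 96 separately managed switches. Larger clusters need a third tier, and the count climbs into the hundreds.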
Being forced to choose between the two always means a compromise. Service providers lean towards the performance of a chassis, while data centers want the scale of Clos.
AI workloads running in a data center need characteristics from both of these approaches.
What is the Distributed Disaggregated Chassis (DDC) operational model?
The Distributed Disaggregated Chassis (DDC) model is in fact a chassis without a metal enclosure.
DDC is built using the same components found in a chassis. In this case, it is not built into a single enclosure but distributed into several stand-alone devices that act as “line cards” and “spine (fabric) cards.” This disaggregation attribute enables these stand-alone devices to be purchased as standard white boxes, separate from the software that runs them.
So far this looks like a typical Clos, but DDC adds another element: the brain of the system. This is where the white boxes are controlled and where the shared “knowledge” of the network is kept. As a result, the DDC behaves as a unified element in two key ways:
- it can be managed as a single network device.
- it handles traffic flows very predictably thanks to its segmentation and reassembly actions, the same as in a chassis.
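Segmentation and reassembly can be sketched as follows. The article does not describe the mechanism in detail, so this is a simplified model under stated assumptions: packets are cut into fixed-size cells, the cells are sprayed round-robin across the fabric devices, and the egress side puts them back in order by sequence number. All function names and sizes are hypothetical.

```python
def segment(packet: bytes, cell_size: int) -> list[tuple[int, bytes]]:
    """Cut a packet into (sequence number, payload) cells of at most cell_size bytes."""
    offsets = range(0, len(packet), cell_size)
    return [(seq, packet[off:off + cell_size]) for seq, off in enumerate(offsets)]

def spray(cells: list[tuple[int, bytes]], num_fabrics: int) -> list[list[tuple[int, bytes]]]:
    """Distribute cells round-robin over the fabric devices (assumed policy)."""
    lanes: list[list[tuple[int, bytes]]] = [[] for _ in range(num_fabrics)]
    for seq, payload in cells:
        lanes[seq % num_fabrics].append((seq, payload))
    return lanes

def reassemble(lanes: list[list[tuple[int, bytes]]]) -> bytes:
    """Collect cells from all fabrics and restore the original byte order."""
    cells = [cell for lane in lanes for cell in lane]
    return b"".join(payload for _, payload in sorted(cells, key=lambda c: c[0]))

packet = bytes(i % 256 for i in range(1000))
lanes = spray(segment(packet, cell_size=256), num_fabrics=3)
print([len(lane) for lane in lanes])  # cells per fabric: [2, 1, 1]
assert reassemble(lanes) == packet
```

In this model the load on each fabric device depends only on the number of cells, not on which flow they belong to. That is one way to read the predictability claim, in contrast with per-flow hashing, where a few large flows can land on the same path.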
So DDC offers the best of both worlds, Clos and chassis. On the one hand, DDC brings the Clos attributes of flexibility, scale, and multi-vendor sourcing. On the other hand, it brings the performance and behavior of a chassis, with a single point of management and a single control plane that improve the performance of the entire interconnect solution.
The best of Clos and chassis: The Distributed Disaggregated Chassis
In the context of AI networking, the rigidity of the chassis model limits its scalability and its adaptability to changing network demands. The distributed nature of the Clos topology addresses these limitations, making it well suited to environments where demands change. The Clos model also allows for greater vendor independence and customer control, largely because it is compatible with standard white box designs that are not tied to any specific vendor, which reduces vendor lock-in.
The optimal solution for AI networking is a hybrid approach that combines the best of both worlds: a distributed and disaggregated form of a chassis. This concept aligns with the principles set out by the Open Compute Project (OCP), which defined this approach as the Distributed Disaggregated Chassis (DDC). The DDC model blends the robustness and management efficiency of a chassis with the scalability and flexibility of a distributed Clos topology.