In August 2022, the full version of the Compute Express Link (CXL) 3.0 standard was announced, and major semiconductor manufacturers are already planning to implement it in upcoming CPUs for data center processing. Prior versions of the standard were built on the physical layer of earlier PCIe generations; CXL 3.0 continues this trend and takes advantage of the bandwidth and performance improvements in PCIe 6.0. CXL 3.0 will continue to enable areas like AI in the data center, the use of accelerator products in servers, and further expansion of cloud computing.
While new products implementing CXL 3.0 as a standard integrated feature are not projected to reach the market until 2025, system architectures taking advantage of the standard are being prepared to support these products. New system architectures involve connections between CPUs and high-speed peripherals, such as memories and accelerators for AI and other HPC tasks.
What’s New in CXL 3.0
The Compute Express Link (CXL) 3.0 standard brings several improvements over prior generations of the CXL standard. Some of these improvements come from the PCIe physical layer underneath a CXL connection, while others exist in the firmware/software layer that defines the CXL protocol:
- Bandwidth and latency: CXL 3.0 takes advantage of the higher bandwidth of PCIe 6.0 (64 GT/s per lane), which uses 4-level pulse amplitude modulation (PAM-4) to raise the data rate and forward error correction (FEC) to recover data at the receiver end of a CXL channel.
- Improved memory sharing: CXL 3.0 allows memory pool sharing across multiple peripherals and processors that share a connection to a CXL switch.
- Backward compatibility: CXL 3.0 is fully backward compatible with earlier CXL standards. Compatibility is achieved by downgrading the host or device to the earliest version of CXL present across the shared fabric.
- Fabric architecture: The physical layer in a CXL 3.0 link is highly scalable to a large number of processors and peripherals through the use of multi-tiered switching and a fabric-based physical architecture.
- Increased scaling: The CXL 3.0 protocol can support up to 4,096 nodes within the fabric architecture.
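The bandwidth and scaling figures in the list above can be checked with quick arithmetic. The following is a minimal sketch, not a link model: it computes raw per-direction numbers only and ignores FLIT, FEC, and CRC overhead. The function names are hypothetical, and the 12-bit identifier width is an assumption chosen because 2**12 matches the 4,096-node figure.

```python
# Back-of-the-envelope arithmetic for a PCIe 6.0 / CXL 3.0 physical layer.
# Raw figures only: protocol overhead (FLIT framing, FEC, CRC) is not modeled.

TRANSFER_RATE_GT_S = 64    # PCIe 6.0 rate per lane, as stated in the text
BITS_PER_PAM4_SYMBOL = 2   # PAM-4 has 4 amplitude levels: log2(4) = 2 bits


def symbol_rate_gbaud(rate_gt_s: float = TRANSFER_RATE_GT_S) -> float:
    """Symbol rate on the wire for one PAM-4 lane."""
    return rate_gt_s / BITS_PER_PAM4_SYMBOL


def raw_bandwidth_gbytes_s(lanes: int, rate_gt_s: float = TRANSFER_RATE_GT_S) -> float:
    """Raw bandwidth per direction in GB/s (one transfer carries one bit per lane)."""
    return lanes * rate_gt_s / 8


def max_nodes(id_bits: int = 12) -> int:
    """Number of distinct node identifiers for a given ID width (assumed 12 bits)."""
    return 2 ** id_bits


if __name__ == "__main__":
    print(f"symbol rate: {symbol_rate_gbaud():.0f} GBaud per lane")
    for lanes in (1, 4, 8, 16):
        print(f"x{lanes:<2} raw: {raw_bandwidth_gbytes_s(lanes):6.1f} GB/s per direction")
    print(f"addressable nodes: {max_nodes()}")
```

Under these assumptions, a x16 link works out to 128 GB/s of raw bandwidth in each direction, and PAM-4 lets each lane reach 64 GT/s while signaling at 32 GBaud.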
Peripherals that are compatible with CXL 3.0 are compatible with the physical layer specifications in PCIe 6.0. In other words, the physical layer (pins, connectors, electrical characteristics, etc.) of a PCIe connection is used in a CXL connection. In terms of the PCB layout for these systems, the layout and routing rules used in PCIe 6.0 are also used for CXL 3.0. The difference is that the data transmitted over those traces follows the CXL protocol once the CXL connection is established.
Keep in mind that while the physical and electrical interfaces are compatible, CXL and PCIe use different protocols at the software layer. Thus, the driver and firmware support for CXL and PCIe will differ. For cores instantiated in FPGA fabric, vendor IP is needed to implement the logic and software layers that support the protocol.
Towards Composable Architectures
CXL 3.0’s scalability enables composable architectures in the data center, and not only between high-performance servers and networking equipment. The traditional server topology is a tree, where the CPU acts as the host for all other peripherals, including co-processors and accelerators. CXL 3.0 enables a composable architecture, also known as composable infrastructure: a non-tree topology that can take advantage of the shared resources outlined above.
In a composable architecture, a high-level system can dynamically "compose" or construct the resources it needs from pools of disaggregated, abstracted resources, such as compute, storage, and networking components. Each server is broken down into its various components (CPU, memory, networking, etc.). These components are then pooled and shared across compute devices so that they can be dynamically assigned or reassigned to different workloads on the fly.
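To make the tree versus non-tree distinction concrete, the sketch below treats a topology as an undirected graph and checks whether it is a tree, meaning it is connected and has no redundant links. This is an illustration only: the device names and link lists are hypothetical and do not represent any real CXL configuration or API.

```python
from collections import deque


def is_tree(nodes, links):
    """Return True if the undirected topology is connected and has no redundant links."""
    nodes = set(nodes)
    # A tree with N nodes has exactly N - 1 links.
    if len(links) != len(nodes) - 1:
        return False
    adjacency = {n: set() for n in nodes}
    for a, b in links:
        adjacency[a].add(b)
        adjacency[b].add(a)
    # Breadth-first search from an arbitrary node to confirm connectivity.
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in adjacency[current] - seen:
            seen.add(neighbor)
            queue.append(neighbor)
    return seen == nodes


# Hypothetical classic server: the CPU is the host (root) for every peripheral.
classic_nodes = ["cpu", "accelerator", "nic", "memory_expander"]
classic_links = [("cpu", "accelerator"), ("cpu", "nic"), ("cpu", "memory_expander")]

# Hypothetical fabric: two hosts reach a shared memory pool through two switches.
fabric_nodes = ["cpu0", "cpu1", "switch0", "switch1", "memory_pool", "accelerator"]
fabric_links = [
    ("cpu0", "switch0"), ("cpu1", "switch1"), ("switch0", "switch1"),
    ("switch0", "memory_pool"), ("switch1", "memory_pool"), ("switch0", "accelerator"),
]

if __name__ == "__main__":
    print("classic server is a tree:", is_tree(classic_nodes, classic_links))
    print("fabric is a tree:", is_tree(fabric_nodes, fabric_links))
```

In the fabric example, the memory pool can be reached through either switch, which is the kind of redundant path a strict tree does not allow.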
For instance, if a certain workload needs more storage but doesn't need much CPU power, a composable architecture allows just the right amount of storage and CPU power to be allocated to that workload. Similarly, if a workload needs more compute resources for a temporary period (like for a compute-intensive task), additional CPU resources can be assigned to that workload just for that period. Once the task is done, the resources can be freed and returned to the pool for use by other workloads.
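The allocate-and-return cycle described above can be sketched as a simple pool with bookkeeping. This is a minimal illustration under stated assumptions, not how a CXL fabric manager works: the `ResourcePool` class, its method names, and the resource keys are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ResourcePool:
    """Tracks free capacity per resource type, e.g. {"cpu_cores": 64, "storage_tb": 100}."""
    free: dict
    allocations: dict = field(default_factory=dict)

    def compose(self, workload: str, request: dict) -> None:
        """Reserve resources for a workload, or raise if the pool cannot satisfy it."""
        if workload in self.allocations:
            raise ValueError(f"{workload} already holds an allocation")
        # Check the whole request first so a failure leaves the pool unchanged.
        for kind, amount in request.items():
            if self.free.get(kind, 0) < amount:
                raise RuntimeError(f"not enough {kind} in pool")
        for kind, amount in request.items():
            self.free[kind] -= amount
        self.allocations[workload] = dict(request)

    def release(self, workload: str) -> None:
        """Return a workload's resources to the pool for use by other workloads."""
        for kind, amount in self.allocations.pop(workload).items():
            self.free[kind] += amount


if __name__ == "__main__":
    pool = ResourcePool(free={"cpu_cores": 64, "storage_tb": 100})
    # Storage-heavy workload: lots of storage, little CPU.
    pool.compose("archive", {"cpu_cores": 2, "storage_tb": 40})
    # Temporary compute-intensive task.
    pool.compose("batch", {"cpu_cores": 48})
    print("during batch:", pool.free)
    pool.release("batch")
    print("after batch: ", pool.free)
```

Validating the full request before changing any counters keeps the pool consistent when a request cannot be met, which matters once several workloads compete for the same resources.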
This allows much more flexible and efficient use of data center resources. Normally, this kind of resource sharing is handled in software and performed between different servers; parallel computing is the classic case. With CXL 3.0, this type of disaggregation will be enabled across compute resources in a data center as well as within a single server. This is a major change to the typical server and data center architecture, and it is only enabled by a protocol like CXL 3.0.
Design teams building advanced data center architecture can design and evaluate system architecture and capabilities with the complete set of system analysis tools from Cadence. Only Cadence offers a comprehensive set of circuit, IC, and PCB design tools for any application and any level of complexity. Cadence PCB design products also integrate with a multiphysics field solver for thermal analysis, including verification of thermally sensitive chip and package designs.