2023-05-24

AMD’s Instinct MI300 is shaping up to be an incredible chip that packs CPU and GPU cores and tons of high-speed memory into the same package, but details are still scant. Now, we’ve gleaned some new details from an International Supercomputing Conference (ISC) 2023 presentation outlining the upcoming two-exaflop El Capitan supercomputer powered by the Instinct MI300. We also found additional details in AMD CTO Mark Papermaster’s keynote at the ITF World 2023 conference hosted by research giant imec (you can read our interview with Papermaster here).

The El Capitan supercomputer is expected to be the fastest supercomputer in the world when it launches in late 2023, taking the lead from the AMD-powered Frontier. AMD’s Instinct MI300 will power the machine, and the new details include a topology diagram of an MI300 installation, images of AMD’s MI300 lab in Austin, and images of the new blades that will be used in El Capitan. We’ll also cover some other new developments around the El Capitan deployment.

As a reminder, the Instinct MI300 is a data center APU that combines a total of 13 chiplets, many of them 3D-stacked, into a single chip package with 24 Zen 4 CPU cores, a CDNA 3 graphics engine, and eight HBM3 memory stacks totaling 128GB. Overall, the chip has 146 billion transistors, making it the largest chip AMD has ever put into production. Nine compute dies, a mix of 5nm CPUs and GPUs, are 3D-stacked on top of four 6nm base dies, which are active interposers that handle memory and I/O traffic, among other functions.

Papermaster’s ITF World keynote focused on AMD’s “30x25” goal of increasing energy efficiency 30-fold by 2025, and on how computing is now limited by power efficiency as Moore’s Law slows. The key to this plan is the Instinct MI300, which derives most of its gains from a simplified system topology.

As you can see in the first slide, Instinct MI250-powered nodes have separate CPUs and GPUs, with an EPYC CPU in the middle to coordinate the workload.

In contrast, the Instinct MI300 includes a built-in 24-core fourth-generation EPYC Genoa processor inside the package, removing the discrete CPU from the equation. The same overall topology remains, minus the separate CPU, which results in a fully connected all-to-all topology with four elements. This type of connection allows all processors to talk directly to each other without another CPU or GPU acting as an intermediary to relay data to other elements, reducing latency and variability. That relaying is a potential pain point of the MI250 topology. The MI300 topology diagram also shows that each chip has three connections, just like we saw on the MI250. Papermaster’s slide also refers to the active interposers that form the base dies as the “fourth-generation Infinity Fabric base die.”
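To make the all-to-all claim concrete, here is a minimal Python sketch that models a four-element node as a graph. The element labels and helper names are hypothetical, chosen only for illustration, and the model is an abstraction of the slide, not a description of AMD's actual link layout. It checks that a fully connected group of four gives each element exactly three links and puts every pair one hop apart.

```python
from collections import deque
from itertools import combinations

# Hypothetical labels for the four elements in one node (illustrative only).
NODES = ["MI300_0", "MI300_1", "MI300_2", "MI300_3"]


def fully_connected(nodes):
    """Build adjacency sets for an all-to-all topology: every pair is linked."""
    adjacency = {n: set() for n in nodes}
    for a, b in combinations(nodes, 2):
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def link_count(adjacency):
    """Count undirected links (each link appears in two adjacency sets)."""
    return sum(len(peers) for peers in adjacency.values()) // 2


def hop_count(adjacency, src, dst):
    """Breadth-first search: fewest links traversed to get from src to dst."""
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for peer in adjacency[node]:
            if peer not in seen:
                seen.add(peer)
                queue.append((peer, dist + 1))
    raise ValueError(f"{dst} is unreachable from {src}")


def max_hops(adjacency):
    """Worst-case hop count over all pairs of elements."""
    return max(hop_count(adjacency, a, b) for a, b in combinations(adjacency, 2))


topology = fully_connected(NODES)
links_per_element = {n: len(peers) for n, peers in topology.items()}
print("links per element:", links_per_element)
print("total links:", link_count(topology))
print("worst-case hops:", max_hops(topology))
```

With four elements, each one needs three links to reach every peer directly, which matches the three connections per chip shown in the diagram. A worst-case distance of one hop is what removes the intermediary relay from the data path.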