UK-based IT company Graphcore unveils next-gen AI chip called ‘Bow’ to speed up AI by 40%
Graphcore, a UK-based IT company, has improved the performance of its computers without altering its specialized AI processor cores. During manufacturing, TSMC’s wafer-on-wafer 3D integration technology was used to attach a power-delivery chip to Graphcore’s AI processor.
According to Graphcore, its new combined chip, named Bow, is the first processor on the market to use wafer-on-wafer bonding. Thanks to the power-distribution silicon, Bow can operate at a higher frequency (1.85 GHz versus 1.35 GHz) and at a lower voltage than its predecessor. The result is machines that train neural networks up to 40% faster while using up to 16% less energy than the previous generation. Users get this improvement without making any changes to their software.
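The quoted figures are internally consistent; a quick back-of-envelope check (pure arithmetic on the numbers above, not a Graphcore benchmark):

```python
# Clock uplift from the power-delivery die: 1.35 GHz -> 1.85 GHz.
old_ghz, new_ghz = 1.35, 1.85
clock_uplift = new_ghz / old_ghz - 1.0  # fractional increase in frequency

# Roughly 37%, which lines up with the "up to 40% faster training"
# claim, since training throughput scales roughly with clock frequency.
print(f"clock uplift: {clock_uplift:.0%}")  # -> clock uplift: 37%
```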
Combining multiple silicon dies in one package is an increasingly common way to extend performance gains as progress along an ever-slower Moore’s Law path stalls. Both Bow and its predecessor, the Colossus MK2, are built with TSMC’s N7 manufacturing technology.
In most 3D chip-stacking technologies, such as Intel’s Foveros, finished chips are bonded to other chips or to wafers. In TSMC’s SoIC-WoW (wafer-on-wafer) technology, two whole wafers of chips are bonded together. When the wafers are aligned and pressed together, the copper pads on the facing chips meet and fuse, forming a cold weld between the pads. The upper wafer is then thinned down to just a few micrometers, and the bonded wafer is cut into chips.
One wafer holds Graphcore’s second-generation AI processors (the company calls them IPUs, for Intelligence Processing Units), each with 1,472 IPU cores and 900 megabytes of on-chip memory. These processors were already in commercial use and performed well in the most recent round of MLPerf tests. The other wafer holds a matching set of power-delivery dies. These contain no transistors or other active components; instead, they are filled with capacitors and through-silicon vias, vertical connections that carry power and data up through the power-delivery chip to the processor.
The capacitors are what really make the difference. Like the bit-storage capacitors in DRAM, they are built in deep, narrow trenches in the silicon. Positioning these charge reservoirs right next to the transistors smooths the power supply, allowing the IPU cores to operate faster at a lower voltage. Without the power-delivery die, the IPU would need to raise its operating voltage above its nominal level to reach 1.85 GHz, using more power. With it, the IPU reaches that clock rate while consuming less.
Wafer-on-wafer technology allows a greater density of connections between the two dies than mounting individual chips on a wafer does. However, a long-standing difficulty with this approach is the “known good die” problem: there are always a few defective chips on a wafer, and bonding two wafers together could double the number of defective stacks.
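The yield concern can be stated simply: if dies on each wafer are good with some independent probability, bonding the wafers multiplies those probabilities. A minimal sketch, with yield numbers invented for illustration:

```python
# Illustrative wafer-on-wafer yield math (numbers invented, not TSMC data).
# If a processor die is good with probability p_processor and the
# power-delivery die bonded to it is good with probability p_power_die,
# the bonded stack works only when both do.
p_processor = 0.90
p_power_die = 0.95
p_stack = p_processor * p_power_die

print(f"stack yield: {p_stack:.1%}")  # -> stack yield: 85.5%
```

Because there is no way to pair only known-good dies before bonding whole wafers, the combined yield is always worse than either wafer alone, which is why Graphcore needs the mitigation described next.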
Graphcore’s solution is to tolerate a certain amount of failure. Like several other emerging AI processors, the IPU is composed of a large number of identical (and therefore redundant) processor cores and other elements. Any defective units can be disconnected from the rest of the IPU using built-in fuses.
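Redundancy changes the yield picture: a die with many identical cores survives fabrication defects as long as enough cores work and the bad ones can be fused off. A sketch of that effect using a simple binomial model (the tolerated-defect count and per-core defect rate are assumptions, not Graphcore figures):

```python
from math import comb

def die_yield(n_cores, max_bad, p_defect):
    """Probability a die with n_cores identical cores still meets spec
    when up to max_bad defective cores can be fused off, assuming each
    core fails independently with probability p_defect (binomial model)."""
    return sum(
        comb(n_cores, i) * p_defect**i * (1 - p_defect)**(n_cores - i)
        for i in range(max_bad + 1)
    )

# Hypothetical: 1,472 cores (Bow's core count), tolerate up to 16 bad
# cores, 0.3% per-core defect rate -> yield stays close to 100%.
print(f"die yield with redundancy: {die_yield(1472, 16, 0.003):.4f}")
# With zero tolerated defects, the same die almost never comes out clean:
print(f"die yield without redundancy: {die_yield(1472, 0, 0.003):.4f}")
```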
Although the new product’s power-delivery die contains no transistors, they may be on the way. Using the technology only for power distribution is “just the first step”; in the not-so-distant future, Graphcore expects it to go much further.
Supercomputers capable of training “brain-scale” AIs, meaning neural networks with hundreds of trillions of parameters, could soon be built with this technology. The planned “Good” computer, named after British mathematician I.J. “Jack” Good, is said to deliver more than 10 exaflops, or ten billion billion floating-point operations per second. It would comprise 512 systems with 8,192 IPUs in total, plus mass storage, host processors, and networking. It will have a memory capacity of 4 petabytes with a bandwidth of over 10 petabytes per second. Each supercomputer is expected to cost around $120 million and ship in 2024, according to Graphcore.
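The headline figures imply a per-IPU compute budget. Purely as arithmetic on the quoted numbers (the division is ours, not a Graphcore spec):

```python
# Spreading the Good computer's quoted 10 exaflops over its 8,192 IPUs.
# Figures come from the announcement; the per-IPU split is our arithmetic.
total_flops = 10e18   # 10 exaflops = 10 * 10^18 FLOPS
n_ipus = 8192
per_ipu = total_flops / n_ipus  # ~1.22e15 FLOPS, i.e. ~1.2 petaflops

print(f"{per_ipu / 1e15:.2f} PFLOPS per IPU")  # -> 1.22 PFLOPS per IPU
```

That works out to roughly 1.2 petaflops per IPU, implying the machine would rely on processors well beyond Bow's generation.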