Fujitsu’s custom chip, the first to use supercomputing extensions to the ARMv8-A instruction set, is set to come into operation with the post-K supercomputer around 2021
Fujitsu has given the first details on a high-end, ARM-based processor it plans to use in an upcoming exascale supercomputer.
Called the A64FX, the chip is being developed from scratch for the “post-K” supercomputer, intended as the successor to the K computer, which set supercomputing records in 2011.
The post-K system, being developed by Fujitsu and Riken, is intended to be up to 100 times faster than the K computer, and is expected to come into operation sometime around 2021.
Exascale systems, those that perform at least one exaFLOP, or one billion billion calculations per second, represent unexplored territory, with the first such supercomputers expected to come into play around 2020.
Earlier this year researchers at the US Department of Energy’s Oak Ridge National Laboratory broke the exascale barrier, achieving peak throughput of 1.88 exaops while analysing genomic data on the recently launched Summit supercomputer.
The ORNL researchers achieved the breakthrough by using a mixture of high-precision and reduced-precision calculations, which can be carried out at increased speeds.
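The trade-off behind reduced precision can be illustrated with a minimal Python sketch. It is not the ORNL code, and the helper names `to_half` and `to_single` are hypothetical. It rounds a value to IEEE 754 half and single precision using the standard library's `struct` module and prints how far each result lands from the double-precision original.

```python
import struct


def to_half(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision (16-bit) value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_single(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision (32-bit) value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# Python floats are double precision, so `value` serves as the reference.
value = 0.1
for name, rounded in (("single", to_single(value)), ("half", to_half(value))):
    # The absolute error grows as the number of bits shrinks.
    print(f"{name:>6}: {rounded!r} (error {abs(rounded - value):.3e})")
```

Fewer bits mean a larger rounding error, which is why mixed-precision codes typically keep the steps most sensitive to error at higher precision.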
Fujitsu said it hopes the post-K computer will set a new performance record when it eventually comes online.
At the Hot Chips high-performance processing conference in Silicon Valley this week, the company said the A64FX chip would be the first to adopt the Scalable Vector Extension (SVE), an extension of the ARMv8-A instruction set designed for supercomputing.
Fujitsu was a lead partner collaborating with ARM on the development of SVE, and developed the microarchitecture of the A64FX itself, building on its previous experience with supercomputers, mainframes and UNIX servers.
The chip offers peak performance of more than 2.7 TFLOPS and supports massive parallelisation through a new generation of the Tofu interconnect originally developed for the K computer.
It features low power consumption and mainframe-class reliability, and supports a wide range of applications, Fujitsu said.
Its high memory bandwidth and high-performance stacked memory allow it to use the chip’s high-functional CPUs efficiently for greater performance.
The chip performs at 2.7 TFLOPS for double-precision, 64-bit floating point operations, at twice that rate for single-precision, 32-bit operations and at four times the double-precision rate for half-precision, 16-bit operations.
This means that, as with the Summit system, researchers can use single or half-precision operations to achieve faster results.
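The scaling is simple arithmetic, sketched below in Python. The 2.7 TFLOPS figure and the multipliers of two and four come from Fujitsu’s stated numbers. The function names are hypothetical, and the chip count is a back-of-the-envelope peak calculation, not a statement about how the post-K system will be built.

```python
import math

# Peak double-precision rate quoted by Fujitsu, in TFLOPS.
DP_PEAK_TFLOPS = 2.7

# Throughput multipliers relative to double precision.
PRECISION_MULTIPLIER = {
    "double (64-bit)": 1,
    "single (32-bit)": 2,
    "half (16-bit)": 4,
}

# One exaFLOPS is 10^18 FLOPS, which is 10^6 TFLOPS.
EXAFLOP_IN_TFLOPS = 1_000_000


def peak_tflops(precision: str) -> float:
    """Peak per-chip rate at the given precision, in TFLOPS."""
    return DP_PEAK_TFLOPS * PRECISION_MULTIPLIER[precision]


def chips_for_one_exaflop(precision: str) -> int:
    """Chips needed to reach one exaFLOPS at peak (an idealised assumption)."""
    return math.ceil(EXAFLOP_IN_TFLOPS / peak_tflops(precision))


for precision in PRECISION_MULTIPLIER:
    print(f"{precision}: {peak_tflops(precision):.1f} TFLOPS per chip, "
          f"{chips_for_one_exaflop(precision):,} chips for one exaFLOPS at peak")
```

Real applications rarely sustain peak rates, so these counts are a lower bound on the hardware required.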
Support for high-performance 16-bit and 8-bit integer operations means the chip can be used for a wide range of applications, including big data and AI processing, as well as the complex simulations usually associated with supercomputers.
Fujitsu said the system is also intended for simulations in fields such as health and longevity, disaster prevention, and manufacturing and industry.