Italian super-green speed demon hits 3GFlops per Watt of power
The new “Eurora” supercomputer, at Italy’s CINECA centre in Bologna, has set a record for energy-efficient computing – and shares a room with Fermi, a supercomputer built to IBM’s Blue Gene/Q specs, which was ranked ninth in the world in the most recent (November 2012) Top500 list of supercomputers.
The super-green computer delivers 3,150 Megaflops per Watt [“Flops” are floating point operations per second], which shattered the previous energy efficiency record for supercomputers, and is based on Intel CPUs with NVidia Kepler co-processors. It has warm-water cooling and an innovative 3D torus network. TechWeekEurope Italy went to CINECA to see the new eco-computing marvel, and also meet its older brother, the Blue Gene/Q system which CINECA named after the Italian scientific genius Enrico Fermi.
Eurora is based on the Aurora Tigon hybrid supercomputer launched in November by the EuroTech supercomputer group, and named after a tiger-lion crossbreed. Like most new fast supercomputers, Tigon has nodes combining regular CPUs with co-processors – in this case each node combines two Xeon E5 CPUs and two accelerators, which are either NVidia Kepler GPUs or the competing but still largely untried Intel Xeon Phi accelerators launched in November.
For its supercomputer, CINECA stuck with the NVidia option, naming the device “Eurora” to distinguish it from the generic Aurora design. The system was funded to the tune of €1 million, within the Partnership for Advanced Computing in Europe (PRACE) and built at CINECA, a non-profit centre funded by 54 Italian universities and research institutes.
The project was conceived in August 2011, and really got started in July 2012, when the hardware was bought. CINECA’s Eurora will be used in various fields, including the study of the constituents of matter, astrophysics and weather, and can deliver a sustained performance of 1700 Gigaflops per node.
The project is not just about setting records in computing power, says Carlo Cavazzoni, of CINECA’s department of high performance computing (HPC). What he wants to do is overcome other limitations in energy, footprint, and cost – measured in Flops per Watt, Flops per square metre, and Flops per Dollar.
“Right now, CINECA’s Eurora is at the top of the charts for worldwide efficiency,” he said. “We will see if it can hold the position until the next update of the published list in the spring.”
Delivering 3150 MFlops per Watt, the system is comfortably ahead of the leader of the current published list, the Beacon system at NICS Tennessee, with 2499 MFlops per Watt. Supercomputers typically burn more energy than conventional systems, but the Eurora is around fifteen times more efficient than a desktop system.
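To put those figures side by side, here is a minimal Python sketch of the arithmetic. The two efficiency values come from the article; the function names are purely illustrative, and the desktop figure is only what the "around fifteen times" claim implies, not a measured value.

```python
# Efficiency figures quoted in the article, in MFlops per Watt.
EURORA_MFLOPS_PER_WATT = 3150
BEACON_MFLOPS_PER_WATT = 2499

def efficiency_gain(new: float, old: float) -> float:
    """Return the fractional improvement of `new` over `old`."""
    return new / old - 1.0

def implied_baseline(efficiency: float, times_better: float) -> float:
    """Return the baseline efficiency implied by an 'N times better' claim."""
    return efficiency / times_better

gain = efficiency_gain(EURORA_MFLOPS_PER_WATT, BEACON_MFLOPS_PER_WATT)
# Assumption: "around fifteen times" is treated as exactly 15 for illustration.
desktop = implied_baseline(EURORA_MFLOPS_PER_WATT, 15)

print(f"Eurora leads Beacon by about {gain:.0%}")
print(f"Implied desktop efficiency: about {desktop:.0f} MFlops per Watt")
```

On these numbers Eurora's lead over Beacon works out at roughly a quarter, and the desktop comparison implies a baseline of a couple of hundred MFlops per Watt.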
For a simpler measure of plain energy efficiency, CINECA uses PUE (power usage effectiveness), which is derived by dividing the total power used by the facility by the power delivered to the IT kit. Here, it again sets a supercomputing record of just 1.05, meaning that for every Watt delivered to the computing part of the hardware, only 50mW is needed to run the cooling circuits.
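The PUE arithmetic can be sketched in a few lines of Python. This is an illustrative calculation only: the function names and the 10 kW example load are assumptions, and the sketch follows the article in attributing all of the overhead to cooling, whereas PUE in general counts every non-IT load.

```python
def pue(total_facility_watts: float, it_watts: float) -> float:
    """Power usage effectiveness: total facility power divided by IT power."""
    if it_watts <= 0:
        raise ValueError("IT power must be positive")
    return total_facility_watts / it_watts

def overhead_per_it_watt(pue_value: float) -> float:
    """Watts of non-IT overhead spent for each Watt delivered to the IT kit."""
    return pue_value - 1.0

# Figure from the article: a PUE of 1.05.
overhead_mw = overhead_per_it_watt(1.05) * 1000
print(f"Overhead per IT Watt: about {overhead_mw:.0f} mW")

# Hypothetical example: 10 kW of IT load drawing 10.5 kW in total.
print(f"PUE: {pue(10_500, 10_000):.2f}")
```

A PUE of exactly 1.0 would mean every Watt drawn reaches the computing hardware, so the closer the figure is to 1, the less is spent on overhead.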
The system has no superfluous components. There are no fans, because the system is cooled by water – and hot water turns out to be the most efficient method. All components are soldered on the motherboard, including the RAM, in a “dense” architecture design.
The cooling system is the real strength of Aurora Tigon. Filtered, de-ionised water is used at a temperature of up to 50 degrees Celsius, with pressure between 2 and 4 bar, to keep the components at their optimal temperature of about 80 degrees Celsius. The CPU and GPU loads are cooled separately, and even in summer the cooling costs and pollution levels are low, with no need for chillers.
The water circulates in special bars around each node, pumped from the bottom upwards, while the rack’s 10 kW electrical power arrives from above.
Each node can work with two Intel Xeon E5-2687W processors, with 16 cores in total, each processor using up to 150 watts. The CINECA system uses two Nvidia Kepler K20 GPUs per node, each with a performance of 3.52TFlops.
Each node has up to 64 GB of ECC DDR3 RAM. The architecture holds eight nodes side by side in a chassis, and up to 16 chassis per rack. This means the Aurora Tigon can hold up to 2048 cores per rack. CINECA showed us units with 128 CPUs and 128 GPUs.
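The rack density follows directly from the figures quoted. The short Python sketch below reproduces the sum, assuming the 16 cores apply per node (two eight-core CPUs) and that the units CINECA showed have two CPUs and two GPUs per node; the constant names are illustrative.

```python
# Figures from the article.
NODES_PER_CHASSIS = 8
CHASSIS_PER_RACK = 16
CPUS_PER_NODE = 2
GPUS_PER_NODE = 2
# Assumption: the 16 cores quoted are per node, i.e. 8 per CPU.
CORES_PER_NODE = 16

nodes_per_rack = NODES_PER_CHASSIS * CHASSIS_PER_RACK
cores_per_rack = nodes_per_rack * CORES_PER_NODE

# The units shown had 128 CPUs and 128 GPUs, which implies this many nodes.
cpus_shown = 128
nodes_shown = cpus_shown // CPUS_PER_NODE
gpus_shown = nodes_shown * GPUS_PER_NODE

print(f"Nodes per full rack: {nodes_per_rack}")
print(f"Cores per full rack: {cores_per_rack}")
print(f"Nodes in the units shown: {nodes_shown} ({gpus_shown} GPUs)")
```

On that reading, the 128 CPUs and 128 GPUs shown correspond to half of the nodes that a fully populated rack could hold.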
The system is monitored with a Linux system that uses three independent sensors on a redundant network, with power control for each of the GPU slots. The system can cost from €1.4 million to €2.4 million (CINECA’s, obviously, is not fully loaded). For its network the Aurora Tigon has 40Gbps, but it can go up to more than 240Gbps.
Eurora’s roommate, the Blue Gene/Q supercomputer known as Fermi, was ranked ninth in the world in November. IBM’s Blue Gene architecture has featured heavily in the supercomputer rankings for both speed and efficiency. As it shares Eurora’s space, we got to look at Fermi too, and have included it in the gallery below.
Euro Story: we publish selected stories from across NetMediaEurope’s network of European sites. This week’s story is by Mario De Ascentiis of TechWeekEurope Italia. It was translated and updated by Peter Judge.