China's Top500 leader with their own 260-core CPU design
Aug 03, 2016
A system built with the same peak flops using GPUs would actually be more useful than what they built. The Sunway TaihuLight processing elements have a 64KB scratchpad and no data cache, and they have less memory bandwidth than Nvidia Fermi (7 years old). For most applications the processing elements will be so starved for data that they are just an idle waste of energy.
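To make the "starved for data" point concrete, here is a rough roofline-style sketch. It is not from the paper: the function names are made up for illustration, and the peak and bandwidth figures in the example are round placeholders, not measured Sunway or Fermi numbers. The idea is just that attainable throughput is capped by whichever is smaller, peak compute or memory bandwidth times the flops done per byte moved.

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, intensity):
    """Roofline bound: the lower of the compute ceiling and the
    memory ceiling (bandwidth * arithmetic intensity).

    intensity is in flops per byte moved to/from main memory.
    """
    return min(peak_gflops, bandwidth_gbs * intensity)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Arithmetic intensity (flops/byte) needed to reach peak compute."""
    return peak_gflops / bandwidth_gbs


if __name__ == "__main__":
    # Placeholder figures for illustration only (assumptions, not specs).
    peak = 3000.0      # GFLOP/s per node
    bandwidth = 140.0  # GB/s per node

    print("ridge point (flops/byte):", ridge_point(peak, bandwidth))
    for intensity in (0.25, 1.0, 4.0, 32.0):
        bound = attainable_gflops(peak, bandwidth, intensity)
        print(f"intensity {intensity:>5} flops/byte -> "
              f"{bound:.0f} GFLOP/s ({bound / peak:.1%} of peak)")
```

With placeholder numbers like these, a kernel doing about 1 flop per byte is limited to a few percent of peak, which is the situation the comment is describing. Dense kernels like Linpack sit far to the right of the ridge point, so they are not affected in the same way.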
http://www.netlib.org/utk/people/JackDongarra/PAPERS/sunway-...
Jun 20, 2016
For technical information, read the paper: http://www.netlib.org/utk/people/JackDongarra/PAPERS/sunway-... Compared to Tianhe-2, the previous top system, it delivers 2.7x more flops and 3.2x more flops per watt.