Crisis in HPC: Personal Comment - Roger Evans, RAL

It was alarming to see the range of information and disinformation that had given rise to the idea of a crisis. Many people had clearly expected greater things of machines such as the Cray T3D, but its limitations are well understood by those who work with the machine: they lie in the memory characteristics of each node rather than in any part of the parallel architecture. The inter-processor bandwidth and latency of the T3D are perfectly adequate for most applications, although the need to use Cray-specific constructs to achieve good performance is less than ideal.

Many people are also disappointed that parallel computing has not gained greater acceptance, but this ignores the phenomenal growth in the power of desktop machines, to the point where a few processors in an SMP configuration provide as much power as most users need. For the few most demanding users, larger-scale parallelism must be addressed, and the rewards in the range of science that can be tackled are great enough that some machine-specific optimisation is acceptable.

The reason for disappointment with the latest generation of MPPs is that, despite a range of architectures from different suppliers, none of them is as well balanced as the old transputer or the vector-parallel machines such as the Cray Y-MP that they have superseded. Users who were accustomed to achieving 50% of peak performance suddenly found it impossible to do better than 20-25% of peak, so the promised order-of-magnitude performance improvement shrinks to a factor of three or so.
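As a rough illustration of that arithmetic (taking, purely as an assumption for the example, a nominal tenfold increase in peak speed over the vector machine of peak P that it replaces):

\[
\frac{0.20 \times 10P}{0.50 \times P} = 4,
\qquad
\frac{0.25 \times 10P}{0.50 \times P} = 5,
\]

so the sustained gain is a factor of a few rather than the full order of magnitude suggested by the peak figures.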

Within the UK, the crisis in HPC must surely be that the level of funding forces us to choose between a single top-end machine that is applicable to only a few subject areas, and several mid-range machines that cause us, as a nation, to lose touch with the highest-performance hardware.

It would be nice to think that some enterprising manufacturer would come up with a 300 Mflop/s chip with balanced communications and context-switch times, so that the next generation of parallel languages would find an efficient home to run on, giving us a single architecture and programming model that scales from laptop computers to Teraflops; but the scale of investment needed is enormous. (The closing comments at the meeting referred to an announcement that Sandia Laboratory in the USA is to host an Intel computer of 1.8 Tflop/s comprising 9000 Intel P6 processors; perhaps this dream is closer than we think!)

Roger Evans (r.g.evans@rl.ac.uk)