NHSE ReviewTM: Comments
· Archive
· Search
Machine type | Distributed-memory multi-vectorprocessor |
---|---|
Models | Computing Surface 2 |
Operating system | Internal OS transparent to the user, SunOS (Sun's Unix variant) on the front-end system |
Connection structure | Multistage crossbar |
Compilers | Extended Fortran 77, ANSI C |
Vendor's information Web page | http://www.meiko.com/ |
System parameters:
Model | Computing Surface 2 |
---|---|
Clock cycle | 20 ns |
Theor. peak performance | |
Per Proc. (64 bits) | 200 (vector), 40 (scalar) Mflop/s |
Maximal (64 bits) | 204.8 Gflop/s |
Main memory | <= |
Memory/node | 32-128, 32-512 MB |
Communication bandwidth | -- |
No. of processors | 8-1024 PEs |
Remarks:
The CS-2 features 8-1,024 processor elements (PEs), which can be either scalar or vector nodes. Apart from a separate communications module, a PE contains either a SuperSparc processor alone or a SuperSparc plus 2 VP vector processors. The speed of a scalar PE is estimated at 40 Mflop/s (at a 20 ns clock) and that of a vector PE at 200 Mflop/s, both at 64-bit precision. The VP modules are manufactured by Fujitsu and, unlike the earlier Fujitsu VP products, use the IEEE 754 floating-point format. Their speed at 32-bit precision is double that of 64-bit operation. The memory has 16 banks and, to avoid memory bank conflicts, the CS-2 has the interesting option of scrambled address allocation, thus guaranteeing good access at potentially problematic strides such as 2, 4, etc.
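The effect of strides on a 16-bank memory can be illustrated with a small model. The sketch below compares plain low-order interleaving with a hypothetical XOR-based scramble; the source does not describe Meiko's actual address mapping, so `scrambled_bank` is purely an assumption chosen for illustration, as are all function names.

```python
NUM_BANKS = 16  # bank count stated in the text
WINDOW = 16     # number of consecutive accesses examined (illustrative choice)


def interleaved_bank(addr: int) -> int:
    """Plain low-order interleaving: word `addr` lives in bank addr mod 16."""
    return addr % NUM_BANKS


def scrambled_bank(addr: int) -> int:
    """Hypothetical scramble: XOR the low 4 address bits with the next 4 bits.

    An assumption for illustration only, not Meiko's documented scheme.
    """
    return (addr ^ (addr >> 4)) % NUM_BANKS


def banks_touched(stride: int, mapping, accesses: int = WINDOW) -> int:
    """Count the distinct banks hit by `accesses` references at a fixed stride."""
    return len({mapping(i * stride) for i in range(accesses)})


def compare(strides=(1, 2, 4, 8, 16)):
    """Return {stride: (banks with interleaving, banks with scrambling)}."""
    return {
        s: (banks_touched(s, interleaved_bank), banks_touched(s, scrambled_bank))
        for s in strides
    }


if __name__ == "__main__":
    for stride, (plain, scrambled) in compare().items():
        print(f"stride {stride:2d}: interleaved {plain:2d} banks, "
              f"scrambled {scrambled:2d} banks")
```

With plain interleaving, power-of-two strides concentrate the accesses on ever fewer banks, whereas this toy scramble spreads the same strides over all 16. A mapping this simple merely moves the worst case elsewhere (stride 17 lands in a single bank here), so it should not be read as a model of the real hardware.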
The point-to-point communication speed is 100 MB/s (50 MB/s in each direction). Because communication takes place through multi-level crossbars, called "layers" by Meiko, the aggregate bandwidth of the system scales with the number of PEs, with a very respectable latency of 200 ns per layer. As the maximum configuration of the machine contains 1,024 PEs, the theoretical peak performance at 64-bit precision with vector PEs is 204.8 Gflop/s (1,024 × 200 Mflop/s). Each PE can be connected to its own I/O devices, so that parallel I/O scales along with the other resources.
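The arithmetic behind these figures can be checked with a few lines. The sketch below uses only the numbers quoted in the text; the function names are illustrative, and the number of crossbar layers a message traverses is left as a parameter because the source does not state it for any particular configuration.

```python
CLOCK_NS = 20            # clock cycle from the system parameters
VECTOR_PE_MFLOPS = 200   # 64-bit peak of a vector PE
SCALAR_PE_MFLOPS = 40    # 64-bit estimate for a scalar PE
MAX_PES = 1024           # largest configuration
LAYER_LATENCY_NS = 200   # latency per crossbar layer


def peak_gflops(num_pes: int, mflops_per_pe: float = VECTOR_PE_MFLOPS) -> float:
    """Theoretical peak of `num_pes` identical PEs, in Gflop/s."""
    return num_pes * mflops_per_pe / 1000.0


def flops_per_cycle(mflops_per_pe: float, clock_ns: float = CLOCK_NS) -> float:
    """Floating-point results per clock cycle implied by a peak rate."""
    clock_mhz = 1000.0 / clock_ns  # 20 ns corresponds to 50 MHz
    return mflops_per_pe / clock_mhz


def switch_latency_ns(layers: int) -> int:
    """Switch latency for a message crossing `layers` crossbar layers."""
    return layers * LAYER_LATENCY_NS


if __name__ == "__main__":
    print("peak, 1,024 vector PEs:", peak_gflops(MAX_PES), "Gflop/s")
    print("peak, 1,024 scalar PEs:", peak_gflops(MAX_PES, SCALAR_PE_MFLOPS), "Gflop/s")
    print("vector PE flops/cycle :", flops_per_cycle(VECTOR_PE_MFLOPS))
    print("3 layers (assumed)    :", switch_latency_ns(3), "ns")
```

The first result reproduces the 204.8 Gflop/s entry of the system parameters table, of which the 200 Gflop/s quoted in running text is a rounding.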
The Portland Group, which has won some renown for its excellent i860 compilers, has developed the compilers for the CS-2. These include Fortran 77 and ANSI C, and also Fortran 90. The current compiler already offers data distribution directives as proposed in [6].
In the USA the machine will be marketed by Meiko; in Europe and the rest of the world, marketing is done by Parallel Computing Industries, a consortium of Meiko, Parsys, and Telmat.
Measured Performances: In [2] a speed of 5.0 Gflop/s on a 64-processor CS-2 is reported for the solution of a dense linear system of order 18,688. From the NAS parallel benchmarks, some results on a 128-processor machine are given for class B problems: EP took 21.16 seconds, while 6.52 seconds was measured for the MG problem.
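To put the dense linear system result in perspective, the sketch below derives the per-processor rate and the fraction of theoretical peak. It assumes the 64 processors were vector PEs rated at 200 Mflop/s each, which the source does not state explicitly, and it assumes the conventional LINPACK operation count of 2n³/3 + 2n² to estimate the solution time. All function names are illustrative.

```python
VECTOR_PE_MFLOPS = 200.0  # 64-bit peak per vector PE (assumed node type)


def per_processor_mflops(total_gflops: float, processors: int) -> float:
    """Average sustained rate per processor, in Mflop/s."""
    return total_gflops * 1000.0 / processors


def fraction_of_peak(total_gflops: float, processors: int,
                     peak_mflops: float = VECTOR_PE_MFLOPS) -> float:
    """Sustained rate as a fraction of the aggregate theoretical peak."""
    return per_processor_mflops(total_gflops, processors) / peak_mflops


def linpack_seconds(order: int, total_gflops: float) -> float:
    """Estimated solve time, assuming 2n^3/3 + 2n^2 floating-point operations."""
    flops = 2.0 * order**3 / 3.0 + 2.0 * order**2
    return flops / (total_gflops * 1e9)


if __name__ == "__main__":
    rate = per_processor_mflops(5.0, 64)
    frac = fraction_of_peak(5.0, 64)
    print(f"per processor   : {rate:.1f} Mflop/s")
    print(f"fraction of peak: {frac:.1%}")
    print(f"estimated time  : {linpack_seconds(18688, 5.0):.0f} s")
```

The per-processor rate exceeds the 40 Mflop/s estimated for a scalar PE, which is consistent with the assumption that vector nodes were used.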
Copyright © 1996 Aad J. van der Steen and Jack J. Dongarra