Experiences on the CS-2: Parallel Discrete Event Simulation

Christopher Booth
Parallel Processing Section
DRA Malvern 
St Andrews Road 
Malvern.  WR14 3PS
UK 

Email:  cjmb@signal.dra.hmg.gb
Fax:    +44 (0)1684 894389
Tel:    +44 (0)1684 896400
    
We are interested in large, complex simulation models with long sequential run times, which pose the greatest challenge for parallel discrete event simulation. We use an optimistic synchronisation protocol, which allows speculative execution of events and relies on checkpointing to recover from causality errors.
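
As an illustration of the optimistic approach described above, the following is a minimal sketch of a single simulation object that executes events speculatively, checkpoints its state before each event, and rolls back when a late ("straggler") event arrives. It is not our simulator: the class name LogicalProcess, its fields, and the trivial event handler are hypothetical, and the cancellation of messages already sent to other processors (anti-messages) is omitted for brevity.

```python
import copy
import heapq


class LogicalProcess:
    """Hypothetical simulation object that executes events optimistically."""

    def __init__(self):
        self.state = {"trace": []}  # illustrative model state: timestamps handled so far
        self.lvt = 0.0              # local virtual time
        self.pending = []           # min-heap of (timestamp, seq, payload)
        self.processed = []         # (timestamp, seq, payload, checkpoint, lvt before event)
        self.rollbacks = 0          # number of events undone so far
        self._seq = 0               # tie-breaker so heap entries always compare

    def schedule(self, timestamp, payload=None):
        """Accept an event; a timestamp in the local past is a causality error."""
        if timestamp < self.lvt:
            self.rollback(timestamp)
        heapq.heappush(self.pending, (timestamp, self._seq, payload))
        self._seq += 1

    def rollback(self, timestamp):
        """Undo every executed event later than `timestamp` and requeue it."""
        while self.processed and self.processed[-1][0] > timestamp:
            ts, seq, payload, checkpoint, old_lvt = self.processed.pop()
            self.state = checkpoint   # restore the state saved before the event
            self.lvt = old_lvt
            heapq.heappush(self.pending, (ts, seq, payload))
            self.rollbacks += 1

    def step(self):
        """Speculatively execute the earliest pending event, if there is one."""
        if not self.pending:
            return False
        ts, seq, payload = heapq.heappop(self.pending)
        checkpoint = copy.deepcopy(self.state)  # checkpoint before executing
        self.processed.append((ts, seq, payload, checkpoint, self.lvt))
        self.state["trace"].append(ts)          # stand-in for a real event handler
        self.lvt = ts
        return True
```

In this sketch, scheduling events at times 1, 2 and 5, running them, and then scheduling an event at time 3 undoes the event at time 5, restores the checkpoint taken before it, and re-executes the events in timestamp order. Copying the whole state before every event is the simplest checkpointing policy and is chosen here only for clarity; it is also an example of the memory management cost that an optimistic simulator has to control.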

Early development of our simulators was carried out with a fully connected, stochastic closed queueing network (CQN), which allowed us to test the simulator for correctness. Working with this synthetic simulation, we were able to achieve about 70% parallel efficiency relative to an efficient sequential simulator. The CQN did not, however, prove representative of a larger, more complex application, for which 10-15% parallel efficiency is a more common figure. We believe, therefore, that the main reasons for the poor speed-up are algorithmic, not architectural. Areas where we believe that there is room for improvement include memory management, load balance and understanding the transient behaviour of the task graph. It is also important for us to understand locality issues, since memory is accessed both sparsely and very heavily, and both patterns affect cache effectiveness.
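
For reference, the efficiency figures above can be read against the usual definition of parallel efficiency, which we assume here. The symbols T_seq (run time of the efficient sequential simulator), T_par (run time of the parallel simulator) and P (number of processors) are introduced only for this illustration.

```latex
S = \frac{T_{\mathrm{seq}}}{T_{\mathrm{par}}}, \qquad
E = \frac{S}{P} = \frac{T_{\mathrm{seq}}}{P \, T_{\mathrm{par}}}
```

Under this definition, an efficiency of 10-15% corresponds to a speed-up of between 0.10P and 0.15P. On a purely illustrative 32 processors that would be a speed-up of about 3.2 to 4.8, whereas 70% efficiency would give about 22.4.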

Algorithmic improvements we would like to see include a parallel simulation language that is able to take a global view of the application, so that the compiler can interact sensibly with the runtime system. We also believe that dynamic load balancing would be helpful.

Our emphasis on algorithmic improvements does not mean that we would not welcome architectural improvements. Two in particular would help us: a global address space, which would remove our need for a costly communications harness implemented in software, and an improved user-level threads capability, which would allow the processor to respond to the arrival of inter-processor messages in a timely manner. Reduced communications latency is effective enough that it must be attractive to everyone.