The remainder of the program evaluates forces between particles.
Each process evaluates forces acting only on its own particles.
But the forces are generated by all other particles. Thus the outer
loop scans only over (i = my_offset; i < my_offset + particle_number),
whereas the inner loop scans over all particles:
(j = 0; j < total_particles).
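
The loop structure described above can be sketched as follows. This is only an illustration, not the program's actual code: the function names forces_on_my_particles and pair_force, the one-dimensional positions, and the inverse-square force law are all assumptions made for the example. Only my_offset, particle_number and total_particles come from the text.

```c
#include <assert.h>

/* Hypothetical 1-D pair force on particle i due to particle j:
   inverse-square attraction, pointing from i towards j.
   Assumes the two positions are distinct. */
static double pair_force(double xi, double xj)
{
    double d = xj - xi;
    double r = d < 0.0 ? -d : d;
    return d / (r * r * r);
}

/* Each process calls this with its own slice [my_offset, my_offset + particle_number).
   x holds the positions of ALL particles; force receives one entry per OWN particle. */
void forces_on_my_particles(const double *x, double *force,
                            int my_offset, int particle_number,
                            int total_particles)
{
    assert(my_offset >= 0 && particle_number >= 0);
    assert(my_offset + particle_number <= total_particles);

    /* Outer loop: only the particles owned by this process. */
    for (int i = my_offset; i < my_offset + particle_number; i++) {
        double sum = 0.0;
        /* Inner loop: every particle in the system contributes. */
        for (int j = 0; j < total_particles; j++) {
            if (j != i)          /* a particle exerts no force on itself */
                sum += pair_force(x[i], x[j]);
        }
        force[i - my_offset] = sum;
    }
}
```

Note that every process needs the positions of all particles, but writes forces only for its own slice, so the work per process scales as particle_number times total_particles.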
Observe that this operation is timed. But when the master process
writes ``done my job...'' on standard output, it only quotes
its own CPU time.
There is a barrier, MPI_Barrier(MPI_COMM_WORLD), near the
end of the program. All processes must wait at the barrier until
the last process reaches it. Only then is the final printf statement
executed. This allows us to time the difference between the slowest
process and the master process in our computation.
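
The timing pattern described above might look roughly like the sketch below. It rests on several assumptions: the master is taken to be rank 0, do_work is a hypothetical stand-in for the force evaluation, the printed messages are invented for the example, and MPI_Wtime (wall-clock time) is used in place of whatever CPU timer the actual program calls.

```c
#include <mpi.h>
#include <stdio.h>

/* Hypothetical stand-in for the force evaluation; higher ranks
   are made to work longer so that the master finishes first. */
static void do_work(int rank)
{
    volatile double s = 0.0;
    for (long k = 0; k < 10000000L * (rank + 1); k++)
        s += 1.0 / (double)(k + 1);
}

int main(int argc, char *argv[])
{
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double t_start = MPI_Wtime();
    do_work(rank);
    double t_mine = MPI_Wtime();

    /* The master reports only its own time here. */
    if (rank == 0)
        printf("rank 0: own work took %f s\n", t_mine - t_start);

    /* Nobody passes this point until the slowest process arrives. */
    MPI_Barrier(MPI_COMM_WORLD);
    double t_all = MPI_Wtime();

    /* Time the master spent waiting for the slowest process. */
    if (rank == 0)
        printf("rank 0: waited %f s at the barrier\n", t_all - t_mine);

    MPI_Finalize();
    return 0;
}
```

The wait measured this way also includes the cost of the barrier itself, so it is an estimate of the gap between the slowest process and the master rather than an exact figure.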