We have already discussed MPI_Bcast in section 3.4.3, so the functioning of the program up to the MPI_Bcast call should be clear. The computation itself is straightforward. Every process finds the length of the subinterval, initialises sum to 0, and adds up the values of f(x) that correspond to its subintervals.
The only slightly confusing thing about this computation may be that, instead of working on a single connected subset of the domain, each process skips ahead by pool_size subintervals at a time. Thus the subintervals a given process works on are interleaved with subintervals belonging to other processes.
At the end of this computation each process has its own portion of the area under the curve of f(x) in mypi. All processes then call
MPI_Reduce(&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, host_rank, MPI_COMM_WORLD);
This operation is discussed in ``Using MPI...'' on page 25, and in ``MPI: A Message-Passing Interface Standard'' on page 111, section 4.9.
All processes which issue this operation pack their data into mypi and send it out to the process whose rank is given by the 6th argument: host_rank in our case. The type of the data in mypi is given by the 4th argument, and the number of data items in mypi is given by the third argument.
The process whose rank is host_rank receives the data from all processes (including itself) and performs a reduction operation on that data. The reduction operation is defined by the 5th argument; here it is MPI_SUM, i.e., a summation. The result is placed in the second argument, pi. Only processes which belong to the communicator MPI_COMM_WORLD, specified by the last argument to MPI_Reduce, participate in the operation.
MPI does not actually specify which process performs the final summation. On some supercomputers there may be special circuitry within the network itself for doing such things. For example, on the Connection Machine CM5, integer reductions are performed by the network, whereas floating-point reductions are performed by the destination node; the CM5 network reductions are much faster than node reductions. But on the farms, it is almost certain that the reduction will be performed by the destination node.
There are a large number of predefined reduction operations, for example MPI_SUM used above, MPI_PROD, MPI_MIN, MPI_MAX, etc. These are discussed on page 113 of ``MPI: A Message-Passing Interface Standard'', section 4.9.2.
The programmer can define her own reduction operations using the function MPI_Op_create, discussed in section 4.9.4, page 118.