The initialisation is trivial. Once the vector b and the matrix a[i][j] are initialised, the master broadcasts the vector to all workers using

MPI_Bcast(b, ROWS, MPI_INT, host_rank, MPI_COMM_WORLD);

This ``initialises'' the workers.
Then the master process sends a row of the matrix to each of the workers. The row number is conveyed as the message tag:

for (j = 0; j < COLS; j++) int_buffer[j] = a[count][j];
MPI_Send(int_buffer, COLS, MPI_INT, destination, count, MPI_COMM_WORLD);

In this manner, the master sends the initial batch of tasks to all workers.
Now we have the for loop, which collects replies from all workers for all rows, both the ones that have already been sent and the ones that will be sent within this loop. The master process waits for any message with any tag from any source. The rank of the sending process and the row of the matrix the answer refers to are obtained from the structure status:

MPI_Recv(int_buffer, BUFSIZ, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
sender = status.MPI_SOURCE;
row = status.MPI_TAG;

This is a new aspect of MPI_Recv which we haven't covered yet.
The structure MPI_Status is discussed in section 3.2.5, page 20 of ``MPI: A Message-Passing Interface Standard''. This structure contains two mandatory fields named MPI_SOURCE and MPI_TAG, and may also contain additional fields. The length of the received message can also be extracted from MPI_Status using the function MPI_Get_count. The latter is most useful in combination with another function, MPI_Probe, which can be used to obtain some information about a message, and to allocate an appropriate amount of space for it, before the message is actually read into the buffer. MPI_Probe is discussed in section 3.8 of the MPI Standard.
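The probe-then-receive pattern just described can be sketched as follows. This is not code from our example, only an illustration of how MPI_Probe and MPI_Get_count fit together; error checking is omitted.

```c
/* Peek at the next message without consuming it, ask how many
   MPI_INT items it carries, allocate just enough space, and only
   then receive it into the freshly allocated buffer. */
MPI_Status status;
int count;
int *buffer;

MPI_Probe(MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
MPI_Get_count(&status, MPI_INT, &count);
buffer = (int *) malloc(count * sizeof(int));
MPI_Recv(buffer, count, MPI_INT, status.MPI_SOURCE, status.MPI_TAG,
         MPI_COMM_WORLD, &status);
```

Note that the source and tag passed to MPI_Recv are taken from the status filled in by MPI_Probe, so the receive matches exactly the message that was probed.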
After having read and processed the answer, the master checks if the whole matrix has been finished:

if (count < ROWS) { ...

and if there is still some work to be done, another row of the matrix is sent to the same worker:

for (j = 0; j < COLS; j++) int_buffer[j] = a[count][j];
MPI_Send(int_buffer, COLS, MPI_INT, sender, count, MPI_COMM_WORLD);

Otherwise, i.e., if the matrix has already been finished, the master sends the worker a null message with tag
ROWS
. Because in C the matrix will never have a row numbered ROWS, and because all workers know the dimensions of the matrix, receiving a message with that tag is a signal to the worker that it's time to exit:
MPI_Send(0, 0, MPI_INT, sender, ROWS, MPI_COMM_WORLD);
In ``Using MPI...'' you will find that the termination signal in their example is 0. But if we did so here, we would terminate process rank 0 the moment we send the zeroth row of the matrix to it. The master process would then hang, because process rank 0 would not have sent the result of multiplication of the zeroth row of the matrix by vector b back to the master.
There are two important differences between the example from ``Using MPI...'' and our example. First, their example is in Fortran, and in Fortran, unlike in C, the first row of a matrix has index 1, not 0. The second difference is that in ``Using MPI...'' the master process always has rank 0, whereas in our case it can have any rank, depending on the machine you start your LAM engine from.
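Putting the worker side together: the loop below is a sketch of what each worker would do under the protocol described above. It is an assumption, not code from the text; the names b, int_buffer, host_rank, and status follow the master's fragments, and the termination test on the tag ROWS is the signal just discussed.

```c
/* Worker-side sketch: receive the broadcast vector, then process
   rows until a message tagged ROWS arrives (the termination signal). */
MPI_Bcast(b, ROWS, MPI_INT, host_rank, MPI_COMM_WORLD);
for (;;) {
    MPI_Recv(int_buffer, COLS, MPI_INT, host_rank, MPI_ANY_TAG,
             MPI_COMM_WORLD, &status);
    if (status.MPI_TAG == ROWS) break;    /* time to exit */
    sum = 0;
    for (j = 0; j < COLS; j++) sum += int_buffer[j] * b[j];
    /* send the result back, tagged with the row number it answers */
    MPI_Send(&sum, 1, MPI_INT, host_rank, status.MPI_TAG, MPI_COMM_WORLD);
}
```

Because the reply carries the row number as its tag, the master's MPI_Recv with MPI_ANY_SOURCE and MPI_ANY_TAG can attribute each answer to the right row regardless of arrival order.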