Recall from Chapter 2 that the need for asynchronous communication can arise when a computation must access elements of a shared data structure in an unstructured manner. One implementation approach is to encapsulate the data structure in a set of specialized data tasks to which read and write requests can be directed. This approach is not typically efficient in MPI, however, because of its MPMD programming model.
As noted in Section 2.3.4, an alternative implementation approach is to distribute the shared data structure among the computational processes, which must then poll periodically for pending read and write requests. This technique is supported by the MPI_IPROBE function, which is described in this section along with the related functions MPI_PROBE and MPI_GET_COUNT. The three functions are summarized in Figure 8.6.
Figure 8.6: MPI inquiry and probe operations.
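Distributing the shared data structure among the computational processes presupposes a rule that tells a requester which process holds a given element. The text does not specify such a rule. The sketch below assumes a one-dimensional block distribution, and the names location and locate are hypothetical, introduced only for illustration.

```c
#include <assert.h>

/* Assumed layout (not from the text): n elements are split into contiguous
 * blocks over p processes; the first (n % p) processes hold one extra
 * element each. */
typedef struct {
    int owner;   /* rank of the process holding the element */
    int offset;  /* index of the element in the owner's local array */
} location;

/* Map global index i (0 <= i < n) to its owner and local offset. */
location locate(int i, int n, int p)
{
    assert(p > 0 && i >= 0 && i < n);

    int base  = n / p;               /* minimum block size */
    int extra = n % p;               /* number of blocks with one extra element */
    int cut   = extra * (base + 1);  /* first index held by a shorter block */
    location loc;

    if (i < cut) {
        /* Element lies in one of the longer blocks of size base + 1. */
        loc.owner  = i / (base + 1);
        loc.offset = i % (base + 1);
    } else {
        /* Element lies in one of the blocks of size base (base > 0 here). */
        loc.owner  = extra + (i - cut) / base;
        loc.offset = (i - cut) % base;
    }
    return loc;
}
```

With such a rule, a process that needs element i would send its read request to locate(i, n, p).owner and name locate(i, n, p).offset in the request body.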
The MPI_IPROBE function checks for the existence of pending messages without receiving them, thereby allowing us to write programs that interleave local computation with the processing of incoming messages. A call to MPI_IPROBE has the general form
    MPI_IPROBE(source, tag, comm, flag, status)
and sets a Boolean argument flag to indicate whether a message that matches the specified source, tag, and communicator is available. If an appropriate message is available, flag is set to true; otherwise, it is set to false. The message can then be received by using MPI_RECV. The receive call must specify the same source, tag, and communicator; otherwise, a different message may be received.
Related to MPI_IPROBE is the function MPI_PROBE, which blocks until a message matching the specified source, tag, and communicator is available and then returns and sets its status argument. The MPI_PROBE function is useful for receiving messages about which we have incomplete information, such as an unknown source or length.
The status argument constructed by an MPI_RECV call, an MPI_PROBE call, or a successful MPI_IPROBE call can be used to determine the (pending) message's source, tag, and size. The inquiry function MPI_GET_COUNT yields the length of a message just received. Its first two (input) parameters are a status object set by a previous probe or MPI_RECV call and the datatype of the elements to be received, while the third (output) parameter is an integer used to return the number of elements received (Figure 8.6). Other information about the received message can be obtained directly from the status object. In the C language binding, this object is a structure with fields MPI_SOURCE and MPI_TAG. Thus, status.MPI_SOURCE and status.MPI_TAG contain the source and tag of the message just received. In Fortran, the status object is an array of size MPI_STATUS_SIZE, and the constants MPI_SOURCE and MPI_TAG are the indices of the array elements containing the source and tag information. Thus, status(MPI_SOURCE) and status(MPI_TAG) contain the source and tag of the message just received.
The following code fragment uses these functions to receive a message from an unknown source and containing an unknown number of integers. It first detects arrival of the message using MPI_PROBE. Then, it determines the message source and uses MPI_GET_COUNT to determine the message size. Finally, it allocates a buffer of the appropriate size and receives the message.
    int count, *buf, source;
    MPI_Status status;
    MPI_Probe(MPI_ANY_SOURCE, 0, comm, &status);   /* wait for a message */
    source = status.MPI_SOURCE;
    MPI_Get_count(&status, MPI_INT, &count);       /* size in ints */
    buf = malloc(count * sizeof(int));
    MPI_Recv(buf, count, MPI_INT, source, 0, comm, &status);
The Fock matrix construction algorithm of Section 2.8 allocates to each processor a data task, which manages part of the D and F matrices, and a computation task, which generates requests for matrix elements. The two tasks execute concurrently, with the data task responding to requests for data and the computation task performing computation. Briefly, the two tasks are defined as follows.
    /* Data task */               /* Computation task */
    while(done != TRUE) {         while(done != TRUE) {
       receive(request);             identify_next_task();
       reply_to(request);            generate_requests();
    }                                process_replies();
                                  }
A polling version of this program integrates the functions of the data and computation tasks into a single process, which alternates between checking for pending data requests and performing computation. This integration can be achieved as in Program 8.6. The program uses the MPI_IPROBE function to determine whether data requests are pending. If they are, these requests are processed before further computation is performed.
For simplicity, the procedure process_request deals with a single type of request: a read operation on a single array element. A process receiving such a request determines the source of the message, retrieves the requested value, and returns the value to the source process.
© Copyright 1995 by Ian Foster