next up previous
Next: Sending and receiving Up: first.c in detail Previous: first.c in detail

Environmental enquiries

 

After MPI has been initialised with MPI_Init, we find out how many processes are available for computation. That number is returned in pool_size after the call to

MPI_Comm_size ( MPI_COMM_WORLD, &pool_size );

The first argument to MPI_Comm_size is MPI_COMM_WORLD. This is the default communicator, i.e., the group of processes that comprises all processes in the pool. All MPI operations take place within communicators. We will discuss this concept in more detail later.

Now, note that every process carries out all these operations on its own, so that when the next enquiry is made:

MPI_Comm_rank ( MPI_COMM_WORLD, &my_rank );
each process ends up with a different number in my_rank. That number is the identity of the process within the communicator MPI_COMM_WORLD. A process can have several rank numbers, corresponding to different communicators. The rank number allows processes to find out which is which, and what their own place in the pool is.

Those three calls

MPI_Init ( &argc, &argv );
MPI_Comm_size ( MPI_COMM_WORLD, &pool_size ); 
MPI_Comm_rank ( MPI_COMM_WORLD, &my_rank );
are the standard beginning of every MPI program. They are not wrapped up into one call, though, because MPI_Comm_size and MPI_Comm_rank may have to be called again whenever you create a new communicator.
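Together with the closing call MPI_Finalize, which every MPI program must make before exiting, these three calls frame a minimal, complete MPI program. The following sketch (not part of first.c) puts them together:

```c
#include <stdio.h>
#include <mpi.h>

int main ( int argc, char *argv[] )
{
   int pool_size, my_rank;

   MPI_Init ( &argc, &argv );
   MPI_Comm_size ( MPI_COMM_WORLD, &pool_size );
   MPI_Comm_rank ( MPI_COMM_WORLD, &my_rank );

   /* every process executes this line with its own rank number */
   printf ( "process %d of %d reporting\n", my_rank, pool_size );

   MPI_Finalize ();
   return 0;
}
```

Run under an MPI engine, every process in the pool prints its own line, each with a different my_rank.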

In MPI each processor must also have a name. On a farm this name is simply the host name of the machine the process runs on. On a tightly coupled multiprocessor, such as the Cray T3D or the Connection Machine, it may be the number of the processor, for example something like node_17. To find out the name of the processor, the function MPI_Get_processor_name is called:

MPI_Get_processor_name ( my_node_name, &node_name_length );
We must reserve an appropriate amount of space for my_node_name. BUFSIZ, which is typically either 512 or 1024 bytes, should do for even the most extravagantly named machines.
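The declarations that go with this call might look as follows. This is only a sketch; the Standard also defines the constant MPI_MAX_PROCESSOR_NAME, which is guaranteed to be large enough for the name, and is the portable choice:

```c
char my_node_name[BUFSIZ];       /* or char my_node_name[MPI_MAX_PROCESSOR_NAME]; */
int  node_name_length;

MPI_Get_processor_name ( my_node_name, &node_name_length );
/* my_node_name now holds the processor's name,
   node_name_length the number of characters in it */
```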

Now we check if our process pool has a predefined host. This is done with the call

MPI_Attr_get ( MPI_COMM_WORLD, MPI_HOST, (void**) &host_rank_ptr,
               &found_flag );
This function checks whether an attribute identified by MPI_HOST exists. If it does, found_flag is set to 1; otherwise found_flag is zero. Finding that the attribute exists does not mean that there is a host: the answer may still be MPI_PROC_NULL. Here we don't inspect the flag, because MPI_HOST is one of the three attributes which must be attached to the MPI_COMM_WORLD communicator when MPI is initialised. The other two attributes are MPI_TAG_UB and MPI_IO. The first refers to the upper bound for the message tag value (more about message tags soon). In principle we should check that too, but the upper bound must be at least 32,767, so as long as our message tags stay below that, we are safe. The second attribute, MPI_IO, will tell us which processes can support I/O.
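Should you ever wish to perform that check on MPI_TAG_UB, the call mirrors the MPI_HOST enquiry. A sketch (the variable names here are illustrative, not part of first.c):

```c
int *tag_ub_ptr, found_flag;

MPI_Attr_get ( MPI_COMM_WORLD, MPI_TAG_UB, (void**) &tag_ub_ptr,
               &found_flag );
if ( found_flag )
   printf ( "largest usable message tag: %d\n", *tag_ub_ptr );
```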

When the function MPI_Attr_get returns, we first check what the returned pointer host_rank_ptr points to. If it points to MPI_PROC_NULL it means that there is no host. Otherwise there is a host, and its rank number within the MPI_COMM_WORLD can be found by inspecting the first entry in the host_rank_ptr array.

The MPI environmental management functions are very important, yet they seem to have been neglected by many early implementations of MPI, to the extent that some of them used to return a dangling pointer which would crash the system the moment you asked

if ( *host_rank_ptr != MPI_PROC_NULL ) ...
You can find more about MPI environmental management in Chapter 7 (page 189) of ``MPI: A Message-Passing Interface Standard''. The function MPI_Attr_get is described on page 169 of the Standard in section 5.7.
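In view of such implementations, a more defensive version of the MPI_HOST enquiry inspects found_flag before dereferencing the pointer. A sketch, reusing the variable names of first.c:

```c
int *host_rank_ptr, found_flag;

MPI_Attr_get ( MPI_COMM_WORLD, MPI_HOST, (void**) &host_rank_ptr,
               &found_flag );
if ( found_flag && *host_rank_ptr != MPI_PROC_NULL )
   printf ( "host process has rank %d\n", *host_rank_ptr );
else
   printf ( "no predefined host\n" );
```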

In the case of the LAM engine there will not be a predefined host, so we have to continue our exploration of the MPI environment.

We continue by checking which processes can support I/O. We repeat the call to MPI_Attr_get, but the attribute this time is MPI_IO.

MPI_Attr_get ( MPI_COMM_WORLD, MPI_IO, (void **) &host_rank_ptr,
               &found_flag );
When the call comes back we again check what host_rank_ptr points to. As we remarked in the synopsis, MPI_PROC_NULL is too terrible to contemplate, but it may turn out, as indeed it will for UNIX farms, that all processes can do I/O. In this case host_rank_ptr will point to MPI_ANY_SOURCE.

If only some processes can do I/O, there is little choice left. The ranks of those processes will be returned in host_rank_ptr, and in first.c we choose the first of them. That will work, for example, under MPICH, because under MPICH process number 0 is always the one talking to your VDU:

if ( *host_rank_ptr != MPI_ANY_SOURCE ) host_rank = *host_rank_ptr;

In general you will have to be very careful at this stage. This first process may not be the one which has access to your VDU. In LAM, for example, a process running on the machine from which LAM was started talks to the VDU, and that process will have a rank corresponding to the position that machine has in /opt/lam/boot/bhost.def.

If the returned pointer is MPI_ANY_SOURCE, it means that all processes can do I/O, although not necessarily to your VDU. In this case first.c gives up and simply transfers the name of the machine from the command line to console_name:

else {
   strcpy ( console_name, argv[1] );
   if ( 0 == strcmp ( my_node_name, console_name ) ) {
      host_rank = my_rank;
      i_am_the_master = B_TRUE;
   }
}
Now every process compares that name to its own name, my_node_name, and the one for which the comparison succeeds sets i_am_the_master to B_TRUE, whereas for all other processes it retains its initial value, B_FALSE.

The procedure described in this section is by no means perfect; I can imagine several situations in which it will fail. If you work with LAM, you may be tempted to skip all this mess with MPI_Attr_get and take the name of the node from the command line at the very beginning. If you work with MPICH, you know that the process you are looking for will always have rank 0.

But knowing about environmental enquiries, you can also design quite an elaborate procedure which will check for the availability of a predefined host, I/O support, etc., and make appropriate choices using whatever other information it can obtain.
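For example, such a procedure might prefer the predefined host, fall back on a process that can do I/O, and resort to process 0 otherwise. The following sketch is only illustrative; the variable console_rank is not part of first.c:

```c
int *attr_ptr, found_flag, console_rank = 0;   /* rank 0 as a last resort */

/* first preference: a predefined host */
MPI_Attr_get ( MPI_COMM_WORLD, MPI_HOST, (void**) &attr_ptr,
               &found_flag );
if ( found_flag && *attr_ptr != MPI_PROC_NULL )
   console_rank = *attr_ptr;
else {
   /* second preference: a specific process that can do I/O */
   MPI_Attr_get ( MPI_COMM_WORLD, MPI_IO, (void**) &attr_ptr,
                  &found_flag );
   if ( found_flag && *attr_ptr != MPI_PROC_NULL
                   && *attr_ptr != MPI_ANY_SOURCE )
      console_rank = *attr_ptr;
}
```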






Zdzislaw Meglicki
Tue Feb 28 15:07:51 EST 1995