[Santa Clara University]
Department of Mathematics
and Computer Science
[Return to Math 169 Homepage]

Math 169 Notes -- MPI


Contents


A. Sample MPI Codes

B. Parallel Computing and Languages

There is a limit to the speedup that can be achieved by faster hardware. There is also a limit to the speedup that can be achieved by better algorithms in the style that we have created them in the past. Since at least the early-1970s work has been done on speeding-up programs via parallel processing.

The standard computer is said to be based on a "von Neumann" architecture, that is, a sequential execution of commands. It models a single person working on a task individually. The person (usually!) cannot do two or more things at a time. The person must do all tasks sequentially, completing one task before beginning on the next. The is the way that standard computers work and this is the way most programmers think in terms of coding programs. But this is not the only option.

Certain degrees of parallelism already exist in what have commonly been called "super-computers." For example, Crays are vector machines, i.e., a vector-scalar multiplication might take only a long as a scalar-scalar multiplication, since each element of the vector (loaded into a vector register) is simultaneously (i.e., in parallel) multiplied by the same scalar together.

The existence of multiple processor machines and the ability to link several (networked) machines together (sometimes termed "distributed processing") to allow major segments of code to be executed concurrently poses two major challenges:

  1. revising (and inventing) languages (or language supplements) to match the power of newer machines or clusters of networked machines; and
  2. developing new algorithms to correspond to a "team" approach to problem-solving.

Algol-68 included variations to allow concurrent execution of statements. It is said that the use of the semi-colon statement ending mark was meant to indicate non-concurrent execution of the previous statement, but that a comma could be used alternatively if the previous statement could be done at the same time as the following statement.

Ada explicitly included the possibility of concurrent execution in its development.

IV-tran was a dialect of Fortran developed for the first major supercomputer, the Illiac IV, and it included statements such as FORK and JOIN to indicate when independent sections of code began and ended.

Concurrent Pascal was a superset of Pascal developed by Per Brinch-Hansen around 1975. It included concepts such as "monitors," a way to encapsulate data and also prevent access by more than one concurrently running process at a time.

Path-Pascal was a different superset of Pascal developed by Roy H. Campbell and Robert B. Kolstad in the late 1970s at the University of Illinois (Urbana-Champaign). Path-Pascal included the concepts of objects (more like C++ definitional "classes"), processes, and path expressions.

In Path-Pascal, an object can encapsulate data and subprograms, and some of these subprograms can be processes which differ from procedures and functions in that multiple copies of processes can be made to run simultaneously. A path expression is a structure which enables the programmer to specify the synchronization of subprograms within objects and limit the number of instantiated processes.

Fortran-95 has included the FORALL statement specifically for use on parallel machines and included other attributes to help the compiler create appropriate code.

An alternative approach to creating a completely new language has been to devise "add-ons" to existing languages. In a sense, HPF (High Performance Fortran) is an attempt to provide additional (compiler) directives for Fortran so the programming can play a role in determining how code is to be distributed to different processors for simultaneous execution.

Another version of such "add-ons" is "Message Passing Interface" or "MPI," a library of functions/subroutines for use with several languages. Work on MPI began in April, 1992 with the final draft make public in May, 1994. (See links on the Notes Listing Page for references to MPI sites and tutorials.)

Additional Notes about Parallel Computers

The classical von Neumann computer is sometimes classified as a single-instruction single data (SISD) machine. A program has a single instruction stream and a single data stream and data is moved from a single memory to the registers in the CPU.

At the other "extreme" is a multiple-instruction multiple data (MIMD) machine, which can be envisioned as a collection of autonomous processors controlled by separate instruction streams, each operating on its own data streams.

Between these two models, it is possible to have single-instruction multiple data (SIMD) and multiple-instruction single data (MISD). A "pure" SIMD system has a single CPU which controls a large collection of subordinate ALUs (arithmetic-logic unit), each of which can operate on different data (e.g., different elements of an array) simultaneously.

MIMD machines can also consist of shared-memory architectures or distributed-memory architectures. Both of these architectures raise issues of communication between processors when data being generated on one processor is needed by another machine.

One variant of MIMD programming is called single-program, multiple data (SPMD). Whereas with SIMD a single instruction is applied to different data, with SPMD different instructions may be applied to different data but these instructions are all within a single program but "assigned" to different processors by means of conditional branches (i.e., IF ... ELSE statements) in the program code. One still needs to worry about how to communicate between different processors, however. The basic way to write a program in the SPMD paradigm is to envision a single program that, after learning the identification number associated with each processor, can spread out tasks to each processor to be done simultaneously.

HPF is an example of a language (extension) written for SIMD machines (e.g., the FORALL loop). MPI is primarily for SPMD/MIMD.

C. Basic MPI Functions

NOTE: A good MPI Tutorial for downloading can be found at this link (PS file -- 4 slides to a page).

Each program using MPI functions must include the two functions that indicate the beginning and ending of the MPI section of the code. In addition, the compiler must be told where to find the MPI functions.

The first basic function is MPI_Init. Before the invocation of this function, there can be no calls to any other MPI function.

The other function is MPI_Finalize. After the invocation of this function, there can be no calls to any other MPI function.

The syntax of these (and other functions) varies as to which language you are using.

Code "Importation"

Language "Import" Line and Miscellaneous
C #include "mpi.h"

must also have the
main header as:
main(int argc, char* argv[])

C++ #include "mpi++.h"

must also have the
main header as:
int main(int argc, char* argv[])

Fortran-77 include 'mpif.h'

must also declare ierr and rc
as integer to return possible error
values from function calls.*

Fortran-90 use mpi

must also declare ierr and rc
as integer to return possible error
values from function calls.*

*NOTE: In Fortran MPI function calls, rc ("return code") and ierr ("integer-valued error") return a value of 0 if the routine has completed successfully or -1 if an error has occurred.

For Initialization:

Language Sample Call
C MPI_Init(&argc, &argv);
C++ MPI::Init(argc,argv);
Fortran-77 or
Fortran-90
call MPI_INIT( ierr )

For Finalization:

Language Sample Call
C MPI_Finalize();
C++ MPI::Finalize();
Fortran-77 or
Fortran-90
call MPI_Finalize( rc )
After the initialization process is complete (via the MPI_INIT function call), usually two other functions/subroutines are invoked. MPI_Comm_size determines the number of processors that are currently active for use by the program. MPI_Comm_rank determines the identification number associated with each of the processors running this same code simultaneously.

Both of these functions/subroutines use the parameter (or identifier) "MPI_COMM_WORLD." In C, this is construed as a special datatype called a communicator (or communicator handle) and identifies the collection of processes that can send messages to each other. (Communicators provide a mechanism for identifying process subsets and for ensuring that messages intended for different purposes are not confused.) In C++. this is a class in the namespace (class) MPI which itself has member functions. In Fortran-77/90, this is construed as an INTEGER with the same function.

For Determining Number of Processors: (Return value is stored in the integer variable numprocs.)

Language Sample Call
C MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
C++ int numprocs = MPI::COMM_WORLD.Get_size();
Fortran-77 or
Fortran-90
call MPI_COMM_SIZE(MPI_COMM_WORLD, numprocs, ierr )

For Determining Rank (Individual Identification Number): (Return value is stored in the integer variable myid.)

Language Sample Call
C MPI_Comm_rank(MPI_COMM_WORLD, &myid);
C++ int myid = MPI::COMM_WORLD.Get_rank();
Fortran-77 or
Fortran-90
call MPI_COMM_RANK( MPI_COMM_WORLD, myid, ierr )
Sample Program Run
The following code (the C++ sample code linked above), uses only the four functions in this section. The output below is from an execution with 4 processors active,
// -*- c++ -*-
//
// Copyright 1997, University of Notre Dame.
// Authors: Andrew Lumsdaine, Michael P. McNally, Jeremy G. Siek,
//          Jeffery M. Squyres.
//

#include <iostream.h>
#include "mpi++.h"

int
main(int argc, char *argv[])
{
  MPI::Init(argc, argv);
  
  int rank = MPI::COMM_WORLD.Get_rank();
  int size = MPI::COMM_WORLD.Get_size();
  
  cout << "Hello World! I am " << rank << " of " << size << endl;
  
  MPI::Finalize();
}
The following is the output from the execution of this program specifying the "simulation" of 5 processors. NOTE: All processors (numbered 0 through 4) write out the same statement, each using its own number. NOTE ALSO: The order of execution of the processors is random.)
math 26: script
Script started on Wed May 10 21:22:50 2000
math 27: mpirun -np 5 hello++
Hello World! I am 4 of 5
Hello World! I am 1 of 5
Hello World! I am 0 of 5
Hello World! I am 2 of 5
Hello World! I am 3 of 5
math 28: exit
exit

script done on Wed May 10 21:23:50 2000
Comment: This example show a sample code that has only one executable (output) statement in it but an output of 5 different lines, each coming from a "different" processor as part of the same program. Since there was no distinction as to which processor did what, they all did what the program requested them to do!

Theoretically, code such as this could cause problems with the output device (terminal) if two processors wanted to write to the output device at the very same moment. It could lead to garble messages (or worse). This sort of situation has led to the use of software "semaphores" to indicate when a "single user" device is in use.

D. Intermediate MPI Functions

There are four intermediate MPI functions which enable most programs to write many programs. They deal with communication between one process and another. They are used in pairs: EXPLANATION OF COMMONLY-USED PARAMETERS
MPI_INTEGER (or MPI_REAL or other appropriate MPI datatype name corresponding to the language used) -- the datatype associated with the value(s) being sent; (See Table below for commonly used datatypes and standard MPI indicator for language being used.)
MPI_COMM_WORLD -- the communicator associated with the set of processors involved in the activity of the function;
ierr -- the error status return value (only in Fortran)
POINT-TO-POINT COMMUNICATIONS

For Sending

Language Sample Call
C MPI_Send(sendb, iscnt, MPI_INT, idest, tag, MPI_COMM_WORLD);
C++ MPI::COMM_WORLD.Send(sendb, iscnt, MPI_INT, idest, tag);
Fortran-77 or
Fortran-90
call MPI_SEND(sendb, iscnt, MPI_INTEGER, idest, tag, MPI_COMM_WORLD, ierr)
Explanation of Parameters:
sendb -- "send buffer" -- the identifier in the sending processor associated with MPI_COMM_WORLD whose value is beint sent;
iscnt -- the (integer) count of "sent" values associated with sendb;
idest -- the "rank" (integer identification number) of the destination process;
tag -- a non-negative integer that can be used to identify the message; MPI_ANY_TAG (or, for C++, MPI::ANY_TAG) can be used as a wildcard value.

For Receiving

Language Sample Call
C MPI_Recv(sendb, iscnt, MPI_INT, idsource, tag, MPI_COMM_WORLD, &status);
C++ MPI::COMM_WORLD.Recv(sendb, iscnt, MPI_INT, idsource, tag, &status);
Fortran-77 or
Fortran-90
call MPI_RECV(sendb, iscnt, MPI_INTEGER, idsource, tag, MPI_COMM_WORLD, status, ierr )
Explanation of Parameters:
sendb -- "send buffer" -- the identifier in the sending processor associated with MPI_COMM_WORLD whose value is being received;
iscnt -- the (integer) count of "sent" values associated with sendb;
isource -- the "rank" (integer identification number) of the source process;
tag -- a non-negative integer that can be used to identify the message; MPI_ANY_TAG (or, for C++, MPI::ANY_TAG) can be used as a wildcard value;
status -- in C/C++ this is a structure with fields MPI_SOURCE and MPI_TAG containing source and tag information; in Fortran it is an array of integers of size MPI_STATUS_SIZE with contants MPI_SOURCE and MPI_TAG indexing the source and tag fields. In most elementary programs, this is ignored, though the variable must be properly declared. In C, it is declared to be of type MPI_Status, and in C++ of type MPI::Status. In Fortran, it is declared to be an integer array in this way: INTEGER STATUS(MPI_STATUS_SIZE).

For a sample Fortran-90 program using MPI_SEND and MPI_RECV, see this link. An alternative program, also using MPI_Reduce is given at this link (or this link for a slightly condensed version). Another sample program is given at this link.

COLLECTIVE COMMUNICATIONS

For Broadcasting

Language Sample Call
C MPI_Bcast(&sendb, iscnt, MPI_INT, iroot, MPI_COMM_WORLD);
C++ MPI::COMM_WORLD.Bcast(&sendb, iscnt, MPI_INT, iroot);
Fortran-77 or
Fortran-90
call MPI_BCAST(sendb, iscnt, MPI_INTEGER, iroot, MPI_COMM_WORLD, ierr )
Explanation of Parameters:
sendb -- the identifier whose value(s) will be passed to all the other processes associated with MPI_COMM_WORLD;
iscnt -- the (integer) number of values associated sendb, either 1 for a scalar variable or the total number of values associated with an array;
iroot -- the "rank" (integer identification number) of the broadcast root, usually 0;

For Reducing

Language Sample Call
C MPI_Reduce(&sendb, &recb, iscnt, MPI_INT, MPI_SUM, iroot, MPI_COMM_WORLD);
C++ MPI::COMM_WORLD.Reduce(&sendb, &recb, iscnt, MPI::INT, MPI::SUM, iroot);
Fortran-77 or
Fortran-90
call MPI_REDUCE( sendb, recb, iscnt, MPI_INTEGER, MPI_SUM, iroot, MPI_COMM_WORLD, ierr )
Explanation of Parameters:
sendb -- "send buffer" -- the identifier in each processor associated with MPI_COMM_WORLD whose value contributes via the specified operation (indicated here by MPI_SUM) to the value passed to recb;
recb -- "receive buffer" -- the identifier whose value is based on the individual sendb values computed in each processor according to the specified operation;
iscnt -- the (integer) count of "send" values associated with sendb;
MPI_SUM a typical operation designated to "reduce" the multiple values in the various copies of sendb into one value returned in recb. See table below for other possible "reduction" operations;
iroot -- the "rank" (integer identification number) of the broadcast root, usually 0;
MPI Reduction Operators
C/FORTRAN Operation Name Meaning
MPI_MAX Maximum
MPI_MIN Minimum
MPI_PROD Product
MPI_SUM Sum
NOTE: C++ operations are indicated in the style of MPI::SUM.
Other operations are available, such as logical and/or/xor and bitwise and/or/xor. See another more complete references.

For a sample Fortran-90 program using MPI_BCAST and MPI_REDUCE, see this link.

MPI Datatypes

C Name C++ Name Fortran Name
MPI_INT MPI::INT MPI_INTEGER
MPI_FLOAT MPI::FLOAT MPI_REAL
MPI_DOUBLE MPI::DOUBLE MPI_DOUBLE_PRECISION
MPI_COMPLEX
MPI_CHAR MPI::CHAR MPI_CHARACTER
Other MPI datatypes are available, such as logical, short, long, unsigned. See another more complete references.

E. Advanced MPI Functions

A complete listing of MPI 1.2 Functions and their bindings for C, C++ and Fortran can be found at this link.


This page is maintained by Dennis C. Smolarski, S.J. dsmolarski at math.scu.edu
Last updated: 18 October 2007.