Using LAM-MPI
The Message-Passing Interface, or MPI for short, is a library of routines for running jobs across multiple nodes in a Beowulf cluster.
More detailed tutorials are available from LAM-MPI.org.
If you're planning on doing anything serious with LAM-MPI, you should probably read the user manual, which includes information about writing, building, and running programs on a LAM-MPI cluster.
Before you can use MPI, you must start the LAM run-time environment on the nodes you plan to use.
Starting LAM-MPI
Starting MPI requires you to have SSH keys set up and an SSH agent running so that you don't have to type passwords to connect to different machines. See our SSH tutorial for the details; once you've got SSH working, come back here and try the examples on this page.
Log into dworkin and start an SSH agent. Add your key, then run the recon command to
check that the nodes are reachable:
dworkin% recon -v node_list_file
where node_list_file is a file listing the names of the nodes you want to boot, optionally
followed by the number of CPUs available on each host. For these systems you can try
setting cpu=2, since they have “hyperthreading” processors. I'm not sure whether
hyperthreading gives us any performance increase over non-hyperthreading on these
systems when using MPI, but it's worth trying to see.
A file containing a list of all the possible nodes can be found
in /etc/lam/lam-full.bhost on dworkin.
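For example, a small boot schema of your own using the cpu=2 setting mentioned above might look like the following (the three hostnames here are only illustrative; list whichever cluster nodes you actually want to boot):
benedict.math.hmc.edu cpu=2
brand.math.hmc.edu cpu=2
caine.math.hmc.edu cpu=2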
If recon succeeds, try running lamboot
to really start the cluster:
dworkin% lamboot -v /etc/lam/lam-full.bhost
If all goes well, you should see something like
LAM 7.1.1/MPI 2 C++/ROMIO - Indiana University
n-1<15868> ssi:boot:base:linear: booting n0 (benedict.math.hmc.edu)
n-1<15868> ssi:boot:base:linear: booting n1 (brand.math.hmc.edu)
n-1<15868> ssi:boot:base:linear: booting n2 (caine.math.hmc.edu)
n-1<15868> ssi:boot:base:linear: booting n3 (corwin.math.hmc.edu)
n-1<15868> ssi:boot:base:linear: booting n4 (deirdre.math.hmc.edu)
n-1<15868> ssi:boot:base:linear: booting n5 (delwin.math.hmc.edu)
n-1<15868> ssi:boot:base:linear: booting n6 (dworkin.math.hmc.edu)
n-1<15868> ssi:boot:base:linear: booting n7 (fiona.math.hmc.edu)
n-1<15868> ssi:boot:base:linear: booting n8 (flora.math.hmc.edu)
n-1<15868> ssi:boot:base:linear: booting n9 (gerard.math.hmc.edu)
n-1<15868> ssi:boot:base:linear: booting n10 (julian.math.hmc.edu)
n-1<15868> ssi:boot:base:linear: booting n11 (martin.math.hmc.edu)
n-1<15868> ssi:boot:base:linear: booting n12 (merlin.math.hmc.edu)
n-1<15868> ssi:boot:base:linear: booting n13 (oberon.math.hmc.edu)
n-1<15868> ssi:boot:base:linear: booting n14 (random.math.hmc.edu)
n-1<15868> ssi:boot:base:linear: booting n15 (sand.math.hmc.edu)
n-1<15868> ssi:boot:base:linear: finished
Listing Active Nodes
The lamnodes command lists the nodes that are currently part of your LAM cluster.
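For example, after the lamboot above you can run
dworkin% lamnodes
to verify that all sixteen nodes joined the cluster.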
Monitoring LAM Processes
The mpitask and mpimsg commands allow
you to monitor different aspects of the LAM system.
mpitask shows information about tasks running on the
system, whereas mpimsg shows information about the
messages being passed from node to node through the system.
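For example, while a job started with mpirun is still running, you can check on it from another shell on dworkin (both commands here are run with no arguments; see their manpages for options):
dworkin% mpitask
dworkin% mpimsg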
“Cleaning” LAM
In between runs of mpirun, you should run the lamclean program to “clean up” any
leftover processes, unreceived messages, and so forth.
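For example (the -v flag just makes lamclean report what it is cleaning up):
dworkin% lamclean -v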
Ending Your LAM Session
To close down your LAM session on all nodes, run
dworkin% lamhalt -v
If lamhalt fails (because one or more nodes have
crashed, for example), you may need to use the wipe
tool, which attempts to connect to each node from your original
lamboot and shut down any LAM processes running
there. Like lamboot, wipe takes as its argument a file
containing node names.
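For example, if you booted from the full host list, the matching cleanup command would be something like the following (-v again just gives verbose output):
dworkin% wipe -v /etc/lam/lam-full.bhost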
Some Code Examples
The following code examples are taken from the LAM-MPI user manual.
C
Copy the following code into a file called hello.c:
#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[]) {
    int myrank, mysize;

    /* Initialize MPI and find out how many processes there are
       and which one we are. */
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &mysize);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);

    printf("Hello World! I am %i of %i\n", myrank, mysize);

    /* Shut MPI down cleanly before exiting. */
    MPI_Finalize();
    return 0;
}
Compile and link the program:
dworkin% mpicc hello.c -o hello.mpi
Now run the program:
dworkin% mpirun -np 64 hello.mpi
(The -np flag specifies the
number of processes to start; -np n tells mpirun
to run n copies of the program spread across the
available nodes; see the mpirun manpage for more flags.)
If all goes well, you should see output similar to the following:
Hello World! I am 0 of 64
Hello World! I am 8 of 64
Hello World! I am 24 of 64
Hello World! I am 16 of 64
Hello World! I am 2 of 64
Hello World! I am 4 of 64
Hello World! I am 10 of 64
Hello World! I am 20 of 64
Hello World! I am 18 of 64
Hello World! I am 6 of 64
Hello World! I am 26 of 64
Hello World! I am 22 of 64
Hello World! I am 14 of 64
Hello World! I am 30 of 64
Hello World! I am 12 of 64
Hello World! I am 28 of 64
Hello World! I am 41 of 64
Hello World! I am 57 of 64
Hello World! I am 49 of 64
Hello World! I am 34 of 64
Hello World! I am 53 of 64
Hello World! I am 50 of 64
Hello World! I am 38 of 64
Hello World! I am 42 of 64
Hello World! I am 37 of 64
Hello World! I am 54 of 64
Hello World! I am 1 of 64
Hello World! I am 63 of 64
Hello World! I am 58 of 64
Hello World! I am 46 of 64
Hello World! I am 60 of 64
Hello World! I am 40 of 64
Hello World! I am 56 of 64
Hello World! I am 48 of 64
Hello World! I am 36 of 64
Hello World! I am 39 of 64
Hello World! I am 43 of 64
Hello World! I am 52 of 64
Hello World! I am 55 of 64
Hello World! I am 35 of 64
Hello World! I am 47 of 64
Hello World! I am 51 of 64
Hello World! I am 15 of 64
Hello World! I am 59 of 64
Hello World! I am 62 of 64
Hello World! I am 3 of 64
Hello World! I am 33 of 64
Hello World! I am 11 of 64
Hello World! I am 19 of 64
Hello World! I am 7 of 64
Hello World! I am 23 of 64
Hello World! I am 27 of 64
Hello World! I am 9 of 64
Hello World! I am 61 of 64
Hello World! I am 17 of 64
Hello World! I am 25 of 64
Hello World! I am 31 of 64
Hello World! I am 5 of 64
Hello World! I am 21 of 64
Hello World! I am 29 of 64
Hello World! I am 32 of 64
Hello World! I am 45 of 64
Hello World! I am 44 of 64
Hello World! I am 13 of 64
(The order that the nodes report back in may be different.)
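The hello program never actually sends a message. As a next step, here is a minimal sketch, not taken from the LAM-MPI user manual, in which rank 0 sends an integer to rank 1 with MPI_Send and rank 1 receives it with MPI_Recv; the file name pingpong.c and the message value are just made up for illustration. Compile and run it exactly as above, e.g. mpicc pingpong.c -o pingpong.mpi followed by mpirun -np 2 pingpong.mpi.
#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[]) {
    int myrank, mysize;
    int value;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &mysize);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);

    if (mysize < 2) {
        if (myrank == 0)
            printf("Run this with at least two processes.\n");
    } else if (myrank == 0) {
        /* Rank 0 sends a single integer (tag 0) to rank 1. */
        value = 42;
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        printf("Rank 0 sent %d to rank 1\n", value);
    } else if (myrank == 1) {
        /* Rank 1 receives one integer (tag 0) from rank 0. */
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
        printf("Rank 1 received %d from rank 0\n", value);
    }

    MPI_Finalize();
    return 0;
}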
C++
Copy the following code into a file called hello.cc:
#include <iostream>
#include <mpi.h>

using namespace std;

int main(int argc, char *argv[]) {
    int rank, size;

    // Initialize MPI and look up this process's rank and the
    // total number of processes.
    MPI::Init(argc, argv);
    rank = MPI::COMM_WORLD.Get_rank();
    size = MPI::COMM_WORLD.Get_size();

    cout << "Hello, world! I am " << rank << " of " << size << endl;

    // Shut MPI down cleanly before exiting.
    MPI::Finalize();
    return 0;
}
Compile and link the program:
dworkin% mpiCC hello.cc -o hello++.mpi
or
dworkin% mpic++ hello.cc -o hello++.mpi
Run the program with
dworkin% mpirun -np 8 hello++.mpi
If all went well, the output should be similar to the following (the order may differ):
Hello, world! I am 0 of 8
Hello, world! I am 5 of 8
Hello, world! I am 1 of 8
Hello, world! I am 3 of 8
Hello, world! I am 6 of 8
Hello, world! I am 4 of 8
Hello, world! I am 2 of 8
Hello, world! I am 7 of 8
FORTRAN
This tutorial assumes basic knowledge of the UNIX environment. Remember that fixed-form Fortran requires executable statements to begin in column 7 or later, so indent each line of code by at least six spaces.
Copy the following code to a file called hello.f:
      program main
      include 'mpif.h'
      integer myrank, mysize, ierr

c     Initialize MPI and find out how many processes there are
c     and which one we are.
      call MPI_INIT(ierr)
      call MPI_COMM_SIZE(MPI_COMM_WORLD, mysize, ierr)
      call MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)

      print *, 'Hello World! I am ', myrank, ' of ', mysize, '.'

c     Shut MPI down cleanly before exiting.
      call MPI_FINALIZE(ierr)
      stop
      end
Compile and link the program with
dworkin% mpif77 hello.f -o hellof.mpi
Run it with
dworkin% mpirun -np 8 hellof.mpi
You should see results similar to
Hello World! I am 0 of 8
Hello World! I am 5 of 8
Hello World! I am 1 of 8
Hello World! I am 3 of 8
Hello World! I am 6 of 8
Hello World! I am 4 of 8
Hello World! I am 2 of 8
Hello World! I am 7 of 8



