Installing:
Make sure a LAM or OMPI installation built with shared libraries is
available on all nodes in your cluster. You might use these commands
to check:

$ which mpirun                  # discover which one comes first in your path
~/openmpi-1.2.3/bin/mpirun      # might be this
~/lam-7.1.3/bin/mpirun          # or this other

$ ls `laminfo -path libdir | cut -d: -f2`       # if using LAM
lam                liblamf77mpi.so.0.0.0  liblammpi++.so.0      libmpi.la
liblam.a           liblam.la              liblammpi++.so.0.0.0  libmpi.so
liblamf77mpi.a     liblammpi++.a          liblam.so             libmpi.so.0
liblamf77mpi.la    liblammpi++.la         liblam.so.0           libmpi.so.0.0.0
liblamf77mpi.so    liblammpio.a           liblam.so.0.0.0
liblamf77mpi.so.0  liblammpi++.so         libmpi.a

$ ls `ompi_info -path libdir | cut -d: -f2`     # if using OMPI
libmca_common_sm.la        libmpi_f77.so.0      libopen-pal.la
libmca_common_sm.so        libmpi_f77.so.0.0.0  libopen-pal.so
libmca_common_sm.so.0      libmpi_f90.la        libopen-pal.so.0
libmca_common_sm.so.0.0.0  libmpi_f90.so        libopen-pal.so.0.0.0
libmpi_cxx.la              libmpi_f90.so.0      libopen-rte.la
libmpi_cxx.so              libmpi_f90.so.0.0.0  libopen-rte.so
libmpi_cxx.so.0            libmpi.la            libopen-rte.so.0
libmpi_cxx.so.0.0.0        libmpi.so            libopen-rte.so.0.0.0
libmpi_f77.la              libmpi.so.0          mpi.mod
libmpi_f77.so              libmpi.so.0.0.0      openmpi
Notice the libXXX.so dynamic libraries, as opposed to the static
libXXX.a ones. If you have no .so libraries, recompile LAM (configuring
with --enable-shared) or OMPI (without overriding the default with
--disable-shared). Do not forget to double-check that the same
installation is available on all nodes. For instance, these commands
should find the same libraries on all nodes (change the hostnames hN
to match your cluster):

$ for i in h1 h2 h4 h7 h8; do ssh $i ldd `which mpirun`; done
$ for i in h1 h2 h4 h7 h8; do ssh $i ldd `which mpirun` | grep lam; done
$ for i in h1 h2 h4 h7 h8; do ssh $i ldd `which mpirun` | grep open; done
$ for i in h1 h2 h4 h7 h8; do ssh $i ldd `which mpirun` | grep mpi; done
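The shared-library check described above could be automated along these lines. This is only a minimal sketch: the function name has_shared_lib and its default prefix libmpi are assumptions made for illustration (both listings above do show a libmpi.so), not part of the toolbox.

```shell
# Hypothetical helper: succeed if directory $1 holds at least one shared
# library whose name starts with prefix $2 (assumed default: libmpi).
has_shared_lib() {
    dir=$1
    prefix=${2:-libmpi}
    for f in "$dir"/"$prefix".so*; do
        # If the glob matched nothing, $f is the literal pattern and -e fails.
        if [ -e "$f" ]; then
            return 0
        fi
    done
    return 1
}

# Usage sketch for LAM (use ompi_info instead of laminfo under OMPI).
# The substitution is left unquoted so the shell trims stray whitespace:
# if has_shared_lib $(laminfo -path libdir | cut -d: -f2); then
#     echo "shared MPI library found"
# else
#     echo "only static libraries: rebuild with shared-library support"
# fi
```

Running the same function through ssh on each node would still be needed to confirm that every node sees the same installation.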
Make sure your octave executable has been compiled with DLD support,
and that it is the same version on all cluster nodes. If you're not
sure, type octave_config_info at the octave prompt and look for the
"dld" flag. If you don't have it, recompile Octave using the configure
switch --enable-shared.

$ octave
octave:1> octave_config_info
ans =
{
  dld = 1
  ...
  DL_LDFLAGS = -shared
  ENABLE_DYNAMIC_LINKING = true
  ...
  SHARED_LIBS = true
  SHLEXT = so
  SHLEXT_VER = so.2.9.10
  SH_LD = g++
  SH_LDFLAGS = -shared
  SONAME_FLAGS = -Wl,-soname -Wl,oct-conf.h
  STATIC_LIBS = false
  version = 2.9.10
  ...
  unix = 1
  windows = 0
}
Notice the dld=1 line. Recompile Octave (configuring with
--enable-shared) if it is missing. For instance, you could use these
commands to check the version on all nodes:

$ for i in h1 h2 h4 h7 h8; do ssh $i which octave ; done
$ for i in h1 h2 h4 h7 h8; do ssh $i octave -v | head -1; done
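Comparing the per-node output by eye gets tedious on larger clusters. As a rough sketch, the loop output can be piped through a small filter that counts how many distinct answers came back. The function name distinct_values and the "host value" line format are assumptions made for this example.

```shell
# Hypothetical helper: read "host value..." lines on stdin, drop the host
# field, and print how many distinct values remain. A result of 1 means
# every node reported the same thing.
distinct_values() {
    cut -d' ' -f2- | sort -u | wc -l | tr -d ' '
}

# Usage sketch (needs ssh access to your own hostnames):
# for i in h1 h2 h4 h7 h8; do
#     echo "$i $(ssh $i octave -v | head -1)"
# done | distinct_values
```

Anything other than 1 suggests at least one node runs a different Octave version; the same filter should work for the ldd and which checks.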
Make sure the octave binary you have chosen is in your search path on
each node, as well as the chosen LAM/OMPI binaries. Also double-check
that the corresponding LAM/OMPI libraries (same version and configure
switches as the chosen LAM/OMPI binaries) are included in your
LD_LIBRARY_PATH. That is frequently overlooked. There are lots of
pitfalls here: editing .profile instead of .bashrc, using double
instead of single quotes in ssh commands, believing an interactive ssh
session will have the same environment as a remote ssh command...
Upgrading head nodes and forgetting about slave nodes is a common
pitfall too. For instance, use these commands, changing hostnames as
appropriate:

$ for i in h1 h2 h4 h7 h8; do ssh $i echo $i : '$PATH'; done
h1 : /home/user/octave-2.9.10/bin:/home/user/lam-7.1.3/bin:/usr/bin:[...snip...]:/home/user/bin
h2 : /home/user/octave-2.9.10/bin:/home/user/lam-7.1.3/bin:/usr/bin:[...snip...]:/home/user/bin
h4 : [...snip...]

$ for i in h1 h2 h4 h7 h8; do ssh $i echo $i : '$LD_LIBRARY_PATH'; done
h1 : /home/user/lam-7.1.3/lib:
h2 : /home/user/lam-7.1.3/lib:
h4 : /home/user/lam-7.1.3/lib:
h7 : /home/user/lam-7.1.3/lib:
h8 : /home/user/lam-7.1.3/lib:
Don't use double quotes instead of single ones around PATH (read the
bash manpage, section on QUOTING): with double quotes your local shell
expands the variable before ssh runs, so you see the local value
rather than the remote one. You could also use the "env" command
instead of "echo...". You can also use "which octave" or "which
mpirun", just to make sure.

$ for i in h1 h2 h4 h7 h8; do ssh $i echo $i : `which octave`; done
h1 : alias octave=octave -q /home/user/octave-2.9.10/bin/octave
h2 : alias octave=octave -q /home/user/octave-2.9.10/bin/octave
h4 : alias octave=octave -q /home/user/octave-2.9.10/bin/octave
h7 : alias octave=octave -q /home/user/octave-2.9.10/bin/octave
h8 : alias octave=octave -q /home/user/octave-2.9.10/bin/octave
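The quoting pitfall can be reproduced without a cluster. The sketch below uses "sh -c" as a stand-in for the remote shell that ssh would start, and DEMO_VAR is a made-up variable name; the expansion rules are the same as for '$PATH' in the ssh loops above.

```shell
# DEMO_VAR is a hypothetical variable; "local" plays the role of the
# value on the machine where you type the command.
DEMO_VAR=local
export DEMO_VAR

# Double quotes: the local shell substitutes $DEMO_VAR before the command
# string is handed over, so the receiving shell only sees the text "local".
double=$(DEMO_VAR=remote sh -c "echo $DEMO_VAR")

# Single quotes: the string is passed literally and expanded by the
# receiving shell, which has its own value for the variable.
single=$(DEMO_VAR=remote sh -c 'echo $DEMO_VAR')

echo "double quotes: $double"   # prints "double quotes: local"
echo "single quotes: $single"   # prints "single quotes: remote"
```

Unquoted backticks behave like the double-quoted case: in a loop such as the "which octave" one above, the command inside the backticks is evaluated by the local shell before ssh is started. If the remote answer is what you want, one option is to pass the whole command in single quotes, for example ssh $i 'which octave'.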
- A shared (among nodes, cluster-wide) HOME is strongly advised, as
  well as installing all the required software there. That way, you
  won't depend on anybody else. If not, you may need to play with your
  configuration and ask your sysadmin until the above requirements (or
  equivalent ones for non-ssh clusters) are met. The most common
  pitfalls are static-library or misconfigured LAM installs and
  different Octave/LAM/OMPI versions on different cluster nodes.
- Unzip & untar the toolbox. The recommended location is under
  ~/octave, so that a new directory ~/octave/mpitb appears.
- Enter the new mpitb subdir, read/edit the lam-bhost.def file to
  describe your cluster, and lamboot your LAM (not applicable if using
  OMPI).
- Run octave there and try some MPITB command such as MPI_Initialized
  or MPI_Init. It might work out-of-the-box.
- If not, your octave version or LAM/OMPI configure switches are
  probably too different and MPITB needs a remake: just enter the src
  subdir, make sure the commands "octave-config", "mkoctfile",
  "(lam/ompi_)info" and "mpiCC" all work and correspond to your desired
  Octave and LAM/OMPI --enable-shared versions, and type "make".
- Read the README file for more detailed instructions. Read the
  Makefile to understand why those 4 commands are required.
- Work through the demos to learn to use the toolbox. The recommended
  order is Hello, Pi, PingPong, Mandelbrot, NPB/EP.
- Remember to halt LAM when you leave (although it is not much of a
  problem if you leave the daemons running).
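Before typing "make" in the src subdir, the presence of the four commands mentioned in the steps above can be checked in one go. This is a minimal sketch; the function name missing_tools is an assumption for illustration, and it only tests that each command is found in PATH, not that it belongs to the intended Octave or LAM/OMPI version.

```shell
# Hypothetical helper: print the name of every command in the argument
# list that cannot be found in the current PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Usage sketch for a LAM setup (replace laminfo with ompi_info under OMPI):
# missing=$(missing_tools octave-config mkoctfile laminfo mpiCC)
# if [ -z "$missing" ]; then
#     make
# else
#     echo "not found in PATH:" $missing
# fi
```

Following up with "which" on each command, as in the earlier checks, would show whether the ones found are the ones you intended.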