
Commit 63e86c6

l1ll1 authored and vsoch committed
Update based on additional info provided by Greg in the mailing list thread "mpi and portability" on 2017-05-11/12 (#74)
1 parent e1a181b commit 63e86c6

File tree

1 file changed: 15 additions, 10 deletions


pages/docs/admin-docs/docs-hpc.md

Lines changed: 15 additions & 10 deletions
@@ -16,9 +16,10 @@ We are in the process of developing Singularity Hub, which will allow for genera

- The <a href="http://sherlock.stanford.edu" target="_blank" class="no-after">Sherlock cluster</a> at <a href="https://srcc.stanford.edu/" class="no-after" target="_blank">Stanford University</a>
- <a href="https://www.xsede.org/news/-/news/item/7624" target="_blank" class="no-after">SDSC Comet and Gordon</a> (XSEDE)
+- <a href="http://docs.massive.org.au/index.html" target="_blank" class="no-after">MASSIVE M1 M2 and M3</a> (Monash University and Australian National Merit Allocation Scheme)

### Integration with MPI
-Another result of the Singularity architecture is the ability to properly integrate with the Message Passing Interface (MPI). Work has already been done for out of the box compatibility with Open MPI (both in Open MPI v2.x as well as part of Singularity). The Open MPI/Singularity workflow works as follows:
+Another result of the Singularity architecture is the ability to properly integrate with the Message Passing Interface (MPI). Work has already been done for out of the box compatibility with Open MPI (both in Open MPI v2.1.x as well as part of Singularity). The Open MPI/Singularity workflow works as follows:

1. mpirun is called by the resource manager or the user directly from a shell
2. Open MPI then calls the process management daemon (ORTED)
@@ -27,7 +28,9 @@ Another result of the Singularity architecture is the ability to properly integr
5. Singularity then launches the MPI application within the container
6. The MPI application launches and loads the Open MPI libraries
7. The Open MPI libraries connect back to the ORTED process via the Process Management Interface (PMI)
-8. At this point the processes within the container run as they would normally directly on the host at full bandwidth! This entire process happens behind the scenes, and from the user's perspective running via MPI is as simple as just calling mpirun on the host as they would normally.
+8. At this point the processes within the container run as they would normally directly on the host.
+
+This entire process happens behind the scenes, and from the user's perspective running via MPI is as simple as just calling mpirun on the host as they would normally.
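
As a sketch of that user-facing step, a typical launch is a single command; the container image and program name here are the ones used in the walkthrough that follows and are otherwise illustrative:

```bash
# Launch 20 MPI ranks on the host; each rank invokes Singularity, which runs
# the MPI program from inside the container image
$ mpirun -np 20 singularity exec /tmp/Centos-7.img /usr/bin/ring
```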

Below are example snippets of building and installing OpenMPI into a container and then running an example MPI program through Singularity.

@@ -38,8 +41,14 @@ Below are example snippets of building and installing OpenMPI into a container a

#### MPI Development Example

-**What are supported Open MPI Version(s)?**
-To achieve proper container'ized Open MPI support, you must use Open MPI version 2.1. Open MPI version 2.1.0 includes a bug in its configure script affecting some interfaces (at least Mellanox cards operating in RoCE mode using libmxm). For this reason, we show the example first.
+**What are supported Open MPI Version(s)?**
+To achieve proper containerized Open MPI support, you should use Open MPI version 2.1. There are, however, three caveats:
+1. Open MPI 1.10.x *may* work, but we expect you will need exactly matching versions of PMI and Open MPI on both host and container (the 2.1 series should relax this requirement)
+2. Open MPI 2.1.0 has a bug affecting compilation of libraries for some interfaces (in particular, Mellanox interfaces using libmxm are known to fail). If you're in this situation, you should use the master branch of Open MPI rather than the release.
+3. Using Open MPI 2.1 does not magically allow your container to connect to networking fabric libraries on the host. If your cluster has, for example, an InfiniBand network, you still need to install the OFED libraries into the container. Alternatively, you could bind mount both the Open MPI and networking libraries into the container, but this could run afoul of glibc compatibility issues (it is generally OK if the container glibc is more recent than the host's, but not the other way around)
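
The bind-mount alternative mentioned in caveat 3 can be sketched roughly as follows; the library paths are illustrative assumptions and will differ between sites and distributions:

```bash
# Hypothetical paths: bind the host's Open MPI and InfiniBand userspace libraries
# into the container instead of installing them inside the image
$ singularity exec --bind /usr/lib64/openmpi:/usr/lib64/openmpi \
                   --bind /usr/lib64/libibverbs:/usr/lib64/libibverbs \
                   /tmp/Centos-7.img /usr/bin/ring
```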
+
+#### Code Example using Open MPI 2.1.0 Stable

```bash
$ # Include the appropriate development tools into the container (notice we are calling
@@ -51,10 +60,6 @@ $ wget https://www.open-mpi.org/software/ompi/v2.1/downloads/openmpi-2.1.0.tar.b
$ tar jxf openmpi-2.1.0.tar.bz2
$ cd openmpi-2.1.0
$
-$ # Build OpenMPI in the working directory, using the tool chain within the container
-$ # This step is unusual in a stable release but there is a bug in the configure script
-$ # affecting some interfaces
-$ singularity exec /tmp/Centos-7.img ./autogen.pl
$ singularity exec /tmp/Centos-7.img ./configure --prefix=/usr/local
$ singularity exec /tmp/Centos-7.img make
$
@@ -72,9 +77,9 @@ $ mpirun -np 20 singularity exec /tmp/Centos-7.img /usr/bin/ring

```

-#### Code Example
+#### Code Example using Open MPI git master

-The following example (using their master) should work fine on most hardware but if you have an issue, try running this example below:
+The previous example (using the Open MPI 2.1.0 stable release) should work fine on most hardware, but if you have an issue, try running the example below (using the Open MPI master branch):
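
For orientation, here is a rough sketch of that master-branch build, assuming git is available on the host and the GNU autotools are available in the container, and mirroring the autogen/configure/make pattern shown above; treat it as an outline rather than the exact recipe:

```bash
$ # Hypothetical outline: fetch the Open MPI master branch and build it with the
$ # tool chain inside the container (mirroring the stable-release steps above)
$ git clone https://github.com/open-mpi/ompi.git
$ cd ompi
$ singularity exec /tmp/Centos-7.img ./autogen.pl
$ singularity exec /tmp/Centos-7.img ./configure --prefix=/usr/local
$ singularity exec /tmp/Centos-7.img make
```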

```bash
$ # Include the appropriate development tools into the container (notice we are calling
