Thank you for taking the time to submit an issue!
Background information
What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch name and hash, etc.)
v4.0.3
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
I did not perform the installation myself. I'm trying to get this information from the person who did it.
Please describe the system on which you are running
- Operating system: Red Hat 4.4.7-23
- Cluster with Slurm 15.08.7
- Computer hardware: Intel(R) Xeon(R) CPU E5-2683 v4
- Network type: InfiniBand
Details of the problem
I'm trying to run a program with repeated ring-style communication, as in the example test.cpp below. When I run the example on my cluster across multiple nodes with the command salloc -N2 --hint=compute_bound --exclusive mpirun test.o (because I am using Slurm), the output is similar to:
ID 0 Time 6.001007
ID 1 Time 6.001102
However, I was expecting times of approximately 3.0 and 6.0 seconds: each rank does 3 iterations of sleep(id+1), so rank 0 should spend about 3 × 1 s sleeping and rank 1 about 3 × 2 s. This wrong behavior generally happens when RDMA is not being used, so I decided to force the program to use RDMA with the command salloc -N2 --hint=compute_bound --exclusive mpirun --mca osc rdma test.o. However, I received the following error:
[r1i1n10:08761] *** An error occurred in MPI_Win_allocate
[r1i1n10:08761] *** reported by process [418054145,1]
[r1i1n10:08761] *** on communicator MPI_COMM_WORLD
[r1i1n10:08761] *** MPI_ERR_WIN: invalid window
[r1i1n10:08761] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[r1i1n10:08761] *** and potentially your MPI job)
[service0:31286] 1 more process has sent help message help-mpi-errors.txt / mpi_errors_are_fatal
[service0:31286] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
I did not expect this because the cluster has InfiniBand. The first case is already strange, and in the second case I do not understand the error at all.
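In case it helps narrow this down, here is a minimal sketch of how the MPI_Win_allocate failure could be inspected without the whole job aborting. It uses only standard MPI error handling (MPI_ERRORS_RETURN plus MPI_Error_string); I have not run this variant on the cluster yet:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);

    // Return error codes instead of aborting (default is MPI_ERRORS_ARE_FATAL).
    MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

    MPI_Win window;
    int *window_buffer;
    int err = MPI_Win_allocate(sizeof(int), sizeof(int), MPI_INFO_NULL,
                               MPI_COMM_WORLD, &window_buffer, &window);
    if (err != MPI_SUCCESS) {
        char msg[MPI_MAX_ERROR_STRING];
        int len;
        MPI_Error_string(err, msg, &len);
        fprintf(stderr, "MPI_Win_allocate failed: %s\n", msg);
        MPI_Abort(MPI_COMM_WORLD, err);
    }

    MPI_Win_free(&window);
    MPI_Finalize();
    return 0;
}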
Observations
- The program runs as expected on a single node with multiple processes, with or without --mca osc rdma.
- Some time ago, I ran the same code and it worked. I have not noticed any change in the code or the environment since then.
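For reference, the one-sided (osc) components actually present in this Open MPI build can be listed with the standard ompi_info tool; I still need to capture this output on the cluster:

ompi_info | grep osc

If the rdma component was compiled in, I would expect it to show up as an "MCA osc: rdma ..." line.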
test.cpp
#include <iostream>
#include <unistd.h>
#include <stdio.h>
#include <mpi.h>
#include <math.h>

int main(int argc, char *argv[])
{
    MPI_Win window;
    int id, comm_sz;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &id);
    MPI_Comm_size(MPI_COMM_WORLD, &comm_sz);

    int get_number;
    int next = (id + 1) % comm_sz;   // ring: each rank reads from its right neighbor
    double t;
    int *window_buffer;

    // Allocate one int of window memory on every rank.
    MPI_Win_allocate(sizeof(int), sizeof(int), MPI_INFO_NULL,
                     MPI_COMM_WORLD, &window_buffer, &window);
    *window_buffer = id;             // give the exposed memory a defined value

    t = MPI_Wtime();
    for (int i = 0; i < 3; i++) {
        sleep(id + 1);               // rank 0 sleeps 1 s per iteration, rank 1 sleeps 2 s, ...
        // Passive-target read of the neighbor's window.
        MPI_Win_lock(MPI_LOCK_SHARED, next, 0, window);
        MPI_Get(&get_number, 1, MPI_INT, next, 0, 1, MPI_INT, window);
        MPI_Win_unlock(next, window);
    }
    printf("ID %i Time %lf\n", id, MPI_Wtime() - t);

    MPI_Win_free(&window);           // windows must be freed before MPI_Finalize
    MPI_Finalize();
    return 0;
}
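For completeness, this is how the example is built and launched (mpicxx is Open MPI's C++ compiler wrapper; I kept the binary name test.o so it matches the commands above):

mpicxx test.cpp -o test.o
salloc -N2 --hint=compute_bound --exclusive mpirun test.o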