shmem_init fails in ompi_comm_activate_nb #12250

@devreal

Description

The hello_oshmem_c.c example fails in shmem_init with a SEGFAULT in ompi_comm_activate_nb:

 0  /home/jschuchart/opt/ucx-1.14.1/lib/libucs.so.0(ucs_handle_error+0x294) [0x7fbd0ddea7a4]
 1  /home/jschuchart/opt/ucx-1.14.1/lib/libucs.so.0(+0x3295c) [0x7fbd0ddea95c]
 2  /home/jschuchart/opt/ucx-1.14.1/lib/libucs.so.0(+0x32b87) [0x7fbd0ddeab87]
 3  /lib64/libc.so.6(+0x54db0) [0x7fbd0e18cdb0]
 4  /lib64/libc.so.6(+0xb9715) [0x7fbd0e1f1715]
 5  /home/jschuchart/opt/ompi-main/lib/libmpi.so.0(+0x7b8e9) [0x7fbd0e3c88e9]
 6  /home/jschuchart/opt/ompi-main/lib/libmpi.so.0(ompi_comm_activate_nb+0x454) [0x7fbd0e3c6de3]
 7  /home/jschuchart/opt/ompi-main/lib/libmpi.so.0(ompi_comm_activate+0x4c) [0x7fbd0e3c6ea0]
 8  /home/jschuchart/opt/ompi-main/lib/libmpi.so.0(ompi_comm_create_group+0x23d) [0x7fbd0e3c0734]
 9  /home/jschuchart/opt/ompi-main/lib/liboshmem.so.0(mca_scoll_mpi_comm_query+0x1a4) [0x7fbd0ea7e594]
10  /home/jschuchart/opt/ompi-main/lib/liboshmem.so.0(+0x1892c4) [0x7fbd0ea7d2c4]
11  /home/jschuchart/opt/ompi-main/lib/liboshmem.so.0(+0x189288) [0x7fbd0ea7d288]
12  /home/jschuchart/opt/ompi-main/lib/liboshmem.so.0(+0x189158) [0x7fbd0ea7d158]
13  /home/jschuchart/opt/ompi-main/lib/liboshmem.so.0(+0x188f8f) [0x7fbd0ea7cf8f]
14  /home/jschuchart/opt/ompi-main/lib/liboshmem.so.0(mca_scoll_base_select+0x16e) [0x7fbd0ea7bf0f]
15  /home/jschuchart/opt/ompi-main/lib/liboshmem.so.0(mca_scoll_enable+0x9c) [0x7fbd0ea7a802]
16  /home/jschuchart/opt/ompi-main/lib/liboshmem.so.0(+0x4a3dc) [0x7fbd0e93e3dc]
17  /home/jschuchart/opt/ompi-main/lib/liboshmem.so.0(oshmem_shmem_init+0xae) [0x7fbd0e93de8d]
18  /home/jschuchart/opt/ompi-main/lib/liboshmem.so.0(+0x4dd32) [0x7fbd0e941d32]
19  /home/jschuchart/opt/ompi-main/lib/liboshmem.so.0(shmem_init+0x19) [0x7fbd0e941c87]
20  ./hello_oshmem() [0x4011a3]
21  /lib64/libc.so.6(+0x3feb0) [0x7fbd0e177eb0]
22  /lib64/libc.so.6(__libc_start_main+0x80) [0x7fbd0e177f60]
23  ./hello_oshmem() [0x4010c5]

This happens when more than one process is launched. Unfortunately, I cannot get a proper backtrace on that machine.
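
Several frames above carry only a library path and an offset, such as `libmpi.so.0(+0x7b8e9)`. As a rough sketch, assuming the libraries were built with debug information and that binutils `addr2line` is available on the machine, those offsets could be turned into lookup commands with a small script. The helper name `addr2line_commands` is hypothetical and not part of Open MPI or UCX; frames that already show a symbol, or that have no offset, are skipped.

```python
import re

# Matches offset-only frames such as:
#    5  /path/to/libmpi.so.0(+0x7b8e9) [0x7fbd0e3c88e9]
# Frames with a symbol name (e.g. "(ompi_comm_activate_nb+0x454)") or with
# empty parentheses (e.g. "./hello_oshmem()") are intentionally not matched.
FRAME_RE = re.compile(
    r"^\s*(\d+)\s+([^\s(]+)\(\+(0x[0-9a-fA-F]+)\)\s+\[0x[0-9a-fA-F]+\]\s*$"
)


def addr2line_commands(backtrace):
    """Return one addr2line command per offset-only frame (hypothetical helper)."""
    commands = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line)
        if match:
            _frame, path, offset = match.groups()
            # -f prints the function name, -C demangles, -e selects the object file.
            commands.append(f"addr2line -f -C -e {path} {offset}")
    return commands


if __name__ == "__main__":
    sample = (
        " 5  /home/jschuchart/opt/ompi-main/lib/libmpi.so.0(+0x7b8e9) [0x7fbd0e3c88e9]\n"
        " 6  /home/jschuchart/opt/ompi-main/lib/libmpi.so.0(ompi_comm_activate_nb+0x454) [0x7fbd0e3c6de3]\n"
    )
    for command in addr2line_commands(sample):
        print(command)
```

Whether the resulting commands yield useful file and line information depends on how the libraries were compiled; without debug symbols, `addr2line` will only print `??`.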
