v5.0.x: Pass oversubscribe status to MPI layer #9026
Port of #8998. This cannot be a direct cherry-pick as it requires update of the PMIx and PRRTE release branch pointers instead of their master branch equivalents.

bot:notacherrypick

Signed-off-by: Ralph Castain <[email protected]>
@rhc54 Do we need the OMPI code as well? All I see is the PMIx / PRRTE submodule updates here. Also, it looks like a legit compile fail in CI:
Jeff, Jeff, Jeff....I gather you failed to read the note on the quoted PR where I expressly directed that you need to add the OMPI bits?
Yeah, it's been reported on PRRTE as well. It's a VPATH issue. Sigh - someday they will outlaw that thing!
Update OMPI to check for PMIx attribute and set `ompi_mpi_oversubscribe` accordingly. Move logic for setting yield_when_idle to a place after the oversubscribe flag has been checked.

- change logic of setting ompi_mpi_yield_when_idle
- nit: change `ompi_mpi_oversubscribe` to `ompi_mpi_oversubscribed`
- add comment in ompi/runtime/params.h

This is a cherry-pick of the Open MPI parts of 2b335ed. The Open PMIx / PRRTE git submodule updates are in a different commit on this PR (because they're different than the git submodule updates on master).

Signed-off-by: Ralph Castain <[email protected]>
Signed-off-by: Jeff Squyres <[email protected]>
(cherry picked from commit 2b335ed)
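The ordering described in the commit message above (settle the oversubscribe flag first, then decide whether to yield when idle) can be sketched roughly as follows. This is a minimal illustration, not the code from 2b335ed: `idle_policy_t`, `resolve_idle_policy`, the `-1` "unset" convention, and the rule that an explicit user setting wins are all assumptions made for the example, and the real PMIx attribute lookup is replaced by plain function arguments.

```c
#include <stdbool.h>

/* Hypothetical container for the two flags named in the commit message. */
typedef struct {
    bool oversubscribed;   /* stands in for ompi_mpi_oversubscribed */
    bool yield_when_idle;  /* stands in for ompi_mpi_yield_when_idle */
} idle_policy_t;

/*
 * attr_found / attr_value: outcome of asking the runtime for the
 * oversubscribe attribute (the real lookup goes through PMIx and is
 * not reproduced here).
 * user_yield: -1 if the user expressed no preference, otherwise 0 or 1.
 * Both the -1 convention and the user-override rule are assumptions.
 */
static idle_policy_t resolve_idle_policy(bool attr_found, bool attr_value,
                                         int user_yield)
{
    idle_policy_t policy;

    /* Step 1: settle the oversubscribe flag. A missing attribute is
     * treated as "not oversubscribed" in this sketch. */
    policy.oversubscribed = attr_found && attr_value;

    /* Step 2: only after step 1, derive yield_when_idle. */
    if (user_yield < 0) {
        /* No explicit preference: follow the oversubscribe flag. */
        policy.yield_when_idle = policy.oversubscribed;
    } else {
        /* Explicit preference: keep it as given. */
        policy.yield_when_idle = (user_yield != 0);
    }
    return policy;
}
```

With these assumptions, `resolve_idle_policy(true, true, -1)` yields a policy that is both oversubscribed and yielding, while `resolve_idle_policy(true, true, 0)` stays oversubscribed but does not yield. The point of the ordering is that the default for yielding is never computed from a flag that has not been checked yet.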
Signed-off-by: Ralph Castain <[email protected]>
@jsquyres Should be ready to go now.
The IBM CI (XL) build failed! Please review the log, linked below. Gist: https://gist.github.com/773712119c3f0750dfbdbcede978e795
I'm sure there must be an error somewhere in the XL compile - but how is anyone supposed to find it in the midst of that ridiculous tsunami of warnings:
The IBM CI (PGI) build failed! Please review the log, linked below. Gist: https://gist.github.com/e4f760bdc20b51ccd2fb14a3b2c4d9c1
@awlauria will take a look at IBM CI failures today. Thanks!
IBM CI machine was overloaded causing timeouts. The issue should be resolved now.
Thanks!