Fix typos in OSC RDMA BTL allowlist #7823
|
@bwbarrett @awlauria @rhc54 This is the cause of so many of the Cisco MTT failures over the past month or two. |
|
Either someone is going to put in effort to fix the PT2PT component or fix the RDMA component. Why would we put effort into the PT2PT component? |
|
I believe nobody disputes that statement - the issue is: who is going to put in the effort? Simply removing p2p isn't the answer. To date, nobody has been willing to make the effort. Perhaps providing a clean mechanism by which BTLs can reject OSC operations with an appropriate error would solve it - but somebody would have to make the effort to create that too. 🤷‍♂️ |
|
I will take a look at btl/tcp. |
|
Even if BTL tcp is fixed, do users running on a single node (e.g., a laptop) have to run Same question for usNIC: if I'm using usNIC (which can't be used for loopback communication), do I have to run |
"openib" no longer exists. "tcp" had a typo. Signed-off-by: Jeff Squyres <[email protected]>
fd65517 to 18cfcc8
|
Per discussion on the weekly OMPI call today, I changed this PR to solely remove |
  OBJ_RELEASE(new_enum);

- ompi_osc_rdma_btl_names = "openib,ugni,uct,ucp";
+ ompi_osc_rdma_btl_names = "ugni,uct,tcp";
should ofi be in this list?
Good question. I defer to @hjelmn to answer that...
bwbarrett
left a comment
forgot to approve earlier...
|
bot:ompi:retest |
|
bot:ompi:retest |
|
One more time.. |
|
bot:ompi:retest |
OSC rdma had a reference to openib, which no longer exists on master. It also had a typo for the tcp BTL (but even after fixing that typo, OSC rdma does not activate itself when the TCP BTL is used).
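To illustrate why a typo in a comma-separated allowlist goes unnoticed, here is a minimal sketch of an exact-match lookup. The helper name_in_list is hypothetical and is not taken from the Open MPI source; it only assumes that names are compared entry by entry, so a misspelled entry simply never matches and nothing reports an error.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not Open MPI code): returns true when `name`
 * exactly matches one comma-separated entry of `list`. */
static bool name_in_list(const char *list, const char *name)
{
    size_t name_len = strlen(name);
    const char *entry = list;

    while (*entry != '\0') {
        /* Length of the current entry, up to the next comma or end of string. */
        size_t entry_len = strcspn(entry, ",");

        if (entry_len == name_len && strncmp(entry, name, name_len) == 0) {
            return true;
        }

        /* Move past this entry and the comma that follows it, if any. */
        entry += entry_len;
        if (*entry == ',') {
            entry++;
        }
    }
    return false;
}
```

With the two strings from the diff, name_in_list("openib,ugni,uct,ucp", "tcp") is false while name_in_list("ugni,uct,tcp", "tcp") is true, so under this assumed matching scheme the old list could never select the tcp BTL.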