mtl/ofi: break from progress loop when events are read #7802
Can one of the admins verify this patch?
bwbarrett
left a comment
Given that progress_event_count is fairly large, this seems like a good balance between latency and efficiency. However, given the rest of the loop, the right fix should be to remove the while (true) {, rather than just adding a break.
ok to test
Once any number of events are read, return immediately, rather than waiting for fi_cq_read() to return FI_EAGAIN or an error. This can improve observed latency if the user application is in a blocking call waiting for us to return. Deleting the while loop here also means ofi_progress_event_count serves as an upper bound for the total number of events read in a single call (with the while loop we might read far more, as long as new events continue to arrive).
Signed-off-by: Eric Badger <[email protected]>
Force-pushed from 4046445 to 35dbc18
Good point. I've reorganized it to drop the while loop.
bot:aws:retest - looks like clang37 had a hang running the connectivity test.
bot:ompi:retest
Once any number of events are read, return immediately, rather than
waiting for fi_cq_read() to return FI_EAGAIN or an error. This can
improve observed latency if the user application is in a blocking call
waiting for us to return. Breaking here also means
ofi_progress_event_count serves as an upper bound for the total number
of events read in a single call (without it, we could read far more, as
long as new events continue to arrive).
Signed-off-by: Eric Badger [email protected]
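For readers who want to see the behavioral difference described above, here is a minimal sketch that contrasts draining the queue in a loop with returning after a single bounded read. It does not use libfabric: `mock_cq`, `mock_cq_read()`, `MOCK_EAGAIN`, `progress_drain()`, and `progress_once()` are all hypothetical names invented for illustration, and the error handling in the real mtl/ofi progress function is omitted.

```c
#include <stddef.h>

/* Hypothetical stand-in for a completion queue; this is NOT the libfabric API. */
struct mock_cq {
    size_t pending;        /* events currently queued */
    size_t arrivals_left;  /* how many more times a new batch will arrive */
    size_t arrival_batch;  /* events that arrive after each successful read */
};

/* Placeholder for the "nothing to read" result (stands in for FI_EAGAIN). */
#define MOCK_EAGAIN (-11)

/* Read up to max events. Returns the number read, or MOCK_EAGAIN if empty. */
static int mock_cq_read(struct mock_cq *cq, size_t max)
{
    if (cq->pending == 0) {
        return MOCK_EAGAIN;
    }
    size_t n = cq->pending < max ? cq->pending : max;
    cq->pending -= n;
    /* Simulate new events arriving while the caller handles this batch. */
    if (cq->arrivals_left > 0) {
        cq->arrivals_left--;
        cq->pending += cq->arrival_batch;
    }
    return (int)n;
}

/* Loop variant: keep reading until the queue reports empty.
 * The total is unbounded as long as new events keep arriving. */
static int progress_drain(struct mock_cq *cq, size_t event_count)
{
    int total = 0;
    while (1) {
        int ret = mock_cq_read(cq, event_count);
        if (ret <= 0) {
            break;
        }
        total += ret;
    }
    return total;
}

/* Single-read variant: return as soon as one read completes, so
 * event_count is an upper bound on the events handled per call. */
static int progress_once(struct mock_cq *cq, size_t event_count)
{
    int ret = mock_cq_read(cq, event_count);
    return ret > 0 ? ret : 0;
}
```

With a queue that starts with 4 events and receives 3 further batches of 4, `progress_drain(&cq, 4)` keeps reading until every batch is consumed, while `progress_once(&cq, 4)` returns after the first 4 events and leaves the rest for the next call.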