Skip to content

Conversation

@badgerious
Copy link
Contributor

Once any number of events are read, return immediately, rather than
waiting for fi_cq_read() to return FI_EAGAIN or an error. This can
improve observed latency if the user application is in a blocking call
waiting for us to return. Breaking here also means
ofi_progress_event_count serves as an upper bound for the total number
of events read in a single call (without it, we could read far more, as
long as new events continue to arrive).

Signed-off-by: Eric Badger [email protected]

@ompiteam-bot
Copy link

Can one of the admins verify this patch?

Copy link
Member

@bwbarrett bwbarrett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that progress_event_count is fairly large, this seems like a good balance between latency and efficiency. However, given the rest of the loop, the right fix should be to remove the while (true) {, rather than just adding a break.

@bwbarrett
Copy link
Member

ok to test

Once any number of events are read, return immediately, rather than
waiting for fi_cq_read() to return FI_EAGAIN or an error. This can
improve observed latency if the user application is in a blocking call
waiting for us to return. Deleting the while loop here also means
ofi_progress_event_count serves as an upper bound for the total number
of events read in a single call (with the while loop we might read far
more, as long as new events continue to arrive).

Signed-off-by: Eric Badger <[email protected]>
@badgerious badgerious force-pushed the mtl_ofi_cqread_break branch from 4046445 to 35dbc18 Compare June 11, 2020 17:11
@badgerious
Copy link
Contributor Author

However, given the rest of the loop, the right fix should be to remove the while (true) {, rather than just adding a break.

Good point. I've reorganized it to drop the while loop.

@bwbarrett
Copy link
Member

bot:aws:retest - looks like clang37 had a hang running the connectivity test.

@awlauria
Copy link
Contributor

bot:ompi:retest

3 similar comments
@awlauria
Copy link
Contributor

bot:ompi:retest

@awlauria
Copy link
Contributor

bot:ompi:retest

@awlauria
Copy link
Contributor

awlauria commented Jul 2, 2020

bot:ompi:retest

@awlauria awlauria merged commit dbc5675 into open-mpi:master Jul 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants