-
Notifications
You must be signed in to change notification settings - Fork 808
[SYCL] Don't return last event in ext_oneapi_submit_barrier
#20159
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am a little surprised that a marker event would be slower than actually submitting a barrier. Of course, the adapters/backends could be optimizing the barriers on in-order queues themselves, i.e. by storing the latest event themselves or by submitting markers internally. As such, I am all for dropping the complexity from the runtime.
Yes, L0 adapter has the same optimization, that's why removing optimization from sycl level restores performance, because markers are not inserted, but optimization is still in place at adapter level. |
# https://github.com/intel/llvm/pull/20159 prevents returning last event as an | ||
# optimization for submitting barrier to an empty IOQ. However, the test | ||
# actually checks whether last event is returned or not, so it needs to be | ||
# updated to match the new behavior. ext_oneapi_submit_barrier spec doesn't | ||
# require last event to be returned, so this is not an ABI break. | ||
InorderQueue/in_order_ext_oneapi_submit_barrier.cpp | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@xtian-github @gmlueck I'd need approval for excluding InorderQueue/in_order_ext_oneapi_submit_barrier.cpp
from 6.3 ABI compatibility testing. This test checks whether last event is returned by ext_oneapi_submit_barrier()
but after this PR, we no longer return last event. Returning last event is not required by ext_oneapi_submit_barrier
spec, so it is not strictly an ABI break.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is OK, but just checking to make sure I understand ... The spec for queue::ext_oneapi_submit_barrier
does require that function to return an event
. However, the test was checking to see if the returned event was the last event (i.e. the event that was returned from the previous submit). This is an implementation detail, not part of the specified API. Therefore, the test being excluded is not really testing the API; it's testing the implementation. Since the implementation changed, we need to change the test also.
Is that correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for confirming. I approved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exclusion of "InorderQueue/in_order_ext_oneapi_submit_barrier.cpp" is OK.
…20159) **Problem** As an optimization in `ext_oneapi_submit_barrier`, for in-order queue, when waitlist is empty, we return the last event, if it is available. However, after intel#18277, we no longer store last event and instead submit a marker to get the last event. And this caused performance regression in some of our internal benchmarks. **Solution** Don't return last event and instead submit a barrier with empty waitlist.
This is a cherry-pick of #20159 (and #20241) **Problem** As an optimization in `ext_oneapi_submit_barrier`, for in-order queue, when waitlist is empty, we return the last event, if it is available. However, after #18277, we no longer store last event and instead submit a marker to get the last event. And this caused performance regression in some of our internal benchmarks. **Solution** Don't return last event and instead submit a barrier with empty waitlist. Patch-by: Udit Kumar Agarwal <[email protected]>
{ | ||
// Test cast 3 - empty queue. | ||
std::cout << "Test 3" << std::endl; | ||
sycl::queue EmptyQ({sycl::property::queue::in_order{}}); | ||
auto BarrierEvent = EmptyQ.ext_oneapi_submit_barrier(); | ||
assert( | ||
BarrierEvent.get_info<sycl::info::event::command_execution_status>() == | ||
sycl::info::event_command_status::complete); | ||
BarrierEvent.wait(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was this test removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AlexeySachkov Here's my line of reasoning:
IIUC, for IOQs, ext_submit_barrier()
is non-blocking and returns an event, whose state will transition to complete
, when all previously submitted commands to IOQ also completes. Now, when we submit a barrier to an empty IOQ, (1) should the implementation immediately return a completed event or (2) can a non-blocking implementation return an event, take some time to figure out if queue is empty or not, and if empty, transition the event to be completed?
The test checks for (1) but IIUC, spec doesn't mandate that.
Looking at UR's implementation of urEnqueueEventsWaitWithBarrierExt
, it follows (2) - it returns an event and submits zeCommandListAppendSignalEvent
to L0. That's why this test was flakily failing in pre-commit CI of this PR. Before this PR, at SYCL RT level, we check if queue is empty or not and according return a completed event or submit urEnqueueEventsWaitWithBarrierExt
. Since getting last event and checking whether queue is empty or not is expensive and (1) is not mandated by spec, I removed the SYCL RT check along with this test.
The changes in intel#20159 removed a test case checking that barriers on empty queues would be considered complete immediately. This commit reintroduces it with a fix for the case. Signed-off-by: Larsen, Steffen <[email protected]>
…0263) The changes in #20159 removed a test case checking that barriers on empty queues would be considered complete immediately. This commit reintroduces it with a fix for the case. --------- Signed-off-by: Larsen, Steffen <[email protected]>
Problem
As an optimization in
ext_oneapi_submit_barrier
, for in-order queue, when waitlist is empty, we return the last event, if it is available. However, after #18277, we no longer store last event and instead submit a marker to get the last event. And this caused performance regression in some of our internal benchmarks.Solution
Don't return last event and instead submit a barrier with empty waitlist.