[HIP][CUDA] Refactor using profiling events #1634

Merged: 10 commits, Jun 5, 2024

@keyradical keyradical commented May 20, 2024

HIP changes:

  • To match the current behaviour of the CUDA adapter, EvBase in HIP was moved to the device, and the getElapsedTime function now handles synchronization of the profiling events. The HIP adapter was also using the hipStreamDefault flag by default, whereas the CUDA adapter uses CU_STREAM_NON_BLOCKING; this was changed to match CUDA (commits 1-3).
  • Added an extra profiling stream to Queue, which is created only when profiling is enabled and is used to record EvQueued. This was necessary because EvQueued was previously recorded on the NULL stream, which might not be the best solution for HIP; see "HIP profiling submission time query returns weird values", intel/llvm#12904 (commit 4).

CUDA changes:

  • Added the same extra profiling stream to the CUDA adapter for consistency (commit 5).

intel/llvm CI: intel/llvm#13861
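
The "created only when profiling is enabled" behaviour described above can be sketched roughly as follows. This is a minimal illustration and not the adapter's actual code: `native_type` is aliased to `int` so the example runs without a GPU, and `createNativeStream`, `StreamsCreated`, and `getProfilingStream` are hypothetical stand-ins for the real driver call and queue accessors. Only `ProfStream`, `IsProfStreamCreated`, `ProfStreamFlag`, and `createProfilingStream` are names that appear in this PR.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical stand-in for a native stream handle (CUstream / hipStream_t).
using native_type = int;

// Hypothetical counter: how many times the fake "driver" created a stream.
static std::atomic<int> StreamsCreated{0};

// Hypothetical stand-in for the driver call that creates a stream.
static native_type createNativeStream() {
  return ++StreamsCreated; // hand back a fresh, non-zero handle id
}

struct Queue {
  // Stream used solely when profiling is enabled.
  native_type ProfStream{0};
  bool IsProfStreamCreated{false};
  std::once_flag ProfStreamFlag;

  // Creates the profiling stream on first use; later calls do nothing.
  void createProfilingStream() {
    std::call_once(ProfStreamFlag, [this] {
      ProfStream = createNativeStream();
      IsProfStreamCreated = true;
    });
  }

  // Hypothetical accessor: returns the stream, creating it if needed.
  native_type getProfilingStream() {
    createProfilingStream();
    return ProfStream;
  }
};
```

A queue that never records a profiling event never calls `createProfilingStream`, so it pays no cost for the extra stream; a queue that does profile gets exactly one stream regardless of how many events it records.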

@keyradical keyradical requested review from a team as code owners May 20, 2024 16:13
@github-actions github-actions bot added the cuda (CUDA adapter specific issues) and hip (HIP adapter specific issues) labels May 20, 2024
// Stream used solely when profiling is enabled
native_type ProfStream;
bool IsProfStreamCreated{false};
std::once_flag ProfStreamFlag;

@hdelan hdelan May 22, 2024

Do we need to keep this in the queue struct? Or could we have it local to createProfilingStream?

@keyradical keyradical (Contributor, Author) replied:

For some reason, having static std::once_flag local to createProfilingStream was breaking some tests, not sure why...

Looks like this worked now.
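
The thread does not say why the function-local static version failed, so the following is background on general C++ behaviour and not a diagnosis. A `static std::once_flag` inside a function is shared by every caller for the lifetime of the process, whereas a member flag is per object. A minimal sketch with hypothetical names (`DemoQueue`, `createWithStaticFlag`, `createWithMemberFlag`):

```cpp
#include <cassert>
#include <mutex>

// Hypothetical queue type, reduced to the fields this sketch needs.
struct DemoQueue {
  bool HasStream{false};
  std::once_flag Flag; // one flag per queue
};

// Variant A: function-local static flag, shared by every queue.
// Only the first queue ever passed in gets its stream "created".
void createWithStaticFlag(DemoQueue &Q) {
  static std::once_flag SharedFlag;
  std::call_once(SharedFlag, [&Q] { Q.HasStream = true; });
}

// Variant B: per-queue flag. Each queue is initialized once, independently.
void createWithMemberFlag(DemoQueue &Q) {
  std::call_once(Q.Flag, [&Q] { Q.HasStream = true; });
}
```

With variant A, a second queue calling the function finds the shared flag already set and skips initialization; with variant B, every queue runs its own initialization exactly once.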

hdelan commented May 22, 2024

Also, I wouldn't mind if ProfStream had a different name, maybe something like HostSubmitTimeStream, with a comment to explain what it is used for.

@keyradical keyradical force-pushed the RefactorHIPBaseEvent branch 2 times, most recently from 178dc5e to d0cc077 Compare May 23, 2024 10:25

@hdelan hdelan left a comment

Good stuff. LGTM

@keyradical keyradical force-pushed the RefactorHIPBaseEvent branch from ffc00a1 to d082057 Compare May 24, 2024 14:23
@keyradical keyradical added the ready to merge (Added to PR's which are ready to merge) label May 27, 2024
@kbenzie kbenzie merged commit 42c0b02 into oneapi-src:main Jun 5, 2024
steffenlarsen added a commit to steffenlarsen/llvm that referenced this pull request Jun 24, 2024
oneapi-src/unified-runtime#1634 is believed to
have fixed the issues for HIP in the profiling tag extension. This
commit reenables the tests for HIP.

Closes intel#12904.

Signed-off-by: Larsen, Steffen <[email protected]>