Skip to content

Conversation

0x12CC
Copy link
Contributor

@0x12CC 0x12CC commented Sep 26, 2025

This PR extends Clang's source-based code coverage to work with SYCL device code. It includes the following changes:

  1. The InstrProfilingLoweringPass was updated to lower profiling counters to SYCL device globals.
  2. A new function was added to the compiler runtime to increment the host-side profiling counters.
  3. The SYCL runtime was updated to send the contents of the device-side profiling counters to the new compiler runtime function when their device global map entries are being freed.

This feature may not work correctly for functions that differ between host and device due to the use of the __SYCL_DEVICE_ONLY__ macro. In such cases, it may not be possible to correlate the profiling counters from the device to the host. Resolves #7803.

Signed-off-by: Michael Aziz <[email protected]>
clang-offload-extract
clang-offload-packager
clang-linker-wrapper
compiler-rt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do other projects enable profiling unconditionally in the build too?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only remaining question from me, the rest is 🔥 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know of any projects that enable profiling unconditionally. This change enables support for coverage instrumentation in our build but does not enable the instrumentation itself. I think it makes sense to do this unconditionally so that the related functionality in the SYCL runtime can be tested as part of our regular end-to-end testing process.

I'd like to add a build flag for building the SYCL runtime with instrumentation enabled. This flag would allow us to produce coverage reports and identify areas in the runtime that are not covered by tests. I plan to do this in a separate PR and make the instrumentation disabled by default.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense to do this unconditionally so that the related functionality in the SYCL runtime can be tested as part of our regular end-to-end testing process.

I'd argue that that would be a reason to enable it in configure.py under --ci-defaults, but I'd still think that this unconditional dependency here is too much.

That said, if others are ok, I'm not going to block PR on this.

@aelovikov-intel
Copy link
Contributor

Would it make sense to add a high-level design document here as well (unless one exists already). Also, what about kernels using optional features? Are we merging info collected by different device images somehow? That needs to be explained either in some doc or at the PR title.

@0x12CC
Copy link
Contributor Author

0x12CC commented Sep 26, 2025

Would it make sense to add a high-level design document here as well (unless one exists already).

I think this makes sense. I'll add a document to sycl/doc/design.

Also, what about kernels using optional features?

I'm not sure I understand the question. If a kernel uses some features that prevent it from being submitted to a device, I would expect the corresponding profiling counters to contain zero values and not change the resulting coverage report.

Are we merging info collected by different device images somehow? That needs to be explained either in some doc or at the PR title.

The profiling counters collected from any device image are added to the host's profiling counters. If there are multiple device images that share device code, each one will increase the same profiling counters on the host. The assumption is that the host and device compilers instrument the same functions so it's possible to copy the counter values from the device to the host. The resulting profile output contains the values of the host profiling counters at the end of the program's execution. I'll make this clear in the design document.

Copy link
Contributor

@srividya-sundaram srividya-sundaram left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Driver changes LGTM

Copy link
Contributor

@maarquitos14 maarquitos14 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but ping me when the design doc is ready for another round :)

Signed-off-by: Michael Aziz <[email protected]>
Signed-off-by: Michael Aziz <[email protected]>
0x12CC and others added 2 commits September 29, 2025 13:32
0x12CC and others added 2 commits September 29, 2025 13:33
Signed-off-by: Michael Aziz <[email protected]>
Copy link
Contributor

@steffenlarsen steffenlarsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! ⭐

Copy link
Contributor

@maarquitos14 maarquitos14 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Signed-off-by: Michael Aziz <[email protected]>
@0x12CC 0x12CC requested a review from bader October 6, 2025 15:21
@0x12CC
Copy link
Contributor Author

0x12CC commented Oct 6, 2025

@bader, @aelovikov-intel, could you please review the updated changeset and let me know if any other changes are required?

Copy link
Contributor

@bader bader left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

compiler-rt/lib/profile/InstrProfilingRuntime.cpp changes look good to me.

@aelovikov-intel
Copy link
Contributor

@bader, @aelovikov-intel, could you please review the updated changeset and let me know if any other changes are required?

@steffenlarsen 's approval is enough, I wasn't going to perform a detailed review :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature request: Support for code coverage
8 participants