[mono] Fix read of image_info while raising image_unload event #67138
Conversation
|
Might be better to call image_unloaded earlier instead of calling it on a basically freed structure. |
|
I'm not sure what the exact distinction between |
|
@lateralusX @josalem What's the difference between the I think |
|
I would also not be opposed to a new MonoProfiler |
|
I guess the main relationship between the events is that one should come before the other in the stream. The second thing is that they can be used to get an idea of how long it takes to unload something, so putting them closer together will reduce the ability to calculate that time, but I don't think that is super important. If the Mono profiler API hands over something as part of the callback, it needs to be live data, or NULL:ed out to indicate that it is not available. |
|
For the unload profiler event it looks like a lot of things can be freed up in the image struct, and most logic in |
I think the problem is the design of the profiler API: if it passes a data structure as part of its API, that needs to be accessible or NULL:ed out; if not, it should use a different API altogether, not passing a |
Sounds good to me @stefan-sf-ibm could you update this PR to move the |
|
image_unloaded should probably come after other events caused by the unloading, i.e. references to other images are dropped causing them to be unloaded as well. |
90d0ac1 to
fe59d6b
Compare
|
Pushed the unloaded event upwards, which means it is now pretty close to the unloading event. Anyhow, it should come before |
|
Looks like mono_image_close_except_pools_all will trigger more unload events, so if we are going to capture the right order of unload events, it looks like it can't move much at all. |
|
So what image->assembly (just converting it to uint64_t) and for the EventPipe unloaded event, we are pretty much only using: the pointer to the image itself, which should still be valid. So one option could be to make sure that at least these fields are still available when we do the profiler callback. |
|
So maybe delay parts of this, runtime/src/mono/mono/metadata/image.c Line 2165 in 58fa90b
|
|
I think after runtime/src/mono/mono/metadata/image.c Line 2163 in 58fa90b
pointer runtime/src/mono/mono/metadata/image.c Line 2165 in 58fa90b
probably won't help. As you already proposed, it might be better to just change the API. I gave it a try in the following commit and basically copied since the guid is required by Benefit would be that the time between the unloading and unloaded events would have a meaning and, besides that, the order of further events, possibly triggered by Any thoughts about this approach? I'm not sure whether it is worthwhile to change the API for all other image related events like |
fe59d6b to
5063024
Compare
|
The problem with changing the existing API is that it is public, so that would break other profiler providers. If we go down that route, we will need to add a new API for the unloaded event, aligned with what @lambdageek suggested above, and then let the EventPipe profiler provider hook up to the new API instead of the old one. The profiler still needs to fire both events (inexpensive if there are no listeners). If we change to this route, we should add a new API that takes more details about the image, probably all the data needed by An alternative that simplifies the profiler API is to add one new image unloading/unloaded API with an enum telling whether it is unloading or unloaded: the first call, for unloading, passes in the Image * and takes back a void * returned from the profiler provider; the second call passes NULL as the Image and hands the void * back to the profiler provider. That way, the provider can implement the logic for the data it would like to preserve between calls and free it on the second call. Putting both into one API prevents an implementer from subscribing to only one of them, since both are needed. Alternatively, we could have two new APIs, and then it is up to the provider to subscribe to both if it returns data that should go into the second call. If we add two new APIs, it could look like this: mono_image_unloading_v2(size_t id, Image *, void **); and we can keep all logic around what data needs to be preserved in EventPipe.
runtime/src/mono/mono/eventpipe/ep-rt-mono.c Line 3146 in 0e35b8d
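To make the two-callback idea concrete, here is a rough, self-contained sketch of how a provider could carry its own data from the unloading call to the unloaded call through a void ** slot. Everything in it is hypothetical: DemoImage, DemoUnloadState, demo_image_unloading, demo_image_unloaded and demo_unload_image are illustration-only names, not Mono profiler API, and only the (size_t id, Image *, void **) shape of the first callback is taken from the comment above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the runtime's image type (not the real MonoImage). */
typedef struct {
    char *name;
} DemoImage;

/* Data the provider chooses to keep alive between the two callbacks. */
typedef struct {
    uint64_t image_id;
    char *name_copy;
} DemoUnloadState;

/* Last event the provider "emitted", recorded so it can be inspected. */
static char demo_last_event[128];

/* First callback: the image is still fully valid, so copy what is needed later. */
static void
demo_image_unloading (size_t id, DemoImage *image, void **state)
{
    (void)id;
    DemoUnloadState *s = calloc (1, sizeof (DemoUnloadState));
    *state = s;
    if (!s)
        return;
    s->image_id = (uint64_t)(uintptr_t)image;
    s->name_copy = malloc (strlen (image->name) + 1);
    if (s->name_copy)
        strcpy (s->name_copy, image->name);
    snprintf (demo_last_event, sizeof demo_last_event, "unloading:%s", image->name);
}

/* Second callback (assumed signature): no image is passed, only the preserved state. */
static void
demo_image_unloaded (size_t id, void **state)
{
    (void)id;
    DemoUnloadState *s = *state;
    if (!s)
        return;
    snprintf (demo_last_event, sizeof demo_last_event, "unloaded:%s",
              s->name_copy ? s->name_copy : "");
    free (s->name_copy);
    free (s);
    *state = NULL;
}

/* Runtime side: fire unloading, tear the image down, then fire unloaded. */
static void
demo_unload_image (DemoImage *image)
{
    void *state = NULL;
    demo_image_unloading (0, image, &state);
    free (image->name);
    image->name = NULL;
    demo_image_unloaded (0, &state);
}
```

With this shape the unloaded event never touches the image, so the runtime is free to release fields in any order between the two calls, and the time between the events keeps its meaning.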
|
|
Right I somehow managed to neglect the fact that this is a public API. Sorry for the noise. I will restore the old commit and delay freeing of |
5063024 to
cb28f00
Compare
|
Failure in debugger tests might be related, https://dev.azure.com/dnceng/public/_build/results?buildId=1688158&view=logs&j=2cb5cbbb-10b2-57f5-cb06-88ab02df11cd&t=8e3b5183-1166-552f-0d9e-c49b4956b6ef |
|
I cannot reproduce the failure. Running the testsuite locally I get Interestingly the local summary mentions far more test cases than the one from CI. Just to be sure: Is this the correct test suite |
|
It could be a flaky test. I just saw that the test seemed to have code that read out the image GUID and did some lazy loading: [xUnit.net 00:02:53.57] DebuggerTests.BreakpointTests.DebugHotReloadMethodUnchanged [FAIL] It's in src/mono/wasm/debugger/DebuggerTestSuite/BreakpointTests.cs, so it should be part of the test suite you run locally. @thaystg Any thoughts on the failure above on this PR, or is it just a flaky test? The PR sets the image GUID to NULL after freeing it, rather than just freeing it and keeping a dangling pointer in the Image * type. The test seems to do some lazy assembly loading and at least accesses the image GUID, but it looks like the raw binary version and not the string version, so it should be unrelated. |
|
/azp run runtime-wasm |
|
Azure Pipelines successfully started running 1 pipeline(s). |
cb28f00 to
f091933
Compare
|
Rebased on main, and temporarily removed setting image->guid to NULL to investigate whether the WASM tests still fail. |
|
All tests pass when image->guid is not set to NULL. Let's revert that back and see the test results. |
While raising the image_unload event, ep_rt_mono_write_event_module_unload is called, which itself calls get_module_event_data, which accesses image->image_info. Thus set image->image_info to NULL after free.
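As an illustration of the pattern only (this is not the actual patch), a minimal sketch of "free, then NULL" could look like the following. DemoImage, DemoImageInfo and the demo_* functions are hypothetical stand-ins rather than the real MonoImage layout or runtime functions, and the sketch assumes the reader checks for NULL before dereferencing.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the runtime's types (not the real MonoImage layout). */
typedef struct {
    char *module_path;
} DemoImageInfo;

typedef struct {
    DemoImageInfo *image_info;
} DemoImage;

/* Allocate an image whose image_info owns a copy of the given path. */
static DemoImage *
demo_image_new (const char *path)
{
    DemoImage *image = calloc (1, sizeof (DemoImage));
    assert (image);
    image->image_info = calloc (1, sizeof (DemoImageInfo));
    assert (image->image_info);
    image->image_info->module_path = malloc (strlen (path) + 1);
    assert (image->image_info->module_path);
    strcpy (image->image_info->module_path, path);
    return image;
}

/* Reader used while raising an event: tolerates image_info already being gone. */
static const char *
demo_get_module_path (const DemoImage *image)
{
    if (!image || !image->image_info)
        return "";
    return image->image_info->module_path;
}

/* Free image_info and clear the pointer so later readers see NULL, not freed memory. */
static void
demo_free_image_info (DemoImage *image)
{
    if (!image->image_info)
        return;
    free (image->image_info->module_path);
    free (image->image_info);
    image->image_info = NULL;
}
```

Clearing the pointer also makes a second call to the free function a harmless no-op, whereas a dangling pointer would turn it into a double free.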