LanderlYoung
Contributor

@LanderlYoung LanderlYoung commented Aug 21, 2025

  1. fix symbol map line parse
  2. unescape ascii code
  3. update tests: test/runner "other.test_emsymbolizer*"

This PR fixes #24982

@LanderlYoung LanderlYoung force-pushed the bufix/emsymbolizer_symbolmap branch 3 times, most recently from 549ccad to d792fa6 Compare August 21, 2025 11:41
@LanderlYoung LanderlYoung changed the title fix #24982: emsymbolizer failed to parse symbol map from C++ project Fix #24982: emsymbolizer failed to parse symbol map from C++ project Aug 21, 2025
Member

@dschuff dschuff left a comment

Thanks!

Member

@kripken kripken left a comment

lgtm otherwise

@sbc100
Collaborator

sbc100 commented Aug 25, 2025

Is there some reason we cannot or should not just update the symbol map to avoid this mangling in the first place?

What is the source of the mangling in the first place? What style of mangling is std::out_of_range::~out_of_range\28\29 ?
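For readers following the thread, here is a minimal sketch of what decoding that name could look like. It assumes, based only on the example above, that each escape is a backslash followed by exactly two hexadecimal digits giving a character code (`\28` is `(` and `\29` is `)` in ASCII). The helper name `unescape_hex` is hypothetical and is not taken from the PR.

```python
import re

# Assumption: an escape is a backslash followed by exactly two hex digits.
_HEX_ESCAPE = re.compile(r'\\([0-9a-fA-F]{2})')


def unescape_hex(name: str) -> str:
    """Replace each \\XX escape with the character whose code is 0xXX."""
    return _HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), name)


print(unescape_hex(r'std::out_of_range::~out_of_range\28\29'))
# prints: std::out_of_range::~out_of_range()
```

Names without escapes pass through unchanged, so the helper is safe to apply to every entry.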

@dschuff
Member

dschuff commented Aug 25, 2025

I don't know what kind of mangling that is; it doesn't exactly match others that I'm familiar with.
My assumption was that there are users who depend on the contents of the symbol map and would have some kind of workflow that would break if we were to change it. Maybe I'm wrong and we could just do that. Or maybe it's less of a breaking change to remove the special mangling than to go to a fully-C++-mangled state: presumably users who depend on the current format have their own demangling, so if we just stopped adding the mangling, maybe they wouldn't be broken?

@sbc100
Collaborator

sbc100 commented Aug 25, 2025

> I don't know what kind of mangling that is; it doesn't exactly match others that I'm familiar with. My assumption was that there are users who depend on the contents of the symbol map and would have some kind of workflow that would break if we were to change it. Maybe I'm wrong and we could just do that. Or maybe it's less of a breaking change to remove the special mangling than to go to a fully-C++-mangled state: presumably users who depend on the current format have their own demangling, so if we just stopped adding the mangling, maybe they wouldn't be broken?

I'd be tempted to change the symbol map to include fully demangled symbols. @kripken do you remember why we have these strange escape codes in the map file?

@kripken
Member

kripken commented Aug 26, 2025

I'm not sure. But isn't the symbol map just copying the name section? It should contain whatever is there iirc.

@sbc100
Collaborator

sbc100 commented Aug 26, 2025

So maybe the issue is that the name section is broken?

@sbc100
Collaborator

sbc100 commented Aug 26, 2025

> So maybe the issue is that the name section is broken?

The name section is fine. The problem seems to stem from the wasm-opt --print-function-map command. I guess binaryen doesn't like those symbols.

I think we should file a binaryen bug, or maybe just avoid binaryen for this purpose (since we don't need to process the whole wasm file, only the name section).
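As an illustration of why the whole file need not be processed, the sketch below reads only the function-name subsection of the `name` custom section, following the layout in the WebAssembly binary format (custom section id 0, subsection id 1 for function names, LEB128-encoded sizes). It is a standalone sketch under those assumptions, not the approach taken in this PR or in #25053, and the function names `read_leb_u32` and `read_function_names` are hypothetical.

```python
import io
from typing import BinaryIO, Dict


def read_leb_u32(stream: BinaryIO) -> int:
    """Decode one unsigned LEB128 integer from the stream."""
    result = 0
    shift = 0
    while True:
        byte = stream.read(1)[0]
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result
        shift += 7


def _read_name_subsections(stream: BinaryIO, end: int) -> Dict[int, str]:
    """Collect function names (subsection id 1), skipping other subsections."""
    names: Dict[int, str] = {}
    while stream.tell() < end:
        sub_id = stream.read(1)[0]
        sub_end_offset = read_leb_u32(stream)
        sub_end = stream.tell() + sub_end_offset
        if sub_id == 1:
            for _ in range(read_leb_u32(stream)):
                index = read_leb_u32(stream)
                length = read_leb_u32(stream)
                names[index] = stream.read(length).decode('utf-8')
        stream.seek(sub_end)
    return names


def read_function_names(wasm: bytes) -> Dict[int, str]:
    """Return {function index: name} from the 'name' custom section, if any."""
    stream = io.BytesIO(wasm)
    if stream.read(4) != b'\0asm':
        raise ValueError('not a wasm binary')
    stream.read(4)  # skip the version field
    while stream.tell() < len(wasm):
        section_id = stream.read(1)[0]
        size = read_leb_u32(stream)
        end = stream.tell() + size
        if section_id == 0:  # custom section: starts with its own name
            name_len = read_leb_u32(stream)
            if stream.read(name_len) == b'name':
                return _read_name_subsections(stream, end)
        stream.seek(end)  # skip every other section without decoding it
    return {}
```

Every section other than `name` is skipped by its declared size, so the code section is never decoded.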

@kripken
Member

kripken commented Aug 26, 2025

@sbc100 do you want this PR to wait until we figure that out, or can this land?

@sbc100
Collaborator

sbc100 commented Aug 26, 2025

> @sbc100 do you want this PR to wait until we figure that out, or can this land?

I don't think we should land this, since it would only act to further lock in the strange escaping behavior.

Collaborator

@sbc100 sbc100 left a comment

I think we should go with #25053 instead

@kripken
Member

kripken commented Aug 26, 2025

Oh, I think the first part of this PR is still important though:

> fix symbol map line parse

That handles splitting when the line contains :: as a separator.

We can refocus this PR on just that?
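To make the line-parsing point concrete, here is a minimal sketch, assuming each symbol map line has the form `index:name` and that the name itself may contain `::`. Splitting only on the first colon keeps the name intact. The helper `parse_symbol_map_line` is hypothetical and is not the code from this PR.

```python
from typing import Tuple


def parse_symbol_map_line(line: str) -> Tuple[int, str]:
    """Split 'index:name' on the first colon only, so '::' in the name survives."""
    index, name = line.rstrip('\n').split(':', 1)
    return int(index), name


print(parse_symbol_map_line('12:std::out_of_range::~out_of_range\\28\\29\n'))
# prints: (12, 'std::out_of_range::~out_of_range\\28\\29')
```

A plain `line.split(':')` would instead produce many fragments for a C++ name, which is presumably the failure reported in #24982.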

@sbc100
Collaborator

sbc100 commented Aug 26, 2025

> Oh, I think the first part of this PR is still important though:
>
> > fix symbol map line parse
>
> That handles splitting when the line contains :: as a separator.
>
> We can refocus this PR on just that?

Ah, yes that could make sense.

@LanderlYoung LanderlYoung requested a review from sbc100 August 27, 2025 03:17
@LanderlYoung LanderlYoung force-pushed the bufix/emsymbolizer_symbolmap branch from 3a59a68 to b881a2e Compare August 27, 2025 03:30
@LanderlYoung
Contributor Author

Thanks guys! The name escape part has been reverted. Please review 🌹

Member

@dschuff dschuff left a comment

Looks good overall, just a couple of nits on the test.

@LanderlYoung LanderlYoung force-pushed the bufix/emsymbolizer_symbolmap branch from af71e60 to b2ffeb2 Compare August 28, 2025 06:32
@LanderlYoung
Contributor Author

Some other.test_codesize* tests seem to fail, but I couldn't find any relation to this code change.

Member

@dschuff dschuff left a comment

If you merge the PR to the tip of the main branch, the code size test failures should go away. And you'll have to do that anyway, since you have a conflict.

@LanderlYoung LanderlYoung force-pushed the bufix/emsymbolizer_symbolmap branch 3 times, most recently from 1892db6 to 0c6cba1 Compare September 6, 2025 06:23
@LanderlYoung LanderlYoung force-pushed the bufix/emsymbolizer_symbolmap branch from 0c6cba1 to 253e27d Compare September 6, 2025 06:26
@sbc100
Collaborator

sbc100 commented Sep 8, 2025

Thanks for your patience on this PR. LGTM with one final comment!

@LanderlYoung LanderlYoung force-pushed the bufix/emsymbolizer_symbolmap branch from 11dff9f to 0cbf38d Compare September 9, 2025 12:19
@dschuff
Member

dschuff commented Sep 9, 2025

Browser test failures are unrelated; I'll go ahead and merge this. Thanks!

@dschuff dschuff merged commit 75b6166 into emscripten-core:main Sep 9, 2025
28 of 31 checks passed
Development

Successfully merging this pull request may close these issues.

emsymbolizer failed to parse symbol map from C++ project
4 participants