Postpone adding globals until dynamic linking #154

@lukewagner

Split off from #139: I was thinking that perhaps globals should be removed from the MVP and instead go in with the dynamic linking feature. The reasoning is that, until we have to worry about dynamic linking, we can simply place globals in the heap at compiler-chosen static offsets. This is what we'll need to do anyway for any global whose address is taken, and it is what Emscripten already does for most globals to avoid JS engine limits on the number of closure variables. Technically, globals have the advantage that they can't be aliased by loads/stores, but I don't expect this to win much in practice (especially assuming a C++ compiler has already optimized the code).
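As a rough sketch of the static-offset approach (written in Python, modeling the heap as a byte array; the offsets and helper names are made up for illustration, not a proposed encoding):

```python
import struct

# The linear heap, modeled as a flat byte array.
heap = bytearray(64 * 1024)

# Hypothetical static offsets a compiler might assign to two C globals:
#   int counter;   int limit;
COUNTER_ADDR = 8
LIMIT_ADDR = 12


def load_i32(addr):
    """Plain heap load of a little-endian signed 32-bit integer."""
    return struct.unpack_from("<i", heap, addr)[0]


def store_i32(addr, value):
    """Plain heap store of a little-endian signed 32-bit integer."""
    struct.pack_into("<i", heap, addr, value)


def increment_counter():
    """Equivalent of `return ++counter;` with `counter` living in the heap."""
    store_i32(COUNTER_ADDR, load_i32(COUNTER_ADDR) + 1)
    return load_i32(COUNTER_ADDR)


def address_of_counter():
    """Equivalent of `&counter`: taking the address is just the offset."""
    return COUNTER_ADDR
```

Note that a store through `address_of_counter()` is visible to `increment_counter()`, which is exactly the aliasing that a separate global space would rule out.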

Thinking forward to dynamic linking: since we have to deal with aliased globals anyway, it seems simpler and more orthogonal (not requiring a separate set of LoadGlobal/StoreGlobal ops) to have globals declare immutable pointers (i.e., integers) that would then be used as arguments to plain LoadHeap/StoreHeap ops. So LoadGlobal(a) => LoadHeap(Global(a)), where Global(a) is a const-expr of int32/int64 type. Because the address is a constant expression, even trivial backends should have no problem eliminating bounds checks.
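To make the LoadGlobal(a) => LoadHeap(Global(a)) rewrite concrete, here is a minimal sketch in Python. The table contents, the 4-byte width, and the helper names are assumptions for illustration; the one-time range check stands in for what a backend could do once the addresses are known constants.

```python
import struct
from types import MappingProxyType

heap = bytearray(1024)

# Global(a): an immutable name -> heap address table, fixed before code runs.
# The names and addresses here are hypothetical.
GLOBAL_ADDR = MappingProxyType({"a": 64, "b": 68})


def check_globals(heap_size, width=4):
    """Validate every global address once, so later accesses need no check."""
    for name, addr in GLOBAL_ADDR.items():
        if addr < 0 or addr + width > heap_size:
            raise ValueError("global %r at %d is outside the heap" % (name, addr))


def load_heap(addr):
    return struct.unpack_from("<i", heap, addr)[0]


def store_heap(addr, value):
    struct.pack_into("<i", heap, addr, value)


def load_global(name):
    # LoadGlobal(a) => LoadHeap(Global(a))
    return load_heap(GLOBAL_ADDR[name])


def store_global(name, value):
    # StoreGlobal(a, v) => StoreHeap(Global(a), v)
    store_heap(GLOBAL_ADDR[name], value)


check_globals(len(heap))
```

Only the plain heap ops touch memory; the global layer is just a constant lookup, so no separate set of global ops is needed.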

With this strategy, the question is of course where the memory pointed to by the global pointers comes from: the engine has no knowledge of how the memory in the [0, sbrk-max) range is being used by malloc et al. There is a lot of room for design here, but I think roughly what we need to do is let the application allocate the data and give it to the engine (either directly or by registering a global data allocator with the runtime). The important thing is that we keep allocation (and addresses) deterministic and under application control so that applications can do smart things with their address space (shadow stacks, emulating MAP_FIXED, etc.).
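One possible shape for the "register a global data allocator" option, sketched in Python. Everything here (`Runtime`, `register_global_data_allocator`, `load_module`, `BumpAllocator`) is a hypothetical name invented for this sketch; the point is only that the application, not the engine, decides where each global lives.

```python
class BumpAllocator:
    """Application-side allocator: hands out addresses from a fixed region."""

    def __init__(self, base, limit, align=8):
        self.top = base
        self.limit = limit
        self.align = align

    def __call__(self, size):
        # Round the current top up to the alignment boundary.
        addr = (self.top + self.align - 1) & ~(self.align - 1)
        if addr + size > self.limit:
            raise MemoryError("global data region exhausted")
        self.top = addr + size
        return addr


class Runtime:
    """Stand-in for the engine: it asks the application where globals go."""

    def __init__(self):
        self._allocator = None

    def register_global_data_allocator(self, allocator):
        self._allocator = allocator

    def load_module(self, global_sizes):
        """Resolve each declared global (name -> size in bytes) to an address."""
        if self._allocator is None:
            raise RuntimeError("no global data allocator registered")
        return {name: self._allocator(size) for name, size in global_sizes.items()}


runtime = Runtime()
runtime.register_global_data_allocator(BumpAllocator(base=1024, limit=2048))
addresses = runtime.load_module({"a": 4, "b": 8, "c": 1})
```

Since the allocator is ordinary application code with no hidden engine state, loading the same modules in the same order yields the same addresses on every run.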

Lastly, I think this same strategy applies to TLS variables: TLS variables would be pointers into the heap and we'd need a way for the user application to allocate the memory pointed to by these TLS variables. That allocation has to happen both when new threads are created and, for each existing thread, when new modules with TLS variables are loaded; that's why I feel like these two issues are related and can have symmetric solutions.
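The symmetry between the two TLS allocation triggers can be sketched as follows (Python; `TlsRegistry` and its method names are hypothetical, and the allocator is assumed to be supplied by the application as a size -> address function):

```python
class TlsRegistry:
    """Tracks one application-allocated TLS block per (thread, module) pair."""

    def __init__(self, allocate):
        self._allocate = allocate  # application-supplied: size -> heap address
        self._modules = {}         # module name -> TLS block size in bytes
        self._threads = []         # known thread ids, in creation order
        self._blocks = {}          # (thread, module) -> block base address

    def on_thread_created(self, thread):
        # New thread: allocate a block for every module already loaded.
        self._threads.append(thread)
        for module, size in self._modules.items():
            self._blocks[(thread, module)] = self._allocate(size)

    def on_module_loaded(self, module, size):
        # New module: allocate a block for every thread that already exists.
        self._modules[module] = size
        for thread in self._threads:
            self._blocks[(thread, module)] = self._allocate(size)

    def tls_address(self, thread, module, offset):
        # A TLS variable is a per-thread base plus a static offset in the block.
        return self._blocks[(thread, module)] + offset
```

Both triggers fill the same (thread, module) table through the same application allocator, which is the sense in which the two cases mirror each other.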
