Skip to content

Simplify handling of unreachable to match wasm #3062

@kripken

Description

@kripken

Historically Binaryen has had an "unreachable" type for nodes, that basically means "is not exited from normally." This made sense in pre-MVP wasm which effectively had such a type. It's useful for patterns like

if ( .. )
  i32
else
  br somewhere

One arm has an i32, the other is "unreachable" - it never exits normally. So we can easily see that the if is valid, and has type i32.

However, wasm changed before the MVP to remove that type. Binaryen didn't switch because we thought it was better for code transformations to keep the original method.

I've been thinking for a while that maybe that wasn't the optimal decision. The main downsides are that unreachability has been a source of confusion, both in itself and that it's different from wasm. If we do what wasm does, it would still be confusing, but at least it would be the same confusion we have to deal with anyhow.

Performance also concerned me. The unreachable type is efficient in that a pass can just ask "is the left arm unreachable?" to see if something doesn't exit normally - it doesn't need to scan the internals. But in practice we end up running ReFinalize in places to propagate unreachability out - which scans the entire function. I'm not sure we're actually saving work, and it is more complicated.

Another benefit of making this change is that it could make Poppy IR simpler to implement (#3059) just by reducing complexity around unreachable. It could also make it easier to adopt wasp as mentioned there, because we wouldn't need to handle differences between wasm and Binaryen IR on unreachability.

If we make this change, Binaryen IR would still not be identical to wasm. But only one big difference would remain, the stack machine / structured IR issue (which I do not think we should change).

The actual technical changes would be:

  • There is no more unreachable type. Instead, a type is like in wasm in that it indicates the value(s) provided. If we are unreachable, then the type is none, which looks the same as the case of being reachable but just having no output values. For example, return would always have type none (previously it was always unreachable), etc.
  • Passes that care about unreachability can no longer just look at the type. We should add a helper function, isReached or exitsNormally or such (that would require traversing into the node). grep for the unreachable type should find the relevant passes for this.
  • Get rid of ReFinalize, which we no longer need. We may also be able to remove a lot of calls to finalize() as they are often just for unreachability, but that's not urgent or necessary.

It's hard to estimate how much work this would be, since unreachability is complex. It's more than just a few days. Maybe 1-2 weeks? That's significant effort, but I think it will pay for itself in the long run. And at least it's much easier to do today because we have good automated fuzzing/reducing.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions