diff --git a/proposals/call-tags/Overview.md b/proposals/call-tags/Overview.md
index 66f3072..a1e18d4 100644
--- a/proposals/call-tags/Overview.md
+++ b/proposals/call-tags/Overview.md
@@ -4,183 +4,266 @@
## Summary
-Provides a way to create and call untyped functions (`funcref`) such that:
-* calls are efficient (just an additional push, pop, and bitwise equality check compared to a typed function call)
+Provides a way to create and call dynamically typed functions (`funcref`) such that:
+* calls are efficient (just an additional push, pop, and bitwise equality check compared to a typed function call, a cost that experiments found to add no measurable overhead)
* modules can restrict how their functions can be called indirectly (e.g. ensure no other module can indirectly call a particular function)
+* `funcref`s can handle *multiple* signatures, and unhandled signatures can have custom fall-back behavior
+* JavaScript functions can handle *multiple* signatures (rather than being coerced to a single predetermined signature)
* type abstraction and subtyping are respected
Furthermore, in some cases an engine can guarantee that an indirect call will necessarily stay within the same module instance (if it succeeds), enabling various optimizations.
-The `func_switch` extension adds the ability to efficiently use the same `funcref` to support many kinds of calls, including calls with different types of arguments and even different numbers of parameters and results.
+## Overview
-## Problem
+The `call_indirect` instruction specifies a function signature.
+At run time, this signature is checked against the signature of the function referenced by the provided `funcref`, and the call traps if they do not match.
+This provides a dynamically typed form of function calls.
-`call_indirect` imposes a number of challenges (each discussed in more detail below), both currently and with respect to how WebAssembly might evolve:
-1. Looking at a function's definition alone, it's not apparent if a function can be indirectly called via `call_indirect`.
-One has to do some analysis to check if a reference to that function is made.
-And once a function reference has been made, it can be very difficult to ensure even something as simple as the function is only accessed by its own module.
-This makes it difficult to optimize functions or to ensure they are accessed in a way that respects the module's intended invariants.
-2. If subtyping is added, the signature provided by the indirect caller can be compatible with that of the callee function *without* being identical.
-Unfortunately, structural comparison of signatures at run time is likely too expensive.
-[WebAssembly/function-references#33](https://github.com/WebAssembly/function-references/pull/33) addresses this by requiring signatures to match exactly, but implementers of languages using subtyping have indicated that such a restriction does not suit their needs (e.g. subclasses in Kotlin can refine the signature of superclass methods).
-3. To practically support type imports/exports, `call_indirect` needs to compare caller and callee signatures using the "run-time value" of imported/exported types.
-This can be exploited to convert imported/exported types to and from the type they are supposed to be abstracting.
-Thus `call_indirect` can be used to access information that would otherwise be concealed behind an abstract type, or to create values of an abstract type that would otherwise be unforgeable.
+Call tags are a generalization and extension of this construct.
+The instruction `call_with_tag $call_tag` takes a `funcref` and calls it with the specified call tag.
+The type of the call tag indicates the param and result types of the function.
+`call_indirect $table $functype` is simply the special case `call_with_tag (call_tag.canon $functype) (table.get $table)`, calling the `funcref` fetched from the specified table with the *canonical* call tag for a given function signature.
-## Solution
+Every `func` definition (in theory) has a corresponding `funcref` that is returned by `ref.func`.
+By default, this `funcref` handles the canonical call tag for the function's signature.
+But with this extension, a `func` definition can alternatively specify which call tags the `funcref` should handle (so long as their types are compatible with the function's signature).
+These can be canonical call tags for *multiple* type signatures (compatible with the function's given type signature), and they can include custom call tags either generated by (and specific to) this module instance or imported from other module instances.
+For example, if one specifies no canonical call tags and only non-exported call tags, then one can be guaranteed that the function is only indirectly called by this module instance, reducing the possibility of unintended backdoors.
-One way to implement `call_indirect` is to push all the arguments onto the stack and then to push a signature descriptor onto the stack; the callee then pops that descriptor and compares it for (bitwise) equality with the callee's own signature descriptor.
-If the two match, then the invariants guaranteed by the static type-checker ensure the stack has the arguments expected by the function (and similarly that the caller is expecting the values that will be returned by the function), so the function call proceeds.
-If they don't match, then it traps.
-(This is how SpiderMonkey currently implements `call_indirect`.)
+Notice that a `funcref` can handle multiple call tags, even multiple canonical call tags.
+This is because dynamic typing is implemented by the call*ee* rather than the call*er*, which experiments suggest can be implemented much more efficiently *in the case of function calls*.
+This also opens up the opportunity for a `funcref` to actually provide different behavior for different call tags.
+So this proposal includes a `func_switch ($call_tag $func)*` construct in the code section that creates a new `funcref` that (tail) calls the `$func` corresponding to the used call tag.
+[[ACFG2001]](#acfg2001) used this to implement Java interface methods and found that the technique typically makes interface-method dispatch as fast as class-method dispatch, and our recent experiments confirmed that this still holds on modern hardware.
-We can solve all the problems above by generalizing this process *call tags*.
-That is, a call tag is a generalization of these signature descriptors.
+Given all that, what happens when the specified call tag is *not* handled by the `funcref` at hand?
+For canonical call tags, the answer is simply that the program traps.
+But when creating call tags with `call_tag.new $functype` (in the Tags section), one can also specify a `$func` to use as its "fall-back" handler.
+This `$func` must have the same signature as `$functype` *except* also accepting an additional `funcref` so that we can pass the fall-back handler the specific `funcref` that did *not* recognize the call tag.
+If engines are careful with their calling conventions, the fall-back can be switched to with just a jump instruction, making it a very efficient backup.
+[[MT2021]](#mt2021) used this to implement a dynamically typed language and found it to work quite well.
-### Defining Call Tags
+Lastly, because responsibility for dynamic typing falls onto the callee, this proposal provides a simpler, efficient API for JavaScript interop.
+One can simply coerce `Callable` JavaScript values to `funcref`s, without specifying any type signature.
+When one of these JS-as-funcref values is called with a call tag, its JavaScript-interop handler is called.
+For canonical call tags, this handler simply performs the coercions of wasm values to JS values prescribed by the existing JS API for the function type at hand, calls the JS Callable with those JS values, and then similarly coerces the returned JS value to wasm values.
+For custom call tags, one can specify (in some manner to be determined) a custom JavaScript-interop handler.
+[[MT2021]](#mt2021) used roughly this approach to implement efficient interop between a Java-like language and a JavaScript-like language.
-Somewhere in one of the module's header sections, the module would declare call tags as in `call_tag.new $ct1 : [i32] -> [i32]`.
-This declaration establishes a new call tag `$ct1` that only the module has access to (unless it exports the tag), and which has associated signature `[i32] -> [i32]`.
-The module could also declare call tags as in `call_tag.canon $ct2 : [i32] -> [i32]` that defines `$ct2` to be the *canonical* call tag for the signature `[i32] -> [i32]`.
-This canonical call tag is one that all modules have access to and is uniquely determined by the signature. (Historical note for understanding comments below: `call_tag.canon` was formerly named `call_tag.get`.)
+## Design
-### Associating call tags
+### Tags
-When a module defines a function it can specify the call tag(s) that its associated `funcref` should accept.
-(By default, specifically the result of `dispatch.canon` on the signature is associated, making this backwards compatible.)
-So a module can explicitly specify no tags if it doesn't want the function to be implicitly indirectly callable, or some `new` tag if it only wants it to be indirectly callable with the module's own tag, or multiple tags if it wants to mix-and-match dispatching conventions.
-Of course, these associated call tags have to be compatible with the function's signature.
+In the Tags section, we add the following instructions for declaring/creating tags:
+* `call_tag.canon $functype` derives the canonical call tag of type `$functype` from the function type `$functype : [ti*] -> [to*]`, where all of `ti*` and `to*` are "concrete" (discussed [below](#abstraction))
+* `call_tag.new $functype $func?` generates a new call tag of type `$functype = [ti*] -> [to*]` optionally with `$func` as its fall-back handler, where `$func : [ti* funcref] -> [to*]`
-### Calling with call tags
+Similarly, one can import and export call tags.
-`call_indirect` would then have a variant, say `call_funcref $ct` that specifies what call tag `$ct` to use.
-If the call tag `$ct` has associated signature type `[t1*] -> [t2*]`, then `call_funcref $ct` has type `[t1* funcref] -> [t2*]`.
-Thus the provided inputs and expected outputs are type-checked to match the signature associated with the call tag.
-The current `call_indirect` becomes just a shorthand for first getting the `funcref` from the appropriate table and then using `call_funcref` with the call tag resulting from `call_tag.canon` on the expected signature.
+### Functions
-The execution of `call_funcref $ct` on a given `funcref` for some function succeeds when the call tag `$ct` is one of the function's specified call tags.
-So if the function specified just one call tag, then a simply bitwise-equality check is done on the two tags at run time.
-If a match is found, then this *implies* that the arguments are valid inputs to the function and that the returned values are acceptable outputs from the function.
-However, by making call tags explicit, it also becomes clear that compatibility of inputs and outputs *does not imply* the call will succeed because the call tags might (intentionally) not match even if they have the same signature.
+When defining a `func` of type `[ti*] -> [to*]`, one can optionally specify `(call_tag $call_tag*)`, where each `$call_tag`'s type must be a supertype of `[ti*] -> [to*]`.
+If so, the `funcref` returned by `ref.func` for this `func` handles exactly the call tags in `$call_tag*`.
+
+`func_switch` is a new way of defining functions (i.e. an alternative to `func`).
+The new function has no type and cannot be directly called, but we can get a `funcref` for it by using `ref.func`, with the expectation that it later gets called using `call_with_tag`.
+The grammar is `func_switch ($call_tag $func)* $func_switch?`, specifying essentially a switch statement that calls a `$func` if the given call tag matches the corresponding `$call_tag`.
+If no `$call_tag` matches, then the call tag and arguments are forwarded to `$func_switch` if one is specified; otherwise the fall-back handler of the call tag is (tail) called with the arguments.
+
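The dispatch rules in this section can be modeled outside of wasm. The following Python sketch is an illustration only, not part of the proposal: `CallTag`, `FuncRef`, `call_tag_canon`, and `call_with_tag` are hypothetical names mirroring the constructs above, object identity stands in for the bitwise equality check, and `$func_switch` forwarding is omitted for brevity.

```python
class Trap(Exception):
    """Stands in for a wasm trap."""


class CallTag:
    """A call tag. Tags are compared by identity, never by signature."""

    def __init__(self, functype, fallback=None):
        self.functype = functype  # e.g. "[i32] -> [i32]"; informational only here
        self.fallback = fallback  # optional handler: fallback(*args, funcref)


_CANON = {}


def call_tag_canon(functype):
    """Models call_tag.canon: one shared, fall-back-free tag per signature."""
    if functype not in _CANON:
        _CANON[functype] = CallTag(functype)
    return _CANON[functype]


class FuncRef:
    """Models a funcref as the set of call tags it handles."""

    def __init__(self, handlers):
        self.handlers = dict(handlers)  # CallTag -> Python callable


def call_with_tag(tag, funcref, *args):
    """Models call_with_tag: the callee, not the caller, checks the tag."""
    handler = funcref.handlers.get(tag)
    if handler is not None:
        return handler(*args)
    if tag.fallback is None:
        raise Trap("call tag not handled and no fall-back handler")
    # The fall-back receives the funcref that did not recognize the tag.
    return tag.fallback(*args, funcref)


# A function that opts out of the canonical tag and handles only a private one.
private_tag = CallTag("[i32] -> [i32]")
increment = FuncRef({private_tag: lambda x: x + 1})
```

In this model, `call_with_tag(private_tag, increment, 41)` runs the handler, while calling `increment` with the canonical tag for the same signature traps, because the function never listed that tag as handled.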
+### Instructions
+
+`call_with_tag $call_tag : [ti* funcref] -> [to*]`, where `$call_tag : [ti*] -> [to*]`, calls the given `funcref` with the specified call tag using the values on the stack as the arguments.
+
+`call_indirect_with_tag $table $call_tag : [ti* i32] -> [to*]` is shorthand for `(call_with_tag $call_tag (table.get $table))`.
+(And `call_indirect $table $functype` is now shorthand for `(call_with_tag (call_tag.canon $functype) (table.get $table))`.)
+
+`call_return_with_tag $call_tag : [ti* funcref] -> [to*]`, where `$call_tag : [ti*] -> [to*]`, tail calls the given `funcref` with the specified call tag using the values on the stack as the arguments, where `[to*]` is a subtype of the result type of the function containing this instruction.
+(This is for engines that support `return_call` from the [Tail Call](https://github.com/WebAssembly/tail-call) proposal.)
+
+## Implementation
+
+To implement `call_with_tag`, the arguments can be placed in registers and on the stack according to the calling convention prescribed by the call tag's signature type, except that the call tag's value takes the standard place of the first argument (which is moved to some remaining place).
+Then the code pointer in the `funcref` is called/jumped to.
+That code then switches on the value of the call tag, jumping-to/tail-calling the appropriate function while leaving all the arguments in their place except for moving the first argument.
+If the call tag is not recognized, then the code jumps-to/tail-calls the fall-back handler pointed to by the call tag, leaving all the arguments in their place but replacing the call-tag value with the value of the current `funcref`.
+
+[[MT2021]](#mt2021) implemented call tags in LLVM by fixing an argument size used by all call tags.
+If a call tag's arity exceeded the argument size, then the remaining arguments were stored on the stack, and the last argument was instead a pointer to this stack-allocated tuple.
+The overhead of this approach for large-arity call tags is unknown, because no call tag's arity exceeded the fixed argument size in their experiments.
+They did however measure the overhead of this LLVM-compilation approach for small-arity call tags compared to using typed function references, and they found that there was *no* overhead.
## Applications
### Security
-To see one way this can be useful for WebAssembly, suppose I am a security-conscious application.
-I want to be very conscious about all the entry points for my program. `call_indirect` is an opportunity for an unintended entry point.
-I can try to mitigate that by being careful about the function references that escape my program, but leaks happen, and it would be better to just not have these unintended entry points to begin with.
+Many applications want to limit the entryways into their code base.
+At the same time, common features rely on function references.
+At present, and especially with some proposed extensions like `anyref`, once a reference to a function is created it can be very difficult to statically ensure the function is only called from within the application.
+Call tags can easily provide applications with this assurance: rather than using canonical call tags, the application can use solely custom non-exported call tags.
+This guarantees that the only calls to a function that can be made through function references come from `call_with_tag` instructions within the module.
-One option is to make it possible to not have any call tags associated with any of my functions.
-However, that means even I can't `call_indirect` my own functions, which would be a huge problem for many C/C++ programs.
-So this proposal provides the option to use `call_tag.new` for every signature I use and associate each of my functions with the respective new tag.
-Then, so long as I don't explicitly export those call tags, I know I'm the only one who can `call_indirect` into my functions. Furthermore, if I have subcomponents of my program that are supposed to be separable, I can even make new tags at a finer granularity for each subcomponent to ensure that a `call_indirect` in one subcomponent cannot call into another subcomponent.
-Thus, the proposal makes it easier for me to restrict the scope of `call_indirect`, both from other modules and even within my own module.
+### Performance
-### Optimization
+There are a number of ways in which call tags enable more efficient implementations of features than those expressible with just typed function references.
-If a function has no associated call tags, then it is only directly callable, enabling many optimizations.
-Even if a function does have an associated call tag, if that tag is neither imported nor exported than the optimizer can guarantee that the function can only be successfully called from within the current module instance, and furthermore it can only can be indirectly called from occurrences of `call_funcref` using one of the associated call tags.
-Similarly, a `call_funcref` using a call tag that is neither imported nor exported eliminates the possibility that a successful call might require an module-instance switch.
+#### Interface-Method Dispatch
-### Subtyping
+Due to multiple inheritance of interfaces, interface-method dispatch can be quite challenging to implement efficiently.
+With only typed function references, you can only express interface tables.
+With interface tables, every object's (class) descriptor has an array of interfaces that it implements.
+To invoke a method from interface `Foo`, one scans this array for an entry corresponding to `Foo`, and then casts the entry to a structure that provides the typed function references to the object's implementations of each of `Foo`'s methods.
+This involves a loop and a substantial number of chained loads (you have to load each entry from the array, and then load the interface identifier from each entry).
+While flattening this array and embedding it directly into the object descriptor can improve performance substantially by ensuring the search is entirely in the L1 cache, the GC proposal does not support such advanced object representations, leading to a bunch of random loads.
-When a function specifies associated call tags, the types of those call tags need only be *compatible* with the functions signature.
-Compatibility here can easily incorporate subtyping.
-That is, so long as a call tag's input types are subtypes of the input types of the function, and the output types of the function are subtypes of the output types of the call tag, then an indirect call using that call tag is completely sound.
-This then supports the surface-level feature of various typed OO languages where a subclass/subinterface can refine the signature of a method to either accept a broader range of inputs or (more often) produce a more precise output.
+In light of these issues with interface tables, [[ACFG2001]](#acfg2001) developed a different implementation strategy, which they called interface-method tables.
+In terms of call tags, the strategy works by first associating a call tag and some non-negative integer smaller than, say, 19 with every method of every interface, spreading methods of the same interface as evenly across the 19 viable numbers as possible.
+In the object descriptor, there are then 19 (immutable) `funcref` fields.
+When creating the object descriptor for the class `Bar`, one assigns to the `funcref` field `i` a `func_switch` that switches on the call tags of the interface methods associated with index `i` that `Bar` implements, calling `Bar`'s corresponding implementation of each such interface method.
+To call an interface method with associated call tag `$baz` and index `k` on a given object, one gets the `k`th `funcref` from the object's descriptor and performs `call_with_tag $baz` on it.
+They found that, except in rare scenarios where a class implements many interface methods associated with the same index, this technique makes interface-method dispatch perform just as efficiently as class-method dispatch.
+Our more recent experiments found that this still holds on modern hardware.
-### Abstraction
+If `Bar` extends another class `Baz`, each of `Bar`'s `funcref`s can forward unhandled call tags to `Baz`'s corresponding `funcref`.
+This can be used to reduce duplication (so that `Bar` isn't also switching on all the interface methods whose implementations are inherited from `Baz`) and to support separate compilation (so that `Bar` does not need to have complete knowledge of all the interfaces that `Baz` implements).
+
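As a rough model of the interface-method-table strategy just described, the Python sketch below represents each of the 19 descriptor fields as a dictionary from call tag to implementation (playing the role of a `func_switch`) and models forwarding to the superclass with a `parent` link. The names `InterfaceMethod`, `Descriptor`, and `invoke_interface`, the example interface methods, and the slot assignments are all hypothetical; the `LookupError` stands in for whatever fall-back or trap the call tag would prescribe.

```python
NUM_SLOTS = 19  # example slot count taken from the text


class InterfaceMethod:
    """An interface method: a unique call tag (this object) plus a slot index."""

    def __init__(self, name, slot):
        assert 0 <= slot < NUM_SLOTS
        self.name = name
        self.slot = slot


class Descriptor:
    """Object descriptor with NUM_SLOTS funcref-like fields.

    Each field is modeled as a dict from call tag to implementation, playing
    the role of a func_switch; `parent` models forwarding unhandled call tags
    to the superclass's funcref in the same slot.
    """

    def __init__(self, impls, parent=None):
        self.fields = [{} for _ in range(NUM_SLOTS)]
        for method, impl in impls.items():
            self.fields[method.slot][method] = impl
        self.parent = parent


def invoke_interface(descriptor, method, receiver, *args):
    """Load field k, then switch on the call tag, forwarding on a miss."""
    current = descriptor
    while current is not None:
        impl = current.fields[method.slot].get(method)
        if impl is not None:
            return impl(receiver, *args)
        current = current.parent
    raise LookupError(f"{method.name} is not implemented by this class")


# Two methods deliberately share slot 3 to show the switch resolving a collision.
size = InterfaceMethod("Sized.size", slot=3)
close = InterfaceMethod("Closeable.close", slot=3)
run = InterfaceMethod("Runnable.run", slot=7)

baz = Descriptor({run: lambda this: f"Baz.run({this})"})
bar = Descriptor(
    {size: lambda this: f"Bar.size({this})", close: lambda this: f"Bar.close({this})"},
    parent=baz,
)
```

Here `Bar` only lists the methods it implements itself; a call to `run` on a `Bar` misses in `Bar`'s slot 7 and is forwarded to `Baz`, so `Bar` needs no knowledge of the interfaces `Baz` implements.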
+#### Signature Refinement
+
+Languages like Dart and Kotlin allow a `class Super { Object bar() {...} }` to have a subclass `class Sub extends Super { Bar bar() {...} }`, where the subclass refines the signature of `bar` in the superclass (in this case, taking advantage of covariance of return types).
+This poses a type-checking problem for wasm: for `Sub`'s refined v-table's type to be a subtype of `Super`'s v-table's type, one needs structural subtyping of typed function references, which unfortunately is undecidable with extensions that would be useful for eliminating superfluous casts [[P1992]](#p1992), such as existential types for eliminating casts of `this` on class/interface methods and/or universal types for supporting generics.
+
+With call tags, we can avoid the subtyping issue entirely because `funcref`s are *dynamically* typed and yet that dynamic typing comes at no performance cost.
+This means we can easily give call tags very expressive type signatures without running into decidability issues.
+
+But we run into a new problem: dynamic typing is performed using *equality* checks rather than *subtyping* checks.
+To address this, the compiler can generate both a call tag `$Super.bar : [] -> [(ref $Object)]` and a call tag `$Sub.bar : [] -> [(ref $Bar)]` whenever a subclass refines a method's signature.
+The `func` providing `Sub`'s implementation of `bar` could then specify both `$Super.bar` and `$Sub.bar` as its call tags, since both of these call tags' types are supertypes of the `func`'s signature.
+
+#### Representation Specialization vs Signature Refinement
+
+Now consider a class like `class Super { void bar(Int i) {...} }` in Dart or Kotlin.
+One would like to use unboxed ints when calling this method.
+But, due to contravariance of parameter types, it's possible that there's a subclass `class Sub extends Super { void bar(Object o) {...} }` refining the type of `bar` to take an arbitrary object.
+This subclass needs the argument to be boxed into the uniform representation.
+So, especially in the presence of incremental rather than whole-program compilation, does the possibility of that subclass mean that `Super.bar` needs to be implemented with boxed ints?
-Many of the above languages would want modules to be able to export a class `C` without exporting its implementation (or at least its private fields). call tags are designed to support this pattern and yet avoid the problems outlined in [#1343](https://github.com/WebAssembly/design/issues/1343) wherein `call_indirect` can be used to expose and get direct access to a module's concrete implementation of an exported abstract type.
-
-The [Type Imports](https://github.com/WebAssembly/proposal-type-imports/) proposal is still young, so for the sake of conversation suppose that exports are done by (1) specifying a module signature and then (2) specifying how to instantiate the various components of the signature with the module's various definitions.
-So part 1 might say there is a type `C_type` with no specifics about the type, and then part 2 might say that `C_type` represents `ref (struct i32 i32)`.
-
-In this setting, suppose some surface-level interface method `foo` defined in the module conceptually takes a `C` and returns an integer.
-Using the implementation of interface-method dispatch above, the module would define a call tag `$ct_foo_internal` with signature `[object_impl (ref (struct i32 i32))] -> [i32]` (where the `object_impl` is the `this` pointer).
-The module's own implementations of this method are allowed to know the specific implementation of class `C`, which is why this call tag uses the type `(ref (struct i32 i32))` in its signature.
-
-However, although other module's are allowed to provide their own implementations of the surface-level interface method `foo`, they are not allowed to know the implementation of `C`.
-So in the module's exports, part 1 would specify a call tag `$ct_foo : [object_impl C_type] -> [i32]`, and then part 2 would instantiate that tag with `$ct_foo_internal`.
-Internally, this instantiation is valid because `C_type` was instantiated with `ref (struct i32 i32)`, but externally other modules know nothing about `C_type`.
-Nonetheless, they can use `$ct_foo` to invoke `foo` on objects and to provide their own implementations of `foo` just like they can any other interface method.
-
-This pattern enables (controlled) indirect calls across modules, but in a way that respects abstraction.
-In particular, unlike in [#1343](https://github.com/WebAssembly/design/issues/1343), this design can prevent using indirect calls to get direct access to a module's implementation of an abstract type or to forge values of an abstract type.
-There is just one restriction that needs to be made: `call_tag.canon` must only be allowed for signatures comprised solely of *uninstantiable* types, such as `i32`, `i64`, `f32`, `f64`, and—per the discussion in [#1343](https://github.com/WebAssembly/design/issues/1343)—`externref`.
-That is, it is the ability to generate canonical call tags from abstract types that violates abstraction.
-(This observation seems to extend more generally to processes that would generate canonical values from types where the values are equal only if the types are equal.)
-
-## Extension: Switching
-
-In addition to *direct* functions, we could let a module declare a number of *switch* (or indirect?) functions that look like the following:
-```
-(func_switch $fr
- (on_call_tag $ct1 $f1)
- (on_call_tag $ct2 $f2)
- ...
- (trap)
-)
-```
-This defines a `funcref` (whose index is `$fr`), not a `func`.
-An indirect call to `$fr` checks if the call tag provided at run time is (bitwise) equal to any of `$ct1`, ....
-If it matches `$ctn`, then a direct call to `$fn` is made (in theory; in practice, this call might be inlined).
-If no match is found, then the indirect call traps.
-Type-checking simply involves checking that the signature of each `$ctn` is compatible with that of `$f1`.
-
-Given a `func_switch`, one makes a `funcref` via `ref.func $fr`.
-That might seem odd because previously `ref.func` took a `func` identifier.
-The disconnect is because `func` is currently doing two things: defining a direct function *and* defining a `func_switch` that calls that function on each of the associated call tags.
-So every `func` identifier is also a `func_switch` identifier, making this change in perspective still backwards compatible.
-
-### Applications
-
-#### Interfaces
-
-At present, `call_indirect` is primarily used for untyped (from wasm's perspective) function calls.
-But we can take the concept of call tags further to support more language features.
-In particular, this same pattern shows up in interface-method dispatch for languages with multiple inheritance of interfaces.
-Interface-method dispatch is a critical feature of various popular languages, and good performance for this feature is often achieved through JITing techniques that WebAssembly is aiming to not rely on.
-
-The `func_switch` extension means that the behavior of a `funcref` can depend on the specific call tag used (beyond just matching versus trapping).
-Using this, we can support an efficient non-JITing implementation of interface-method dispatch (and other lesser-known forms of dynamic dispatch) for many languages with multiple inheritance of interfaces.
-For context, one way to implement interface-method dispatch is to have every v-table have an array of, say, 19 slots, and to assign to every interface method some slot number.
-Unfortunately, it is possible that an object implements multiple interface methods with the same slot number.
-So when a interface-method call is made, in addition to the arguments the caller pushes onto the stack the identifier of the interface method (i.e. its call tag) and then calls the function in the matching slot.
-That function then switches on that identifier and redirect to the appropriate implementation, just like a `func_switch`.
-See [Efficient implementation of Java interfaces: Invokeinterface considered harmless](https://dl.acm.org/doi/10.1145/504282.504291) for more information.
-
-It would be very difficult to implement this pattern without direct support from WebAssembly because two interface methods assigned to the same slot can have completely different signatures, i.e. number and size of arguments.
-So call tags enables an important pattern used in practice to support a feature that is critical for many major languages (specifically Java, Kotlin, and Scala come to mind, though not C# due to its decision to support multiple-instantiation inheritance of generic interfaces).
-
-#### Dynamic Arity
+We can solve this problem using call tags, even in the presence of incremental compilation.
+We again generate two call tags, but this time we use the most natural specialized representation for the signature at hand.
+This means we have both `$Super.bar : [i32] -> []` and `$Sub.bar : [(ref $Object)] -> []`.
+Then, rather than `Sub` using a single `func` to implement both call tags for `bar`, we have `Sub` use a `func_switch`.
+For `$Sub.bar`, it simply switches to `Sub`'s implementation of `bar`.
+For `$Super.bar`, it switches to a function that boxes the given `i32` and then (tail) calls `Sub`'s implementation of `bar`.
+This enables `Sub` to bridge the representation gap, but at the same time all the other classes that do *not* refine the signature of `bar` can simply handle `$Super.bar` and use the `i32` without boxing it.
+
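The bridging arrangement above can be illustrated with a small Python model. This is a sketch under stated assumptions rather than generated code: `Boxed`, the function names, and the dictionary-based `call_with_tag` are hypothetical, call tags are modeled as unique objects, and the methods return tuples (instead of nothing) only so the chosen path is observable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Boxed:
    """Stands in for the uniform (boxed) object representation."""
    value: int


# Call tags are modeled as unique objects compared by identity.
SUPER_BAR = object()  # $Super.bar : [i32] -> []
SUB_BAR = object()    # $Sub.bar   : [(ref $Object)] -> []


def super_bar(i):
    """Super's implementation works on the unboxed int directly."""
    return ("Super.bar", i)


def sub_bar(o):
    """Sub's implementation accepts an arbitrary (boxed) object."""
    return ("Sub.bar", o)


def sub_bar_from_i32(i):
    """Bridge used for $Super.bar: box the i32, then (tail) call Sub's bar."""
    return sub_bar(Boxed(i))


# Super handles only $Super.bar; Sub's func_switch handles both call tags.
super_funcref = {SUPER_BAR: super_bar}
sub_funcref = {SUPER_BAR: sub_bar_from_i32, SUB_BAR: sub_bar}


def call_with_tag(tag, funcref, *args):
    """Callee-side switch on the call tag (no fall-back in this sketch)."""
    return funcref[tag](*args)
```

A caller that only knows `Super` always uses `SUPER_BAR` with an unboxed int; boxing happens only when the receiver turns out to be a `Sub`, and only inside `Sub`'s bridge.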
+#### Dynamic Invocation
+
+A number of languages and even runtimes provide a way to perform a method invocation on a dynamically typed receiver.
+That is, the invocation of `receiver.foo(arg)` is accepted without knowing the type of `receiver`, and consequently without knowing what class or interface `foo` corresponds to and what type `arg` is expected to have.
+[[MT2021]](#mt2021) implemented this feature by using call tags.
+They associated with each name-arity pair an interface-method-table index and a call tag whose signature accepted the appropriate number of generic objects and returned a generic object.
+The interface-method table of every (sometimes hidden) class handled the call tags of each name-arity pair for which it had some corresponding method.
+The handler takes care of dynamically resolving overloading and performing any necessary representation conversions and then calls the class's corresponding implementation of the method.
+They found this to be an efficient implementation of dynamic invocation.
+
+They also had some objects that could be dynamically extended with additional methods.
+These additional methods would not be built into the interface-method table at compile time.
+So to support this, the fall-back handler for these call tags would search through the additional-methods dictionary of the object to find a corresponding entry and call it (if there were any).
+
+#### Interop
+
+[[MT2021]](#mt2021) also took the above technique a bit further to make it so that untyped objects could dynamically acquire nominal interfaces in order to provide interop between its JavaScript-like subset and its Java-like subset.
+They did this by making the call tag for interface methods have a fall-back handler that would perform a dynamic invocation on the given receiver, performing the necessary coercions before and after the invocation.
+In this way, an object could dynamically implement an interface method that it was not statically compiled to support.
+At the same time, this incurred no overhead for the case where the object *was* compiled to support the given interface method, whereas other approaches need all interface-method invocations to branch on whether the receiver is "typed" or "untyped" before being able to execute the invocation.
+They found that this implementation of interop performed very well.
+
+#### Closures with Dynamic Arity
In functional languages, a value of type `a -> b -> c` (where each letter is a type variable) is a closure of unknown arity.
Due to currying, it could be a closure expecting an `a` that then will return a closure expecting a `b` that then will return a `c`.
Due to first-class functions, the `c` itself could represent a function type, so it could even be the case that this is a ternary closure that has been curried.
-A functional language could implement closure application by having each closure specify its arity and by having each caller case on this arity.
-Or a functional language could use `func_switch` for a closure's `funcref` and have each caller use a call tag for the arity at hand which then the `func_switch` cases on to provide the appropriate functionality.
-The latter is moderately more efficient, but given the frequency of function applications in functional languages, that moderate improvement would likely be notable.
+Given this, implementing function application is non-trivial because most applications have three cases:
+1. The closure was designed to accept exactly the number of arguments available in the application.
+2. The closure was designed to accept fewer arguments than what is provided in the application.
+3. The closure was designed to accept more arguments than what is provided in the application.
+
+To make things more challenging, resolving the second case involves recursion: the closure needs to be called with the prefix of arguments it can accept, and then the resulting closure needs to be supplied with the remaining arguments, but again the three cases above apply to this second application.
+
+One way to resolve this is, for each arity up to some predetermined maximum, to have a function that performs application of that many arguments.
+Every closure is allocated with a field specifying its arity (up to the predetermined maximum), and the function switches on that arity.
+For the arity corresponding to case 1, the function simply casts the closure to the appropriate type to get a typed function reference of the appropriate arity and then (tail) calls it with the given arguments.
+For arities corresponding to case 2, the function does the appropriate cast, calls the closure with the prefix of arguments, and then calls the function for the appropriate arity on the resulting closure with the remaining arguments.
+For arities corresponding to case 3, the function returns a new closure of the appropriate arity waiting for the additional arguments.
+
+This approach involves a lot of function-call overhead (or, if inlined, code duplication), and specifically in WebAssembly it involves a lot of casts.
+Furthermore, it only works up to a predetermined maximum arity.
+For larger-arity closures, either you need to add a fourth case where the closure's typed function reference expects an array of arguments, or you need to translate larger-arity closures to lower-arity closures that return lower-arity closures.
+
+With call tags, we can remove all this overhead as well as the cap on arities.
+Associate a call tag with each arity, and have every closure provide a `funcref`.
+To perform an application, simply use `call_with_tag` on that `funcref` using the call tag for the number of arguments at hand (without worrying about the closure's arity).
+When creating a closure, to handle case 1 make sure its `funcref` handles the call tag of the same arity and simply switches to the implementing function.
+When creating a closure, to handle case 3 make sure its `funcref` *also* handles the call tags of *all* smaller arities, in each case switching to a function that simply allocates a closure waiting for the remaining arguments.
+In order to handle case 2, when creating the *call tags*, give them a fall-back handler that switches on the closure's arity (which is necessarily smaller than the call tag), calls its `funcref` with the call tag and arguments for that arity, and then (tail) calls the `funcref` of the resulting closure with the call tag and arguments for the remaining arity.
+(Note that this means that you *cannot* use canonical call tags for this approach.)
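
The three cases can be modeled in a few lines of Python. This is a sketch under assumptions rather than the proposal's semantics: `Closure`, `CallTag`, `tag_for`, `make_closure`, and `apply` are hypothetical names, a dictionary stands in for the `funcref`'s tag dispatch, and the closure itself is passed as the first argument of every call.

```python
class Closure:
    def __init__(self, arity, funcref):
        self.arity = arity      # number of arguments the closure was designed for
        self.funcref = funcref  # dict: CallTag -> handler(closure, *args)

class CallTag:
    """Hypothetical non-canonical call tag for one arity, with a fall-back handler."""
    def __init__(self, arity):
        self.arity = arity

    def fallback(self, closure, *args):
        # Case 2: the closure accepts fewer arguments than were supplied.
        k = closure.arity
        result = call_with_tag(tag_for(k), closure, *args[:k])
        rest = args[k:]
        return call_with_tag(tag_for(len(rest)), result, *rest)

_tags = {}

def tag_for(arity):
    """Tags are created on demand, so there is no predetermined maximum arity."""
    if arity not in _tags:
        _tags[arity] = CallTag(arity)
    return _tags[arity]

def call_with_tag(tag, closure, *args):
    handler = closure.funcref.get(tag)
    if handler is None:
        return tag.fallback(closure, *args)
    return handler(closure, *args)

def make_closure(impl, arity, captured=()):
    def exact(closure, *args):    # case 1: exactly the expected arguments
        return impl(*captured, *args)

    def partial(closure, *args):  # case 3: too few arguments, wait for the rest
        return make_closure(impl, arity - len(args), captured + args)

    funcref = {tag_for(arity): exact}
    for smaller in range(1, arity):
        funcref[tag_for(smaller)] = partial
    return Closure(arity, funcref)

def apply(closure, *args):
    """The caller only picks the tag for the number of arguments at hand."""
    return call_with_tag(tag_for(len(args)), closure, *args)

add3 = make_closure(lambda a, b, c: a + b + c, 3)
curried = make_closure(lambda a: make_closure(lambda b, c: a * b + c, 2), 1)
```

For instance, `apply(curried, 2, 3, 4)` uses the arity-3 tag, which the unary closure does not handle, so the tag's fall-back calls it with one argument and forwards the remaining two to the closure it returns.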
+
+#### Representation Specialization vs Generics
+
+Generics also pose problems for representation specialization.
+That is, you might like to be able to pass surface-level types `Int32` and `Int64` using `i32` and `i64`, but if the function can be used polymorphically, then the values generally need to be represented using the (boxed) uniform representation.
+For example, consider the closure for `(+.) : float -> float -> float`.
+Ideally, when we hand this closure to another function expecting a `float -> float -> float`, we can let the function pass the `float` values to `(+.)` using `f64` (or `i64`) rather than a (boxed) reference type.
+But `(+.)` might be passed to a polymorphic function working with just an `a -> b -> c` function, or the function might be passed a `float -> float -> float`.
+
+Call tags can bridge this representation gap by generalizing the above approach.
+Rather than generating a call tag for each arity, generate a call tag for each representation signature `(ref | i32 | i64)* -> (ref | i32 | i64)`.
+When performing an application, use the call tag for the representation signature that requires the least boxing on the caller's side.
+When creating a closure, have it handle all the applicable `ref` representation signatures, but also have it handle the representation signatures for which avoiding the boxing has clear value. (For example, the `funcref` for `(+.)` would handle all the combinations of `i64` and `ref`.)
+For the call tags using non-`ref` representation signatures, have the fall-back box the values into `ref` and then call the given `funcref` with the corresponding `ref`-only representation signature.
+In this way, the caller can *try* to use a specialized representation, and if the callee does not handle that specialization then the fall-back takes care of doing the boxing the caller would have had to do anyway and deferring to the uniform representation that everyone is guaranteed to handle.
+This approach also supports separately adding more specialized representations, like `i128`, in dynamically loaded libraries (though, again, canonical call tags cannot be used).
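
As a rough illustration, the boxing fall-back can be modeled in Python as follows. The names (`Box`, `CallTag`, `tag_for`, `fallback`) are hypothetical, the closure argument is omitted for brevity, and unboxing the result when the tag's result representation is not `ref` is an assumption of this sketch.

```python
class Box:
    """Uniform (boxed) representation; `allocated` counts boxes for the demo."""
    allocated = 0

    def __init__(self, value):
        Box.allocated += 1
        self.value = value

class CallTag:
    def __init__(self, params, result):
        self.params, self.result = params, result  # e.g. ("i64", "ref"), "i64"

_tags = {}

def tag_for(params, result):
    """One tag per representation signature, created on demand."""
    key = (tuple(params), result)
    if key not in _tags:
        _tags[key] = CallTag(*key)
    return _tags[key]

def fallback(tag, funcref, *args):
    # Box the non-ref arguments and retry with the ref-only signature.
    uniform = tag_for(("ref",) * len(tag.params), "ref")
    if tag is uniform:
        raise RuntimeError("trap: uniform signature not handled")
    boxed = [a if rep == "ref" else Box(a) for rep, a in zip(tag.params, args)]
    result = call_with_tag(uniform, funcref, *boxed)
    return result if tag.result == "ref" else result.value

def call_with_tag(tag, funcref, *args):
    handler = funcref.get(tag)
    return handler(*args) if handler else fallback(tag, funcref, *args)

I64_2 = tag_for(("i64", "i64"), "i64")
REF_2 = tag_for(("ref", "ref"), "ref")

# Handles a specialized signature as well as the uniform one.
add = {
    I64_2: lambda x, y: x + y,
    REF_2: lambda x, y: Box(x.value + y.value),
}
# A polymorphic function that only handles the uniform signature.
first = {REF_2: lambda x, y: x}
```

Calling `add` with `I64_2` allocates no boxes, while calling `first` with the same tag goes through the fall-back, which boxes both arguments and unboxes the result.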
+
+### JavaScript Interop
+
+At present the JS API forces JavaScript functions to be converted to a single typed function signature in order to acquire a corresponding `funcref`.
+This places a burden on tooling, and it underutilizes JavaScript's dynamic typing.
+With call tags, we can simply convert JavaScript functions straight to dynamically typed `funcref`s.
+Similar to how call tags have fall-back handlers, every call tag has a designated behavior for when it is used on one of these JavaScript `funcref`s.
+By default, that behavior is to simply perform the coercions from wasm values to JavaScript values according to the existing JS API, then call the JavaScript function with those values, and then coerce the resulting JavaScript value to wasm values according to the existing JS API.
+That is, the default behavior is the same as if the JavaScript function had been coerced to have the signature of the call tag.
+This can easily be implemented efficiently by generating this coercion code when the call tag is constructed (or first used in this manner).
+For non-canonical call tags, the generating instance can choose to specify a custom coercion (e.g. lossily converting `i64` to `Number` rather than `BigInt`) in a manner to be determined.
-## Extension: Fall-Back Handlers
+### Optimization
-By default, if you call a `funcref` using a call tag that it does not recognize, the call traps.
-We could extend `call_tag.new` so that, when you create the call tag, you also specify a function that should be called whenever a `funcref` does not recognize the call tag, i.e. its "fall-back" handler.
-That function must have the same signature as the call tag so that it can be given the same arguments and be guaranteed to return values of the expected type.
+If a function has no associated call tags, then it is only directly callable.
+Even if a function does have an associated call tag, if that tag is neither imported nor exported then the optimizer can guarantee that the function can only be successfully called from within the current module instance, and furthermore it can only be indirectly called from occurrences of `call_with_tag` using one of the associated call tags.
+Both of these help support program analyses and implementation optimizations.
-This has four applications:
-1. You can get more graceful behavior than a trap, e.g. throwing an exception.
-2. You can use this to support deferred loading, i.e. the "fall-back" function prompts the missing functionality to be loaded in.
-3. For dynamically typed object-oriented languages, you can use this to implement support user-specified handling of missing methods. That is, objects would have built-in methods, and the "fall-back" function would kick in whenever the method was not built into the object at creation time, in which case it can look for the method in the "added later" dictionary, and if that fails it can call the object's "missing method" method.
-4. For functional languages, this makes it possible to support *unbounded* dynamic arity. In particular, when you create a closure for a value of type `a -> b -> c` (where each letter is a type variable), the `funcref` in your closure can handle the unary and binary call tags. But if `c` is abstracting a function type, your `funcref` might get called with call tags for higher arity. Without a fall-back, you have to cap the arity so that this `funcref` can `func_switch` on a finite number of cases. With a fall-back, you can make the fall-back handler for an n-ary call tag dynamically look up the arity the closure was compiled with (in this example, `2`), call it with just that many arguments, and then call the returned closure with the remaining arguments (in this example, using the `n-2`-arity call tag).
+### Abstraction
-### Variant
+`call_tag.canon` is restricted to providing canonical call tags only for signatures made up of "concrete" types, such as primitives or imported concrete nominal types (like `$java.lang.String`).
+This restriction is in place to respect type abstraction, wherein a module wants to be able to export a type abstractly and be assured that others cannot expose what that type is, often for security or changeability purposes.
-Many call tags are specialized/optimized cases of more generic call tags.
-In these cases, it can be useful to pass the `funcref` that didn't recognize the call tag to the fall-back handler so that a specialized call tag can simply call the same `funcref` with the more generic call tag.
+To illustrate the problem, suppose a module imports an abstract type `$abs`, and it maliciously knows `$abs` in fact (currently) represents concrete type `t`.
+If `call_tag.canon` were not restricted, then this module could maliciously convert between `$abs` and `t` using the instructions `(call_with_tag (call_tag.canon ([$abs] -> [t])) (func.ref $id_t)) : [$abs] -> [t]` and `(call_with_tag (call_tag.canon ([t] -> [$abs])) (func.ref $id_t)) : [t] -> [$abs]`, where `(func $id_t (param t) (result t) (return (local.get 0)))`.
+These instructions work because `call_tag.canon ([$abs] -> [t])` and `call_tag.canon ([t] -> [$abs])` are at run time both equal to `call_tag.canon ([t] -> [t])` when `$abs` happens to be instantiated with `t`.
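
The leak can be modeled in a few lines of Python. This is a hypothetical model, not the proposal's semantics: `call_tag_canon` here is the *unrestricted* variant that interns one tag per run-time signature, and the `instantiation` dictionary stands in for how abstract types are instantiated at link time.

```python
class CallTag:
    """Opaque identity: two tags match only if they are the same object."""

_canonical = {}

def call_tag_canon(params, results, instantiation):
    """Unrestricted canonicalization: intern one tag per *run-time* signature."""
    key = (tuple(instantiation.get(p, p) for p in params),
           tuple(instantiation.get(r, r) for r in results))
    if key not in _canonical:
        _canonical[key] = CallTag()
    return _canonical[key]

def call_tag_new():
    """Fresh tag, unrelated to any signature-based canonicalization."""
    return CallTag()

def call_with_tag(tag, funcref, *args):
    handler = funcref.get(tag)
    if handler is None:
        raise RuntimeError("trap: call tag not handled")
    return handler(*args)

inst = {"$abs": "t"}  # $abs happens to be instantiated with t
id_t = {call_tag_canon(["t"], ["t"], inst): lambda x: x}

leak = call_tag_canon(["$abs"], ["t"], inst)   # statically [$abs] -> [t]
forge = call_tag_canon(["t"], ["$abs"], inst)  # statically [t] -> [$abs]
```

Both `leak` and `forge` are the very tag that `id_t` handles, so the identity function can be called at either static signature, whereas a tag from `call_tag_new()` is never accidentally equal to it.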
+
+So the restriction on `call_tag.canon` respects type abstraction.
+Meanwhile, `call_tag.new` has no such restriction (because canonicalization alone is what breaks abstraction) and so can be used to support dynamically typed function calls even in the presence of abstract types.
+
+### Simplification
+
+Call tags make it unnecessary for `funcref`s to have casting infrastructure, and they make the type-checking complexity of first-class function types unnecessary (which prior works in this space have found to be problematic).
## Forwards-Compatibility
-Preliminary investigations suggest that call tags will be compatible with features like parameterized (i.e. generic) interfaces and polymorphic functions/methods as well as existential types to eliminate superfluous casts.
+Unlike typed function references, call tags are forwards-compatible with features like parameterized (i.e. generic) interfaces and polymorphic functions/methods as well as existential types (all for eliminating superfluous casts).
+
+## References
+
+[ACFG2001] Bowen Alpern, Anthony Cocchi, Stephen Fink, and David Grove. 2001. Efficient Implementation of Java Interfaces: Invokeinterface Considered Harmless. DOI: [10.1145/504282.504291](https://doi.org/10.1145/504282.504291). [(pdf)](http://yanniss.github.io/521-10/oopsla01.pdf)
+
+[MT2021] Fabian Muehlboeck and Ross Tate. 2021. Transitioning from Structural to Nominal Code with Efficient Gradual Typing. DOI: [10.1145/3485504](https://doi.org/10.1145/3485504). [(pdf)](https://www.cs.cornell.edu/~ross/publications/monnom/monnom-oopsla21.pdf)
+
+[P1992] Benjamin Pierce. 1992. Bounded Quantification is Undecidable. DOI: [10.1145/143165.143228](https://dl.acm.org/doi/10.1145/143165.143228). [(pdf)](http://www.cse.chalmers.se/~abela/lehre/SS07/Typen/pierce93bounded.pdf)