Description
We are adding increasingly uncommon specializations of the LOAD_ATTR and CALL instructions.
Each of these specializations increases our hit rate, improving the potential for tier-2 optimizations, but often has a negligible or even negative impact on tier-1 (PEP 659) performance, because the growing size of the interpreter causes more icache misses.
Instead of adding more and more specializations, we can add trampoline functions (in bytecode, but not full Python functions) to perform some of the work that would otherwise be done in a specialized instruction.
These trampolines are unlikely to boost tier-1 performance; in fact, they may make it a bit worse. However, they allow much more open-ended specializations, and they expose the details of the specializations to higher-tier optimizers in a form those optimizers can understand without needing custom code for each specialization.
This idea isn't limited to CALL and LOAD_ATTR, but those have the most complex and varied semantics.
Examples:
- Calling a Python function with complex arguments or parameters.
  When specializing a call, we know both the shape of the arguments and the parameters, allowing us to compute (or look up, for the most common cases) a sequence of operations to move the arguments to the right place in the locals. Having individual specializations for each of these cases would be silly. We could pack a custom format describing the transformation into the cache, but that would need a custom mini-interpreter.
  It is better to create a trampoline function that does the argument shuffle and then tail-calls into the function.
- Calling a Python class.
  python/cpython#93221 (GH-91095: Specialize calls to normal python classes) specializes calls to Python classes using a custom specialized instruction and a stub function to clean up after the call. This could be replaced with a more general trampoline-based specialized instruction. Doing so would allow trampolines to handle differing shapes of arguments without a proliferation of instructions.
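To make the argument shuffle in the first example concrete, here is a minimal pure-Python model of the idea. It is only a sketch and not how CPython implements or would implement it: `plan_shuffle` and `run_plan` are hypothetical names, the `("arg", i)` / `("const", value)` plan format is invented for illustration, and the sketch assumes the callee has no `*args`, `**kwargs`, or positional-only parameters. The plan is computed once from the call shape (number of positional arguments plus keyword names), which is roughly the point at which a specializer could build a trampoline; replaying the plan is then a fixed sequence of moves.

```python
def plan_shuffle(func, n_positional, kwnames):
    """Compute, once per call shape, where each local slot gets its value.

    Hypothetical helper. Assumes no *args, **kwargs or positional-only
    parameters. Returns one entry per parameter slot:
    ("arg", i) copies the i-th value pushed by the caller,
    ("const", v) fills the slot with a default value.
    """
    code = func.__code__
    n_pos_params = code.co_argcount
    n_params = n_pos_params + code.co_kwonlyargcount
    names = code.co_varnames[:n_params]
    defaults = func.__defaults__ or ()
    kwdefaults = func.__kwdefaults__ or {}
    first_default = n_pos_params - len(defaults)

    if n_positional > n_pos_params:
        raise TypeError("too many positional arguments")

    plan = []
    used_keywords = 0
    for i, name in enumerate(names):
        if i < n_positional:
            # Passed positionally: already in the right relative order.
            plan.append(("arg", i))
        elif name in kwnames:
            # Keyword values sit after the positional ones on the stack.
            plan.append(("arg", n_positional + kwnames.index(name)))
            used_keywords += 1
        elif first_default <= i < n_pos_params:
            plan.append(("const", defaults[i - first_default]))
        elif name in kwdefaults:
            plan.append(("const", kwdefaults[name]))
        else:
            raise TypeError(f"missing argument {name!r}")
    if used_keywords != len(kwnames):
        raise TypeError("unexpected or duplicate keyword argument")
    return plan


def run_plan(plan, stack_args):
    """Replay a precomputed plan: the work a trampoline would do per call."""
    return [stack_args[v] if kind == "arg" else v for kind, v in plan]


def f(a, b, c=3, *, d=4):
    return a, b, c, d


# Call shape of f(1, c=30, b=20): one positional value, then keywords c and b.
plan = plan_shuffle(f, 1, ("c", "b"))
fast_locals = run_plan(plan, [1, 30, 20])
print(plan)         # [('arg', 0), ('arg', 2), ('arg', 1), ('const', 4)]
print(fast_locals)  # [1, 20, 30, 4]
```

In this model the plan plays the role of the trampoline body: each entry corresponds to a simple load or store that a higher-tier optimizer could see through, rather than an opaque per-shape specialized instruction.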