Open
Commits
188 commits
795ef49
Basic trace recording
Fidget-Spinner Sep 18, 2025
40bf6c1
WIP generators
Fidget-Spinner Sep 19, 2025
13188a9
refactor to translate on the go
Fidget-Spinner Sep 19, 2025
e63de39
working python startup
Fidget-Spinner Sep 19, 2025
fba9d2d
fix a bug with specializzation
Fidget-Spinner Sep 19, 2025
7192671
Fully working bm_generators
Fidget-Spinner Sep 20, 2025
07542dd
fix jit build
Fidget-Spinner Sep 20, 2025
021fc44
Fix exception tracing
Fidget-Spinner Sep 20, 2025
f886c43
Fix jump tracing
Fidget-Spinner Sep 20, 2025
3066963
Fix handling of ENTER_EXECUTOR
Fidget-Spinner Sep 20, 2025
20b283b
Fix ENTER_EXECUTOR bug
Fidget-Spinner Sep 21, 2025
36554a5
Fix over-tracing bug
Fidget-Spinner Sep 21, 2025
92bba64
fix JIT + debug builds
Fidget-Spinner Sep 21, 2025
2e3ddc1
Fix double-initialization
Fidget-Spinner Sep 21, 2025
aada168
fix exception bug
Fidget-Spinner Sep 21, 2025
7b5c655
Fix dispatch_inlined
Fidget-Spinner Sep 22, 2025
3e9f782
Fix handling of EXTENDED_ARG
Fidget-Spinner Sep 22, 2025
9a66605
Fix chain depth bug
Fidget-Spinner Sep 22, 2025
108ab7f
remove printf
Fidget-Spinner Sep 22, 2025
fac8c74
fix problem with jumping labels
Fidget-Spinner Sep 23, 2025
fd3bb48
Point to previous executor when side-exiting
Fidget-Spinner Sep 23, 2025
96b7bb2
Fix progress needed and warmup
Fidget-Spinner Sep 23, 2025
396818b
Fix unsupported opcode bug, turn off optimizer again
Fidget-Spinner Sep 24, 2025
57f417e
fix branch tracing
Fidget-Spinner Sep 24, 2025
2c603cc
fix branch prediction for real
Fidget-Spinner Sep 24, 2025
02f1fb4
fix non-sstandard C
Fidget-Spinner Sep 24, 2025
dc414a3
Track from JUMP_BACKWARD rather than FOR_ITER
Fidget-Spinner Sep 24, 2025
0ffc2dd
Fix bug where code/func get freed halfway
Fidget-Spinner Sep 24, 2025
2032b9c
add back replaced, move jit tracing env var to
Fidget-Spinner Oct 9, 2025
299a068
Handle recursive tracing and CALL_ALLOC_AND_ENTER_INIT
Fidget-Spinner Oct 9, 2025
8e0fb21
Fix recursive tracing and dynamic exits
Fidget-Spinner Oct 9, 2025
95eee89
Fix handling of EXTENDED_ARG
Fidget-Spinner Oct 9, 2025
6936a38
Just punt on large opargs for now
Fidget-Spinner Oct 16, 2025
a274451
cleanup a little
Fidget-Spinner Oct 16, 2025
f55129e
fix recursive tracing
Fidget-Spinner Oct 16, 2025
71bd27b
comment out debugging
Fidget-Spinner Oct 17, 2025
e834c88
Delete out.txt
Fidget-Spinner Oct 17, 2025
cae8f10
patch the graphviz dump
Fidget-Spinner Oct 17, 2025
39bc819
fix bug with predicted stuff
Fidget-Spinner Oct 17, 2025
ff92937
Properly record the predicted ops
Fidget-Spinner Oct 17, 2025
2589eb0
Re-enable the optimizer
Fidget-Spinner Oct 17, 2025
a2e92a6
Delete hello.gvz
Fidget-Spinner Oct 17, 2025
cbb3ad2
turn off optimizer again (for now)
Fidget-Spinner Oct 17, 2025
9910b65
Turn off optimizer for real, trace through init
Fidget-Spinner Oct 17, 2025
5102ab6
fix a few tests and their exposed bugs
Fidget-Spinner Oct 17, 2025
c0c14b4
Restore the optimizer fully
Fidget-Spinner Oct 18, 2025
7d4f866
invalidate freed code/function objects used for global promotion
Fidget-Spinner Oct 18, 2025
1981f50
Fix tracing
Fidget-Spinner Oct 18, 2025
b879dab
fix tracing completely
Fidget-Spinner Oct 18, 2025
093578c
Separate the tracer out into its own file
Fidget-Spinner Oct 18, 2025
d114944
Cleanup, bugfixes to sys trace
Fidget-Spinner Oct 18, 2025
e50ff65
Whole test suite passing
Fidget-Spinner Oct 18, 2025
54f6cd6
Remove unused buffer
Fidget-Spinner Oct 18, 2025
e3f18e6
Cleanup warnings
Fidget-Spinner Oct 18, 2025
608772f
refactor a little
Fidget-Spinner Oct 18, 2025
1872715
Cleanup
Fidget-Spinner Oct 18, 2025
460fb39
📜🤖 Added by blurb_it.
blurb-it[bot] Oct 18, 2025
8d6f1db
Merge remote-tracking branch 'upstream/main' into tracing_jit
Fidget-Spinner Oct 18, 2025
a8762c2
Disable windows CI for now, simplify
Fidget-Spinner Oct 18, 2025
72c2242
restore non-jit builds
Fidget-Spinner Oct 18, 2025
d76dc85
make mypy happy
Fidget-Spinner Oct 18, 2025
87c0b72
fix linter and mypy?
Fidget-Spinner Oct 18, 2025
24cd7f9
more cleanup to fix CI
Fidget-Spinner Oct 18, 2025
8ae2e4c
Merge remote-tracking branch 'upstream/main' into tracing_jit
Fidget-Spinner Oct 18, 2025
f38ef69
Fix lltrace on jit debug builds
Fidget-Spinner Oct 18, 2025
960d647
Turn off tracing on dynamic exit
Fidget-Spinner Oct 19, 2025
1798ab1
Fix _CHECK_PERIODIC insertion
Fidget-Spinner Oct 19, 2025
681485f
Increase uop length to compensate
Fidget-Spinner Oct 19, 2025
9bb03a8
Handle EXTENDED_ARG
Fidget-Spinner Oct 19, 2025
4b26cde
Handle unstable branches
Fidget-Spinner Oct 19, 2025
d820e22
Don't JIT short traces except if they end in a loop
Fidget-Spinner Oct 19, 2025
00c81fa
revert last 2 changes
Fidget-Spinner Oct 20, 2025
ba64a5b
Support BINARY_OP_INPLACE_ADD_UNICODE
Fidget-Spinner Oct 20, 2025
b00252e
Trace through BINARY_OP_SUBSCR_GETITEM
Fidget-Spinner Oct 20, 2025
754b3b7
Close loops
Fidget-Spinner Oct 20, 2025
6045a67
Specialize on deopt when tracing
Fidget-Spinner Oct 20, 2025
ec2971f
make mypy happy
Fidget-Spinner Oct 20, 2025
d49e367
remedies against trace explosion
Fidget-Spinner Oct 20, 2025
55892a4
lint
Fidget-Spinner Oct 20, 2025
dd0e16f
Fix a bug with where the executors get inserted during EXTENDED_ARG
Fidget-Spinner Oct 20, 2025
d18c1a1
Revert remedies against trace explosion
Fidget-Spinner Oct 20, 2025
7d17741
First half of reviews
Fidget-Spinner Oct 21, 2025
8ebb6cb
Fix naming of things
Fidget-Spinner Oct 21, 2025
c23e591
restore optimizer code
Fidget-Spinner Oct 21, 2025
a62fe40
Clean up macros
Fidget-Spinner Oct 21, 2025
e4f1624
Clean up the cases generator
Fidget-Spinner Oct 21, 2025
a7fcf24
Close loops properly, don't trace into nested loops
Fidget-Spinner Oct 23, 2025
eb73378
fix test
Fidget-Spinner Oct 23, 2025
2b5fe3a
debug changes
Fidget-Spinner Oct 23, 2025
3385420
add comment to CI
Fidget-Spinner Oct 23, 2025
cedd7af
Rewrite the tracing JIT to use a common opcode handler
Fidget-Spinner Oct 23, 2025
676faf8
Fix ifdefs
Fidget-Spinner Oct 23, 2025
cdcce30
Address review of macros
Fidget-Spinner Oct 23, 2025
1a3f129
fix a tracing bug, ifdef out code
Fidget-Spinner Oct 23, 2025
e8fff00
fix JIT builds
Fidget-Spinner Oct 23, 2025
7b2a8ca
regen frozenmain
Fidget-Spinner Oct 24, 2025
abb1757
fix build on non-JIT
Fidget-Spinner Oct 24, 2025
0fee4e9
fix pystats jit build
Fidget-Spinner Oct 24, 2025
ccc7893
Merge remote-tracking branch 'upstream/main' into tracing_jit
Fidget-Spinner Oct 24, 2025
eb970e0
specialization and deopt fixes
Fidget-Spinner Oct 24, 2025
abecfd6
fix test
Fidget-Spinner Oct 24, 2025
8aeabd5
disable tracing on FT
Fidget-Spinner Oct 24, 2025
6bd1541
Fix FT
Fidget-Spinner Oct 24, 2025
a53ca1d
Emit RECORD_DYNAMIC_JUMP_TAKEN automatically
Fidget-Spinner Oct 24, 2025
ce66f3b
Remove TIER2_STORE_IP
Fidget-Spinner Oct 24, 2025
5733455
make mypy happy
Fidget-Spinner Oct 24, 2025
4aab2df
Move specializing ddetection to specialize inst
Fidget-Spinner Oct 24, 2025
ab7527c
Fix the counters
Fidget-Spinner Oct 24, 2025
7e7b240
fix windows builds
Fidget-Spinner Oct 24, 2025
4a4a31f
Support underflow and yield value in the optimizer
Fidget-Spinner Oct 25, 2025
72e1738
fix
Fidget-Spinner Oct 25, 2025
5e17707
Fix a bug with ENTER_EXECUTOR linking
Fidget-Spinner Oct 25, 2025
1e132f0
up the trace length
Fidget-Spinner Oct 25, 2025
bf17539
Change the backoffs to fix nqueens
Fidget-Spinner Oct 25, 2025
7ab76a8
fix no-opt JIT
Fidget-Spinner Oct 25, 2025
86ab7f1
Fix a test
Fidget-Spinner Oct 26, 2025
5f39672
fix up gitattributes
Fidget-Spinner Oct 26, 2025
440ad03
address review
Fidget-Spinner Oct 26, 2025
9cc7999
Address Chris' review
Fidget-Spinner Oct 26, 2025
4b4e857
Address Kumar's review
Fidget-Spinner Oct 27, 2025
a5d918e
Change RECORD_PREVIOUS_INST to a label to save an opcode
Fidget-Spinner Oct 27, 2025
d601256
fix some formatting
Fidget-Spinner Oct 27, 2025
425fd51
fix cg builds, invalidate executors on function deallocation
Fidget-Spinner Oct 27, 2025
1f8c3df
Differentiate the two dependencies
Fidget-Spinner Oct 27, 2025
bdd2123
Stop recursive traces
Fidget-Spinner Oct 27, 2025
5f4f310
fix backoff
Fidget-Spinner Oct 27, 2025
b5a9b07
Merge remote-tracking branch 'upstream/main' into tracing_jit
Fidget-Spinner Oct 27, 2025
8adaf4d
merge from upstream
Fidget-Spinner Oct 27, 2025
e918cb2
Move unpredictable jump detection to the cases generator
Fidget-Spinner Oct 27, 2025
e9e2bb9
Sink LOAD_IP into guards
Fidget-Spinner Oct 27, 2025
01c2d73
fix backoff counters
Fidget-Spinner Oct 28, 2025
1d3aed1
properly restore exponential backoffs
Fidget-Spinner Oct 28, 2025
7bfac26
fix test
Fidget-Spinner Oct 28, 2025
ac0711d
fix backoff for previous exits
Fidget-Spinner Oct 28, 2025
da66058
remove faulty assertion
Fidget-Spinner Oct 28, 2025
692a992
Fix INTERPRETER_EXIT tracing
Fidget-Spinner Oct 28, 2025
84ee07b
Add _GUARD_IP autogenerator
Fidget-Spinner Oct 29, 2025
72368ed
make windows happy
Fidget-Spinner Oct 29, 2025
8e62fd1
Remove check on RESUME
Fidget-Spinner Oct 29, 2025
92cc140
cleanup
Fidget-Spinner Oct 29, 2025
af9ea57
Remove dynamic exit for _FOR_ITER_TIER_TWO
Fidget-Spinner Oct 30, 2025
3ed1402
Remove bytecode object for ceval.c
Fidget-Spinner Nov 1, 2025
0a1bad2
fix mypy
Fidget-Spinner Nov 1, 2025
0498568
regen global objects
Fidget-Spinner Nov 1, 2025
d44ca1e
Merge remote-tracking branch 'origin/main' into tracing_jit
Fidget-Spinner Nov 1, 2025
2dbc291
fix C analyzer
Fidget-Spinner Nov 1, 2025
82ec8f1
convert spaces to tabs
Fidget-Spinner Nov 1, 2025
f526688
Fix a bug in not setting executors
Fidget-Spinner Nov 5, 2025
c78a4e6
Revert "Fix a bug in not setting executors"
Fidget-Spinner Nov 5, 2025
aa92d84
cold dynamic executors
Fidget-Spinner Nov 1, 2025
253f230
fix frame owned by interp
Fidget-Spinner Nov 5, 2025
ffa2b72
Use a different chain depth for dynamic exits
Fidget-Spinner Nov 5, 2025
897edf5
remove two chain depths
Fidget-Spinner Nov 5, 2025
1ef4a37
Make sure we don't reenter executors when guard exec ip fails
Fidget-Spinner Nov 5, 2025
0e92118
remove dynamic tracing for now
Fidget-Spinner Nov 5, 2025
c75d91c
Don't limit control-flow exits
Fidget-Spinner Nov 5, 2025
bf1ddab
fix assertion
Fidget-Spinner Nov 5, 2025
a64215d
Remove _DYNAMIC_EXIT jumping for now.
Fidget-Spinner Nov 6, 2025
da5e7b7
reduce diff
Fidget-Spinner Nov 6, 2025
83c85c3
rework ad-hoc generation of guards
Fidget-Spinner Nov 6, 2025
6c77ee3
fix jit builds
Fidget-Spinner Nov 6, 2025
3fd3ab5
more future-proofing
Fidget-Spinner Nov 6, 2025
6429b2f
lint
Fidget-Spinner Nov 6, 2025
5cfc7a9
special case first instr properly
Fidget-Spinner Nov 6, 2025
c5de275
move strange control flow detection up
Fidget-Spinner Nov 6, 2025
b7b3c23
move code to correcdt places
Fidget-Spinner Nov 6, 2025
cda3dce
fix a bug where we point FOR_ITER_TIER_TWO
Fidget-Spinner Nov 6, 2025
3f212a4
cleanup
Fidget-Spinner Nov 6, 2025
4f29dd3
Partially address review
Fidget-Spinner Nov 7, 2025
5af4b0a
remove nedsguardip table
Fidget-Spinner Nov 7, 2025
46c079a
Cleanup cases generator
Fidget-Spinner Nov 7, 2025
4aeb4ab
massive refactoring of the struct
Fidget-Spinner Nov 7, 2025
fe3a6a1
massive refactoring 2
Fidget-Spinner Nov 7, 2025
10fa14a
reduce diff
Fidget-Spinner Nov 7, 2025
0f978c4
fix windows
Fidget-Spinner Nov 7, 2025
3c80446
make mypy happy
Fidget-Spinner Nov 7, 2025
aaf6873
Move to thread state
Fidget-Spinner Nov 7, 2025
547f587
Update optimizer.c
Fidget-Spinner Nov 7, 2025
a55d766
Update generated_cases.c.h
Fidget-Spinner Nov 7, 2025
278bbe6
Support __init__ in the optimizer
Fidget-Spinner Nov 8, 2025
f547880
Fix a few perf regressions due to tracing thru optimizer
Fidget-Spinner Nov 8, 2025
7e2bc1d
Some fixups
Fidget-Spinner Nov 8, 2025
fa3e285
Clean up labels
Fidget-Spinner Nov 10, 2025
251e19e
fix TC
Fidget-Spinner Nov 10, 2025
08ec600
Remove specialize_counter
Fidget-Spinner Nov 10, 2025
f7c26d4
rename jit_state to jit_tracer_state
Fidget-Spinner Nov 10, 2025
ae1d6fe
Restore a test, address review
Fidget-Spinner Nov 11, 2025
0658f1b
remove CALL_LIST_APPEND fix
Fidget-Spinner Nov 11, 2025
26 changes: 14 additions & 12 deletions .github/workflows/jit.yml
@@ -57,9 +57,10 @@ jobs:
fail-fast: false
matrix:
target:
- i686-pc-windows-msvc/msvc
- x86_64-pc-windows-msvc/msvc
- aarch64-pc-windows-msvc/msvc
# To re-enable later when we support these.
Member

Mostly because I don't know the time horizon for getting this support, can we add a link in the comments to the issue where we're tracking this? I think it's #139922, right?

Member Author

Mark says he will add normal switch-case support back by building on top of this PR, I think, so we should get it almost immediately after this lands.

# - i686-pc-windows-msvc/msvc
# - x86_64-pc-windows-msvc/msvc
# - aarch64-pc-windows-msvc/msvc
- x86_64-apple-darwin/clang
- aarch64-apple-darwin/clang
- x86_64-unknown-linux-gnu/gcc
@@ -70,15 +71,16 @@ jobs:
llvm:
- 19
include:
- target: i686-pc-windows-msvc/msvc
architecture: Win32
runner: windows-2022
- target: x86_64-pc-windows-msvc/msvc
architecture: x64
runner: windows-2022
- target: aarch64-pc-windows-msvc/msvc
architecture: ARM64
runner: windows-11-arm
# To re-enable later when we support these.
# - target: i686-pc-windows-msvc/msvc
# architecture: Win32
# runner: windows-2022
# - target: x86_64-pc-windows-msvc/msvc
# architecture: x64
# runner: windows-2022
# - target: aarch64-pc-windows-msvc/msvc
# architecture: ARM64
# runner: windows-11-arm
- target: x86_64-apple-darwin/clang
architecture: x86_64
runner: macos-15-intel
17 changes: 15 additions & 2 deletions Include/internal/pycore_backoff.h
@@ -95,11 +95,24 @@ backoff_counter_triggers(_Py_BackoffCounter counter)
return counter.value_and_backoff < UNREACHABLE_BACKOFF;
}

static inline _Py_BackoffCounter
trigger_backoff_counter(void)
{
_Py_BackoffCounter result;
result.value_and_backoff = 0;
return result;
}

// Initial JUMP_BACKWARD counter.
// Must be larger than ADAPTIVE_COOLDOWN_VALUE, otherwise when JIT code is
// invalidated we may construct a new trace before the bytecode has properly
// re-specialized:
#define JUMP_BACKWARD_INITIAL_VALUE 4095
// Note: this should be a prime number-1. This increases the likelihood of
Member

If this comment is true, then we should change the backoff counter to use a table lookup instead of using 2**backoff-1 when setting the counter.
Having the maximum value well less than 4095 will also avoid overflow issues.

Member Author

I manually verified it was true on the nqueens benchmark. It was the main reason why the perf used to be so bad for it.

// finding a "good" loop iteration to trace.
// For example, 4095 does not work for the nqueens benchmark on pyperformance
// as we always end up tracing the loop iteration's
// exhaustion iteration. Which aborts our current tracer.
#define JUMP_BACKWARD_INITIAL_VALUE 4000
#define JUMP_BACKWARD_INITIAL_BACKOFF 12
static inline _Py_BackoffCounter
initial_jump_backoff_counter(void)
@@ -112,7 +125,7 @@ initial_jump_backoff_counter(void)
* Must be larger than ADAPTIVE_COOLDOWN_VALUE,
* otherwise when a side exit warms up we may construct
* a new trace before the Tier 1 code has properly re-specialized. */
#define SIDE_EXIT_INITIAL_VALUE 4095
#define SIDE_EXIT_INITIAL_VALUE 4000
#define SIDE_EXIT_INITIAL_BACKOFF 12

static inline _Py_BackoffCounter
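
One way to read the "prime number-1" comment in the hunk above is through a simplified model; this is an assumption for illustration, not CPython's actual counter logic. Suppose a counter starting at `initial_value` fires on backward jump number `initial_value + 1`, and the hot loop has a fixed trip count. The hypothetical helper `trigger_position` below reports which iteration (modulo the trip count) the trigger lands on, and `is_prime` is a small checker for the constants involved.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified model (NOT CPython's implementation): a counter that starts at
 * `initial_value` and is decremented on every backward jump fires on jump
 * number initial_value + 1. For a loop with a fixed trip count, return the
 * position of that jump modulo the trip count. Position 0 stands for the
 * boundary between one run of the loop and the next. */
static uint32_t
trigger_position(uint32_t initial_value, uint32_t trip_count)
{
    assert(trip_count > 0);
    return (initial_value + 1) % trip_count;
}

/* Return 1 if n is prime, 0 otherwise (trial division; fine for small n). */
static int
is_prime(uint32_t n)
{
    if (n < 2) {
        return 0;
    }
    for (uint32_t d = 2; d * d <= n; d++) {
        if (n % d == 0) {
            return 0;
        }
    }
    return 1;
}
```

In this model, 4095 + 1 = 4096 = 2^12 is a multiple of every power-of-two trip count up to 4096, so the trigger always lands on position 0. By contrast, 4000 + 1 = 4001 is prime, so no trip count between 2 and 4000 divides it. Whether this is the exact mechanism behind the nqueens result is not stated in the PR; the author only reports verifying the effect empirically.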
2 changes: 2 additions & 0 deletions Include/internal/pycore_ceval.h
@@ -392,6 +392,8 @@ _PyForIter_VirtualIteratorNext(PyThreadState* tstate, struct _PyInterpreterFrame
#define SPECIAL___AEXIT__ 3
#define SPECIAL_MAX 3

PyAPI_DATA(const _Py_CODEUNIT *) _Py_INTERPRETER_TRAMPOLINE_INSTRUCTIONS_PTR;

#ifdef __cplusplus
}
#endif
4 changes: 1 addition & 3 deletions Include/internal/pycore_interp_structs.h
@@ -14,8 +14,6 @@ extern "C" {
#include "pycore_structs.h" // PyHamtObject
#include "pycore_tstate.h" // _PyThreadStateImpl
#include "pycore_typedefs.h" // _PyRuntimeState
#include "pycore_uop.h" // struct _PyUOpInstruction


#define CODE_MAX_WATCHERS 8
#define CONTEXT_MAX_WATCHERS 8
@@ -934,10 +932,10 @@ struct _is {
PyObject *common_consts[NUM_COMMON_CONSTANTS];
bool jit;
bool compiling;
struct _PyUOpInstruction *jit_uop_buffer;
struct _PyExecutorObject *executor_list_head;
struct _PyExecutorObject *executor_deletion_list_head;
struct _PyExecutorObject *cold_executor;
struct _PyExecutorObject *cold_dynamic_executor;
int executor_deletion_list_remaining_capacity;
size_t executor_creation_counter;
_rare_events rare_events;
71 changes: 39 additions & 32 deletions Include/internal/pycore_opcode_metadata.h

Large diffs are not rendered by default.

41 changes: 25 additions & 16 deletions Include/internal/pycore_optimizer.h
@@ -21,14 +21,6 @@ typedef struct _PyExecutorLinkListNode {
} _PyExecutorLinkListNode;


/* Bloom filter with m = 256
* https://en.wikipedia.org/wiki/Bloom_filter */
#define _Py_BLOOM_FILTER_WORDS 8

typedef struct {
uint32_t bits[_Py_BLOOM_FILTER_WORDS];
} _PyBloomFilter;

typedef struct {
uint8_t opcode;
uint8_t oparg;
@@ -44,7 +36,9 @@ typedef struct {

typedef struct _PyExitData {
uint32_t target;
uint16_t index;
uint16_t index:14;
uint16_t is_dynamic:1;
uint16_t is_control_flow:1;
_Py_BackoffCounter temperature;
struct _PyExecutorObject *executor;
} _PyExitData;
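
The hunk above narrows `index` to 14 bits to make room for two 1-bit flags in the same 16-bit slot, which caps `index` at 2^14 - 1 = 16383. The sketch below illustrates that packing with a hypothetical stand-in struct (`packed_exit_flags`) and a hypothetical checked initializer; neither name is part of the PR, and `uint16_t` bit-fields rely on compiler support that is common but implementation-defined in ISO C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the three packed fields of _PyExitData. */
typedef struct {
    uint16_t index:14;           /* holds 0..16383 */
    uint16_t is_dynamic:1;
    uint16_t is_control_flow:1;
} packed_exit_flags;

/* Largest value representable in the 14-bit index field. */
#define PACKED_INDEX_MAX ((1u << 14) - 1)

/* Store `index` and the two flags. Returns 0 on success, or -1 if `index`
 * does not fit in 14 bits, so the caller never silently truncates. */
static int
packed_exit_flags_init(packed_exit_flags *out, unsigned index,
                       int is_dynamic, int is_control_flow)
{
    if (index > PACKED_INDEX_MAX) {
        return -1;
    }
    out->index = (uint16_t)index;
    out->is_dynamic = is_dynamic ? 1 : 0;
    out->is_control_flow = is_control_flow ? 1 : 0;
    return 0;
}
```

An explicit range check like this is one possible design choice: assigning an out-of-range value to an unsigned bit-field wraps silently, which would otherwise make an oversized exit index look like a valid smaller one.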
@@ -94,9 +88,8 @@ PyAPI_FUNC(void) _Py_Executors_InvalidateCold(PyInterpreterState *interp);
// This value is arbitrary and was not optimized.
#define JIT_CLEANUP_THRESHOLD 1000

#define TRACE_STACK_SIZE 5

int _Py_uop_analyze_and_optimize(_PyInterpreterFrame *frame,
int _Py_uop_analyze_and_optimize(
PyFunctionObject *func,
_PyUOpInstruction *trace, int trace_len, int curr_stackentries,
_PyBloomFilter *dependencies);

@@ -130,7 +123,7 @@ static inline uint16_t uop_get_error_target(const _PyUOpInstruction *inst)
#define TY_ARENA_SIZE (UOP_MAX_TRACE_LENGTH * 5)

// Need extras for root frame and for overflow frame (see TRACE_STACK_PUSH())
#define MAX_ABSTRACT_FRAME_DEPTH (TRACE_STACK_SIZE + 2)
#define MAX_ABSTRACT_FRAME_DEPTH (16)

// The maximum number of side exits that we can take before requiring forward
// progress (and inserting a new ENTER_EXECUTOR instruction). In practice, this
@@ -258,6 +251,7 @@ struct _Py_UOpsAbstractFrame {
int stack_len;
int locals_len;
PyFunctionObject *func;
PyCodeObject *code;

JitOptRef *stack_pointer;
JitOptRef *stack;
@@ -333,11 +327,11 @@ extern _Py_UOpsAbstractFrame *_Py_uop_frame_new(
int curr_stackentries,
JitOptRef *args,
int arg_len);
extern int _Py_uop_frame_pop(JitOptContext *ctx);
extern int _Py_uop_frame_pop(JitOptContext *ctx, PyCodeObject *co, int curr_stackentries);

PyAPI_FUNC(PyObject *) _Py_uop_symbols_test(PyObject *self, PyObject *ignored);

PyAPI_FUNC(int) _PyOptimizer_Optimize(_PyInterpreterFrame *frame, _Py_CODEUNIT *start, _PyExecutorObject **exec_ptr, int chain_depth);
PyAPI_FUNC(int) _PyOptimizer_Optimize(_PyInterpreterFrame *frame, PyThreadState *tstate);

static inline _PyExecutorObject *_PyExecutor_FromExit(_PyExitData *exit)
{
@@ -346,6 +340,7 @@ static inline _PyExecutorObject *_PyExecutor_FromExit(_PyExitData *exit)
}

extern _PyExecutorObject *_PyExecutor_GetColdExecutor(void);
extern _PyExecutorObject *_PyExecutor_GetColdDynamicExecutor(void);

PyAPI_FUNC(void) _PyExecutor_ClearExit(_PyExitData *exit);

@@ -354,7 +349,9 @@ static inline int is_terminator(const _PyUOpInstruction *uop)
int opcode = uop->opcode;
return (
opcode == _EXIT_TRACE ||
opcode == _JUMP_TO_TOP
opcode == _DEOPT ||
opcode == _JUMP_TO_TOP ||
opcode == _DYNAMIC_EXIT
);
}

@@ -365,6 +362,18 @@ PyAPI_FUNC(int) _PyDumpExecutors(FILE *out);
extern void _Py_ClearExecutorDeletionList(PyInterpreterState *interp);
#endif

int _PyJit_translate_single_bytecode_to_trace(PyThreadState *tstate, _PyInterpreterFrame *frame, _Py_CODEUNIT *next_instr, bool stop_tracing);

int
_PyJit_TryInitializeTracing(PyThreadState *tstate, _PyInterpreterFrame *frame,
_Py_CODEUNIT *curr_instr, _Py_CODEUNIT *start_instr,
_Py_CODEUNIT *close_loop_instr, int curr_stackdepth, int chain_depth, _PyExitData *exit,
int oparg);

void _PyJit_FinalizeTracing(PyThreadState *tstate);

void _PyJit_Tracer_InvalidateDependency(PyThreadState *old_tstate, void *obj);

#ifdef __cplusplus
}
#endif
39 changes: 37 additions & 2 deletions Include/internal/pycore_tstate.h
@@ -12,7 +12,8 @@ extern "C" {
#include "pycore_freelist_state.h" // struct _Py_freelists
#include "pycore_mimalloc.h" // struct _mimalloc_thread_state
#include "pycore_qsbr.h" // struct qsbr

#include "pycore_uop.h" // struct _PyUOpInstruction
#include "pycore_structs.h"

#ifdef Py_GIL_DISABLED
struct _gc_thread_state {
@@ -21,6 +22,38 @@ struct _gc_thread_state {
};
#endif

#if _Py_TIER2
typedef struct _PyJitTracerInitialState {
int stack_depth;
int chain_depth;
struct _PyExitData *exit;
PyCodeObject *code; // Strong
PyFunctionObject *func; // Strong
_Py_CODEUNIT *start_instr;
_Py_CODEUNIT *close_loop_instr;
_Py_CODEUNIT *jump_backward_instr;
} _PyJitTracerInitialState;

typedef struct _PyJitTracerPreviousState {
bool dependencies_still_valid;
bool instr_is_super;
int code_max_size;
int code_curr_size;
int instr_oparg;
int instr_stacklevel;
_Py_CODEUNIT *instr;
PyCodeObject *instr_code; // Strong
struct _PyInterpreterFrame *instr_frame;
_PyBloomFilter dependencies;
} _PyJitTracerPreviousState;

typedef struct _PyJitTracerState {
_PyUOpInstruction *code_buffer;
_PyJitTracerInitialState initial_state;
_PyJitTracerPreviousState prev_state;
} _PyJitTracerState;
#endif

// Every PyThreadState is actually allocated as a _PyThreadStateImpl. The
// PyThreadState fields are exposed as part of the C API, although most fields
// are intended to be private. The _PyThreadStateImpl fields not exposed.
@@ -75,7 +108,9 @@ typedef struct _PyThreadStateImpl {
#if defined(Py_REF_DEBUG) && defined(Py_GIL_DISABLED)
Py_ssize_t reftotal; // this thread's total refcount operations
#endif

#if _Py_TIER2
_PyJitTracerState jit_tracer_state;
#endif
} _PyThreadStateImpl;

#ifdef __cplusplus
12 changes: 10 additions & 2 deletions Include/internal/pycore_uop.h
@@ -35,10 +35,18 @@ typedef struct _PyUOpInstruction{
#endif
} _PyUOpInstruction;

// This is the length of the trace we project initially.
#define UOP_MAX_TRACE_LENGTH 1200
// This is the length of the trace we translate initially.
#define UOP_MAX_TRACE_LENGTH 3000
Member

Based on what you said in the PR comment about seeing a lot of "trace too long" aborts even with the higher limit, I'm wondering if you've benchmarked even higher limits like 5k or 10k?

Member Author

Oops, the "trace too long" aborts are with the old limit (1200). I have not benchmarked the stats for the newer limit. We should definitely gather stats for this in the future and fine-tune it, though.

#define UOP_BUFFER_SIZE (UOP_MAX_TRACE_LENGTH * sizeof(_PyUOpInstruction))

/* Bloom filter with m = 256
* https://en.wikipedia.org/wiki/Bloom_filter */
#define _Py_BLOOM_FILTER_WORDS 8

typedef struct {
uint32_t bits[_Py_BLOOM_FILTER_WORDS];
} _PyBloomFilter;
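
For readers unfamiliar with the structure being moved here, the following is a minimal sketch of how a 256-bit Bloom filter of eight 32-bit words can be used. It keeps the same storage layout as `_PyBloomFilter`, but the hash function (a splitmix64-style mixer), the number of probes (`BLOOM_K`), and all function names are assumptions made for illustration; CPython's own hashing and probe count may differ.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLOOM_WORDS 8                  /* 8 * 32 = 256 bits, as in the diff */
#define BLOOM_BITS (BLOOM_WORDS * 32)
#define BLOOM_K 4                      /* assumed number of probes per key */

typedef struct {
    uint32_t bits[BLOOM_WORDS];
} bloom_filter;

/* Hypothetical 64-bit mixer (splitmix64-style finalizer). */
static uint64_t
mix64(uint64_t x)
{
    x ^= x >> 30; x *= 0xbf58476d1ce4e5b9ULL;
    x ^= x >> 27; x *= 0x94d049bb133111ebULL;
    x ^= x >> 31;
    return x;
}

static void
bloom_init(bloom_filter *bf)
{
    memset(bf, 0, sizeof(*bf));
}

/* Set BLOOM_K bits chosen from successive bytes of the mixed key. */
static void
bloom_add(bloom_filter *bf, uint64_t key)
{
    uint64_t h = mix64(key);
    for (int i = 0; i < BLOOM_K; i++) {
        uint32_t bit = (uint32_t)(h & (BLOOM_BITS - 1));
        bf->bits[bit >> 5] |= (uint32_t)1 << (bit & 31);
        h >>= 8;
    }
}

/* Return 0 if `key` was definitely never added, 1 if it may have been. */
static int
bloom_may_contain(const bloom_filter *bf, uint64_t key)
{
    uint64_t h = mix64(key);
    for (int i = 0; i < BLOOM_K; i++) {
        uint32_t bit = (uint32_t)(h & (BLOOM_BITS - 1));
        if (!(bf->bits[bit >> 5] & ((uint32_t)1 << (bit & 31)))) {
            return 0;
        }
        h >>= 8;
    }
    return 1;
}
```

The property that matters for dependency tracking is that such a filter has no false negatives: a "no" answer is exact, while a "maybe" answer can be a false positive and only costs an unnecessary invalidation.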

#ifdef __cplusplus
}
#endif