GH-139914: Handle stack growth direction on HPPA #140028
base: main
Conversation
Please do so. A regression fix needs to be announced so that users know that their bug has indeed been fixed.
Python/ceval.c
Outdated
uintptr_t here_addr = _Py_get_machine_stack_pointer();
_PyThreadStateImpl *_tstate = (_PyThreadStateImpl *)tstate;
#ifdef __hppa__
if (here_addr <= _tstate->c_stack_soft_limit - margin_count * _PyOS_STACK_MARGIN_BYTES) {
Can't we pre-compute the bound?
The immediate answer is: no, because margin_count is a function parameter. However... it is always set to 1, so it could be removed...
Sorry, I meant to avoid computing _tstate->c_stack_soft_limit - margin_count * _PyOS_STACK_MARGIN_BYTES twice, to avoid having a double #ifdef in this function.
_Py_InitializeRecursionLimits() can change the limits if c_stack_hard_limit == 0 (I assume that's the uninitialized state).
We can pre-compute margin_count * _PyOS_STACK_MARGIN_BYTES (used 4 times in the source), potentially saving a multiply.
> I assume that's the uninitialized state

That's also something I actually find weird: if we're in an uninitialized state, how come we can use the soft limit? Anyway, let's leave the code as is, as the change could end up less readable and introduce a subtle issue if _Py_InitializeRecursionLimits could change the soft limit value at that point.
Come to think of it, we only hit that second if block if we pass the first one. Given that we're reversing the direction of the comparison, on HPPA we won't get here if c_stack_soft_limit is 0.

@markshannon: Why do we call _Py_InitializeRecursionLimits here?
Because this is an API function, and the stack limits might not have been initialized. Maybe we are being overly conservative, but it is harmless to check.
Don't we risk never reaching the initialisation check on most platforms, if c_stack_soft_limit is 0? (And possibly on HPPA too, due to underflow of an unsigned int.)

It seems like this whole first if block should just be removed.
The general implementation looks sound. Rather than a mysterious #ifdef __hppa__, could you #define STACK_GROWS_DOWN in the config and use #if STACK_GROWS_DOWN, for clarity and to keep things tidy in case any other platform has a stack that grows up?

OOI, what is hppa? Is this PA-RISC, or some new variant?
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers, that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase …
Yes, it is. Still going as a Debian port. The newest hardware is 20 years old at this point, but the port is still alive. It'll probably keep going as long as GCC and Linux support it.
Thank you for upstreaming the patch! Do the tests pass on HPPA? I've sent a PR to your branch to expose it.

This PR will conflict with #139668, which will take some time to approve as it adds new API. If this PR is not urgent, I think it's best to wait for #139668, then rebase this one and ask you to check the result. Does that sound OK?
I had run the tests and seen enough failures that I assumed HPPA was already in a bad state regarding tests. I'm afraid we don't run them automatically in Debian, due to limited hardware capacity.
We have some skips on our end for HPPA, but it's in a reasonable state overall. I haven't rechecked with this patch, though. (I just mention this in case someone stumbles upon it in future and thinks Python is broken there, as it isn't.)
Misc/NEWS.d/next/Core_and_Builtins/2025-10-13-13-54-19.gh-issue-139914.M-y_3E.rst
Outdated
fi
AC_SUBST([MULTIARCH_CPPFLAGS])

# Guess C stack direction
Is ‘guess’ the right word here?
Yes. For an arbitrary platform we can't really be sure.
Adapted from a patch for Python 3.14 submitted to the Debian BTS by John David Anglin: https://bugs.debian.org/1105111#20

Co-authored-by: John David Anglin <[email protected]>
@stefanor please don't force-push, per guidance in the devguide etc.
I checked the test state more thoroughly. The 3.13 branch (08a2b2d) only has a single failure, but there are a lot in this branch. It looks like we have a GC bug to find: https://gist.github.com/stefanor/393e358310b4dd791f6cbd2397fdcd58
Sorry, I try to keep my changes as a readable series of commits. I forgot the local rules here :)
Bisection of the failures:

Much better:
While looking at python#140028 I found some unrelated test regressions in the 3.14 cycle. These seem to all come from python#130317. From what I can tell, that made Python more correct than it was before. According to [0], HP PA RISC uses 1 for SNaN and thus a 0 for QNaN. Update tests to expect this.

[0]: https://grouper.ieee.org/groups/1788/email/msg03272.html
While looking at python#140028 I found some test failures that are caused by new tests (from python#138122) running too slowly. This adds an arbitrary heuristic to double the sampling run time. We could do 10x instead? And/or move the heuristic into test_support. Thoughts?
While looking at python#140028 I found some test failures that are caused by new tests (from python#138122) running too slowly. This adds an arbitrary heuristic to 10x the sampling run time (to the default value of 10 seconds). Doubling the 1-second duration was sufficient for my HP PA tests, but Fedora reported one of the 2-second durations being too slow for a freethreaded build. This heuristic could move into test_support. Thoughts?
While looking at #140028, I found some unrelated test regressions in the 3.14 cycle. These seem to all come from #130317. From what I can tell, that made Python more correct than it was before. According to [0], HP PA RISC uses 1 for SNaN and thus a 0 for QNaN.

[0]: https://grouper.ieee.org/groups/1788/email/msg03272.html
While looking at pythonGH-140028, I found some unrelated test regressions in the 3.14 cycle. These seem to all come from pythonGH-130317. From what I can tell, that made Python more correct than it was before. According to [0], HP PA RISC uses 1 for SNaN and thus a 0 for QNaN.

[0]: https://grouper.ieee.org/groups/1788/email/msg03272.html

(cherry picked from commit 76fea5596c235a7853cda8df87c3998d506e950c)
Co-authored-by: Stefano Rivera <[email protected]>
gh-130317: Fix SNaN broken tests on HP PA RISC (GH-140452) (GH-140467)

While looking at GH-140028, I found some unrelated test regressions in the 3.14 cycle. These seem to all come from GH-130317. From what I can tell, that made Python more correct than it was before. According to [0], HP PA RISC uses 1 for SNaN and thus a 0 for QNaN.

[0]: https://grouper.ieee.org/groups/1788/email/msg03272.html

(cherry picked from commit 76fea55)
Co-authored-by: Stefano Rivera <[email protected]>
Adapted from a patch for Python 3.14 submitted to the Debian BTS by John David Anglin https://bugs.debian.org/1105111#20
The backport to the 3.14 branch is non-trivial, as some of the affected code was refactored in 7094f09, but the patch in the Debian BTS was targeted at 3.14 and can be referenced for the backport.