_Swap on ARM is nonatomic by design, can we fix that? #12342

@andyross

Description

The way _Swap() is implemented on ARM is nonatomic, which, while not incorrect, has turned out to be very surprising. It's done with a PendSV exception whose priority sits below that of hardware interrupts, so it's possible for a process to decide to context switch based on atomic state that then gets changed by an interrupt handler before the context switch actually happens.
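To make the window concrete, here is a minimal host-side sketch of the sequence described above. It is a toy model, not Zephyr or ARM code: `request_swap`, `hw_isr`, `deferred_swap`, `simulate_race` and the `ready_thread` variable are all hypothetical names invented for illustration, and the "interrupt" is just a function call placed where a real ISR could preempt.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: every name below is hypothetical, not a Zephyr API. */
static int  ready_thread;        /* thread the scheduler would run next */
static bool swap_pended;         /* stands in for a pended PendSV */
static int  chosen_at_decision;  /* what the caller believed it was switching to */

/* The caller decides to switch, but only *requests* the switch. */
static void request_swap(void)
{
    chosen_at_decision = ready_thread;
    swap_pended = true;
}

/* A hardware interrupt that outranks the deferred swap handler. */
static void hw_isr(void)
{
    ready_thread = 2;  /* e.g. the ISR made a different thread runnable */
}

/* The deferred swap: runs only once higher-priority handlers are done. */
static int deferred_swap(void)
{
    assert(swap_pended);
    swap_pended = false;
    return ready_thread;  /* reads the state as it is *now* */
}

/* Returns the thread actually switched to. */
int simulate_race(void)
{
    ready_thread = 1;
    swap_pended = false;
    chosen_at_decision = -1;

    request_swap();          /* decision made while ready_thread == 1 */
    hw_isr();                /* interrupt lands inside the window */
    return deferred_swap();  /* switch happens against changed state */
}
```

In this model the decision is taken while `ready_thread` is 1, yet the switch lands on thread 2; any code that assumed the state seen at decision time still holds when the swap runs would be wrong in the same way.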

This trick has resulted in three moderately excruciating bug hunts so far (cf. commits 41070c3 and 6c95daf and the current work submitted in PR #12448). In all honesty I doubt that's going to be the last of them.

There's now a framework that will help keep the workarounds tidy (so far they haven't been complicated), which should help some.

But basically: how wedded are we to this architecture? How hard would it be, and what would it break, to set the PendSV exception to the maximum interrupt priority such that it can't be interrupted like this? That would match ARM's behavior to that of other systems (which do context switching with more typical musical-chairs register swaps and not in an exception handler), albeit at the cost of higher worst-case latencies for high-priority interrupts.
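The effect of the proposed priority change can be sketched with a small arbitration model. This is an illustration under stated assumptions, not the actual ARM port: `struct toy_cpu`, `run_pending` and `switch_target` are hypothetical names, and the only hardware convention borrowed is that a lower priority value is more urgent, as on Cortex-M. On real hardware the knob would be the PendSV priority field (for instance through CMSIS `NVIC_SetPriority`), which this sketch does not touch.

```c
#include <assert.h>

/* Toy exception model; all names are hypothetical, not Zephyr or CMSIS. */
enum { EXC_SWAP, EXC_IRQ, EXC_COUNT };

struct toy_cpu {
    int prio[EXC_COUNT];     /* lower value = more urgent */
    int pending[EXC_COUNT];  /* nonzero if the exception is pending */
    int ready_thread;        /* scheduler state the swap will read */
    int switched_to;         /* result of the swap, -1 until it runs */
};

/* Service pending exceptions, most urgent first, until none remain. */
static void run_pending(struct toy_cpu *cpu)
{
    for (;;) {
        int best = -1;
        for (int e = 0; e < EXC_COUNT; e++) {
            if (cpu->pending[e] &&
                (best < 0 || cpu->prio[e] < cpu->prio[best])) {
                best = e;
            }
        }
        if (best < 0) {
            return;
        }
        cpu->pending[best] = 0;
        if (best == EXC_IRQ) {
            cpu->ready_thread = 2;                 /* ISR changes the state */
        } else {
            cpu->switched_to = cpu->ready_thread;  /* swap reads the state */
        }
    }
}

/* Swap and IRQ become pending together while ready_thread == 1. */
int switch_target(int swap_prio, int irq_prio)
{
    struct toy_cpu cpu = {
        .prio = { swap_prio, irq_prio },
        .pending = { 1, 1 },
        .ready_thread = 1,
        .switched_to = -1,
    };
    assert(cpu.pending[EXC_SWAP] && cpu.pending[EXC_IRQ]);
    run_pending(&cpu);
    return cpu.switched_to;
}
```

With the swap at the least urgent level, `switch_target(255, 1)` lets the IRQ run first and the swap lands on thread 2; with the swap at the most urgent level, `switch_target(0, 1)` switches to thread 1 as decided, and the IRQ is serviced only afterwards, which is the latency cost mentioned above.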

IMHO it would make for better reliability on ARM in the long term, and certainly would make my life easier.

Labels: Enhancement (Changes/Updates/Additions to existing features), area: ARM (ARM (32-bit) Architecture), area: Kernel
