Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
22 changes: 3 additions & 19 deletions arch/arm/core/cortex_m/irq_manage.c
Original file line number Diff line number Diff line change
Expand Up @@ -132,33 +132,17 @@ void z_irq_spurious(const void *unused)
#ifdef CONFIG_PM
void _arch_isr_direct_pm(void)
{
#if defined(CONFIG_ARMV6_M_ARMV8_M_BASELINE)
unsigned int key;

/* irq_lock() does what we want for this CPU */
key = irq_lock();
#elif defined(CONFIG_ARMV7_M_ARMV8_M_MAINLINE)
/* Lock all interrupts. irq_lock() will on this CPU only disable those
* lower than BASEPRI, which is not what we want. See comments in
* arch/arm/core/cortex_m/isr_wrapper.c
*/
__asm__ volatile("cpsid i" : : : "memory");
#else
#error Unknown ARM architecture
#endif /* CONFIG_ARMV6_M_ARMV8_M_BASELINE */
/* Disable all interrupts except ZLIs. */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_arch_isr_direct_pm() could be called in ZLI and non-ZLI direct ISRs. A race will lead to pm_system_resume() not being called when it should be or corrupt the data accessed inside the function call, right?

On ARMv6, irq_lock() is nothing but disable all interrupts. On ARMv7, cpsid i needs to be used to disable all interrupts

Zephyr Controller calls ISR_DIRECT_PM() here:

Does this PR mean, no ZLI ISR call ISR_DIRECT_PM() ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, ZLIs must not call any zephyr APIs since it breaks synchronisation :) pm_system_resume() is a zephyr API

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be pedantic: Zephyr APIs are not guaranteed to work in a ZLI, and specifically as you point out anything that relies on synchronization is guaranteed not to work (or not to work reliably). But there's lots of good stuff in the Zephyr API that you can use as long as you know what you're doing. Like, Best Practice for communication with a ZLI almost certainly involves the Zephyr atomics API. We've just been too lazy/conservative to try to document where the boundary is.

Copy link
Contributor

@bjarki-andreasen bjarki-andreasen Jun 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, IPC style communication with software IRQs like nordics SWI and shared memory area I think is the only truly safe way of syncing between ZLI and the zephyr kernel :)

Copy link
Member Author

@carlescufi carlescufi Jun 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have never been able to call _arch_isr_direct_pm or pm_system_resume() safely from a ZLI. Starting with the fact that they read and modify kernel structures like _kernel.idle, this was never possible (at least safely). ZLIs can preempt the kernel any time so modifying kernel structures is a no-no. Remember: even though all interrupts were disabled prior to modifying kernel structs and calling pm_system_resume(), it was already too late! The ZLI could have preempted the kernel during an irq_lock() or sched_lock() execution, so you already had a race, regardless of global disabling of interrupts. Then if we go into pm_system_resume() it's even worse: tons of variables are being read and set there. In summary: it does not matter if you disable global interrupts or not, you can never call code from a ZLI that relies on irq_lock() or sched_lock() to prevent races.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nordic-krch agreed! Could you follow-up with #17301 so I can add the assert after?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be an assert to ensure that ZLI interrupt does not call things that should not be called in that context. For that we would need API for getting current priority. It was proposed long time ago but did not get in #17301

@nordic-krch Could you expand on that? asserts themselves are a zephyr API, so ZLIs are not even allowed to call those, how would you enforce this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if that is an issue we can add ZLI_ASSERT that will do while(1) or reset. The thing is that it will be fail immediately.

Copy link
Contributor

@bjarki-andreasen bjarki-andreasen Jun 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That does present an issue though, ZLIs are meant to be "above" the zephyr kernel, a place where a user can do "whatever they want", so such a macro would have to be application specific I would think right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, if it's just a loop you could simply disable interrupts and spin. This should work for all applications in principle? Then perhaps it should be overridable.

key = arch_irq_lock();

if (_kernel.idle) {
_kernel.idle = 0;
pm_system_resume();
}

#if defined(CONFIG_ARMV6_M_ARMV8_M_BASELINE)
irq_unlock(key);
#elif defined(CONFIG_ARMV7_M_ARMV8_M_MAINLINE)
__asm__ volatile("cpsie i" : : : "memory");
#else
#error Unknown ARM architecture
#endif /* CONFIG_ARMV6_M_ARMV8_M_BASELINE */
arch_irq_unlock(key);
}
#endif

Expand Down
20 changes: 10 additions & 10 deletions arch/arm/core/cortex_m/isr_wrapper.c
Original file line number Diff line number Diff line change
Expand Up @@ -38,21 +38,21 @@ void _isr_wrapper(void)

#ifdef CONFIG_PM
/*
* All interrupts are disabled when handling idle wakeup. For tickless
* idle, this ensures that the calculation and programming of the
* device for the next timer deadline is not interrupted. For
* All non-ZLI interrupts are disabled when handling idle wakeup. For
* tickless idle, this ensures that the calculation and programming of
* the device for the next timer deadline is not interrupted. For
* non-tickless idle, this ensures that the clearing of the kernel idle
* state is not interrupted. In each case, pm_system_resume
* is called with interrupts disabled.
* state is not interrupted. In each case, pm_system_resume is called
* with non-ZLI interrupts disabled.
*/

/*
* Disable interrupts to prevent nesting while exiting idle state. This
* is only necessary for the Cortex-M because it is the only ARM
* architecture variant that automatically enables interrupts when
* Disable non-ZLI interrupts to prevent nesting while exiting idle
* state. This is only necessary for the Cortex-M because it is the only
* ARM architecture variant that automatically enables interrupts when
* entering an ISR.
*/
__disable_irq();
unsigned int key = arch_irq_lock();

/* is this a wakeup from idle ? */
/* requested idle duration, in ticks */
Expand All @@ -62,7 +62,7 @@ void _isr_wrapper(void)
pm_system_resume();
}
/* re-enable interrupts */
__enable_irq();
arch_irq_unlock(key);
#endif /* CONFIG_PM */

#if defined(CONFIG_ARM_CUSTOM_INTERRUPT_CONTROLLER)
Expand Down