-
Notifications
You must be signed in to change notification settings - Fork 8.2k
libc: newlib: assert: Override handler for C std assert #26024
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
libc: newlib: assert: Override handler for C std assert #26024
Conversation
12be85c to
c3621d4
Compare
|
All checks are passing now. Tip: The bot edits this comment instead of posting a new one, so you can check the comment's history to see earlier messages. |
c3621d4 to
8a978a9
Compare
Libraries using standard C assert from <assert.h> did not give useful information on assertion fault, as default newlib implementation tries to print assert info to stderr. It causes another assertion in Zephyr. With this patch, the default handler is overridden to use native Zephyr assert macros and give useful info when assertion fails. Signed-off-by: Hubert Miś <[email protected]>
8a978a9 to
1898625
Compare
stephanosio
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Libraries using standard C assert from <assert.h> did not give useful
information on assertion fault, as default newlib implementation tries
to print assert info to stderr. It causes another assertion in Zephyr.
I am not seeing that with the Zephyr SDK.
Sample assert output with the default newlib implementation:
assertion "false" failed: file "../src/main.c", line 16, function: main
What toolchain and board are you using?
|
I'm using GCC in Ubuntu 18.04, compiling projects for nRF52840-DK. Perhaps something is missing in my project that makes |
I suppose you are referring to the GCC provided as part of the GNU ARM Embedded toolchain?
the |
If you are using the Ubuntu cc @tejlmand |
Yes, that's correct assumption. I've installed it from https://developer.arm.com/
Not intentionally. It is possible that one of the dependencies of my project did it. I thought that missing assert info is a common problem for newlib users in Zephyr. Apparently, it's not. I'll spend some more time next week to verify what causes problem in my project and provide more info then. |
|
@stephanosio Or is it possible that calling |
That sounds probable, especially if the UART type you are using is either interrupt-driven or async. |
|
What would be the correct way to address this problem? |
@hubertmis, You definitely should not expect POSIX, or POSIX-like functions to work from IRQ context. POSIX (and ANSI C) simply don't define something like "IRQ context", and thus doesn't specify which C/POSIX functions work from it. If you work with (Zephyr) IRQ contexts, you should use Zephyr functions suitable for that. One obvious candidate is printk(). (But mind that its behavior may depend on Zephyr config options, see e.g. #10715 (that particular PR was closed, but I'm not sure if that or similar change may still get thru)). |
|
@pfalcon , Thanks for clarification. In such circumstances it seems to me that the patch that I proposed in this PR is a valid solution to get correct assert info from both thread and IRQ contexts. |
|
@hubertmis: Thanks for looking into this issue. It so happens that recently we indeed find that we need to override some Newlib aspects, to make them more suitable for Zephyr applications, e.g. #26135 . This change is in the same vein, so I imagine it might be needed. But I share concern of @stephanosio that we should be sure that in this particular case this change is needed, and would be an improvement. I'd suggest that such a change should include a sample, Second concern is the actual implementation. This patch predicates it on CONFIG_ASSERT, but what's the default setting for that option? (Please check.) When I write my ANSI C application, I want it behave in predictable ways, based on de-jure or de-facto standard. I don't want it do depend (disclaimer: whenever possible) on the obscure config options of the underlying vendor RTOS. What I expect of assert() is: a) it's enabled by default; b) unless I built my project with -DNDEBUG. The implementation you provide here doesn't agree with those expectations. Also added @cfriedt to reviewers, who seem to have even stronger opinion of how C stdlib should work in Zephyr. |
@hubertmis, Thanks for the response, and I'm not sure about the "IRQ contexts" part. In my view, assert() and "IRQ contexts" are orthogonal notions, not mixing together. In other words, you should not expect that if you call assert() in an IRQ context, you'll get a particular behavior (effectively, it's undefined behavior). Of course, we'd like to make it work in a reasonable way in this case - whenever possible, and not at the expense of the normal usage of assert() (and normal usage includes being enabled by default, outputting to stderr, which may be redirected (we don't readily support that (redirection) in Zephyr as of now, but likely will, as we grow more POSIX functionality)). |
|
I don't think I understand the full picture here. I've seen in documentation following statement:
I understand that libraries and applications, libraries, or kernel modules in Zephyr are expected to use Have I misunderstood the purpose of |
The implementation I provided binds |
|
More info from my investigation. When I run a modified sample to cause assertion fault in ISR I get following message on UART console: with following backtrace: When I apply patch from this PR I get following (expected) output: When I force my program to run assertion fault from a thread I get following (expected) output without the patch: And similar one with the patch applied: |
|
@hubertmis According to the backtrace you provided, the fault is from the What happens is that
Lines 140 to 141 in e39bf29
This effectively makes |
|
One solution to this problem is to replace the following critical section implementation with a spinlock: zephyr/lib/libc/newlib/libc-hooks.c Lines 252 to 267 in 45979da
The impact of using spinlock here should be minimal since it only consists of tiny arithmetic operations. |
I don't think the C library code should be using kernel primitives like spinlocks at all. If you put spinlocks here it will break any user mode code.
Let's step back for a second. I don't think this code should be safe from an ISR. Why would we need this to work from an ISR? ISR is a very special execution environment, only a limited subset of functions should be available, with lots of things not possible...to include any code which is making any potentially blocking calls to get more heap memory! k_mem_pool() and k_heap() exist for managing heaps from kernel code. If ISR code is somehow (either directly or transitively) invoking C library malloc() that's simply wrong. |
@andrewboie Sure, for user-mode implementation, futex should be used.
The simplest and possibly the most sane solution is to declare |
Yes that's the only way to go. Under no circumstances should we expect the entire set of libc functions to work in interrupt context. You cannot do any synchronization without using spinlocks. |
|
Also, in addition to synchronization problems there are also security concerns. The kernel and drivers should never ever use the libc malloc heap for ANY reason. The heap could be deliberately corrupted by user mode to get the kernel to reveal sensitive information, and user threads would have access to all such allocations. We would at minimum need a separate brk() arena for the kernel itself, and we would still have synchronization problems even if that were the case. |
@andrewboie IIUC, @hubertmis is having this problem in his (possibly common; as in, can be called from both thread and ISR contexts) application code that is called by one or more callback functions that happen to be called from an ISR. In that sense, we should probably look into reworking how callbacks function in Zephyr (e.g. instead of directly calling the callback function inside an ISR, we should be reserving a worker thread and dispatching callbacks to the worker thread's work queue). We can even argue the current callback scheme in itself is a security risk. |
Agreed, there is much work to do in this area. With a proper kernel/application split, installing callbacks becomes deeply problematic. Some thoughts about this here: #25950 I don't have a good sense of a solution yet, but I'd like it to not have any footprint implications for code that isn't using any memory protection, basically the tiniest of micro-controllers. |
|
One possible idea to consider: if (and only if) synchronization is only done with spinlocks, and we don't use memory protection, then a larger subset of libc functionality might be workable in ISRs (as long as it doesn't sleep). It may be the case that we might macro some current usages of sys_sem (or similar usermode friendly locking primitives) in libc code with spinlocks if CONFIG_USERSPACE=n. That would make life easier for people who don't care about user mode considerations. Would need to be done very carefully. IIRC, a sys_sem and a k_sem are the same thing if CONFIG_USERSPACE=n. So for this particular issue we would need to use a spinlock instead. I am unsure how far down this rabbit hole we want to go, @andyross should be part of the conversation. The problem with spinlocks, of course, is that they impact system latency so we prefer other kinds of locking if feasible. |
That's not exactly true in my application. It looks that calls from the driver are moved from ISR to a thread: zephyr/drivers/ieee802154/ieee802154_nrf5.c Line 624 in 3f8bdf1
I think the problem here is that the driver itself uses We cannot rewrite the driver to use Zephyr macros, as this driver is OS independent. Would it be possible to declare |
|
I think that the example I was using for my test might be heapless. Even if we made it possible to use If I understand what you suggested, there are only two ways to solve:
I think that option 2 is easier to achieve. |
well, this turned out to be a hairier situation than I thought.
This looks to be the most sane solution indeed. After all, we allow C library functions like |
| /** | ||
| * | ||
| * @brief Handler of C standard library assert. | ||
| */ | ||
| void __assert_func(const char *file, | ||
| int line, | ||
| const char *func, | ||
| const char *failedexpr) | ||
| { | ||
| __ASSERT_LOC(...); | ||
| __ASSERT_MSG_INFO("Assertion fail: (%s) at %s:%d", | ||
| failedexpr, file, line); | ||
| __ASSERT_POST_ACTION(); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of having this function in a separate file, given that it is unlikely we will ever add more assert-related stuff here, can we move it into libc-hooks.c as with the rest of the newlib hook functions?
That only makes the situation worse IMHO. Not only assert() doesn't behave as expected re: outputting to stderr, etc., how it behaves now depends on (generally unrelated) config option, i.e. if a user flips that option, application behavior noticeably changes (for no apparent to a user reasons). |
@hubertmis, so you pinpointed the problem. Instead of I.e., allow users of this driver to override definition of NORDIC_NRF_ASSERT(). In case of Zephyr, you would I hope the situation is clear now, and I would be ready to give -1 to this patch. |
pfalcon
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on the discussion, the proposed patch is intended to work around an issue in a Nordic nRF BSP (or standalone driver). It's not a good approach to work around problems with out-of-tree drivers in the OS code in general, and this particular patch would affect other (legitimate) users of the ANSI C assert() function.
Given that @hubertmis is from Nordic, and Nordic as among Zephyr founding members, I'm sure a better solution can be found (one suggestion is in a comment above).
Otherwise, following @stephanosio, I tested how assert() works currently, and it works as expected for both ZEPHYR_TOOLCHAIN_VARIANT=zephyr and ZEPHYR_TOOLCHAIN_VARIANT=gnuarmemb.
So, let me give -1 on this, and call to look for a different solution.
Note that if circumstances are such that a proper solution will require some time, and an interim workaround is required (which is known to happen), I'd suggest:
- To make a separate, and clearly named config option for the workaround, e.g. CONFIG_WORKAROUND_NORDIC_BSP_ASSERT, to make clear the purpose and scope of this option.
- To keep it in a separate file, to easier (without extra git noise) remove once it's no longer needed.
Thanks.
|
@pfalcon, I understand your concern. We can update the 802.15.4 driver within reasonable timeframe and because of that I think no workaround is needed. The issue will be still visible if any other driver uses C std We'll prepare an updated 802.15.4 that does not call |
Thanks!
Well, then they will hopefully find this ticket by search, or will be referred to it by other developers. And we can see how common the problem is and whether it really makes sense to provide some workarounds on Zephyr side. |
Libraries using standard C assert from <assert.h> did not give useful
information on assertion fault, as default newlib implementation tries
to print assert info to stderr. It causes another assertion in Zephyr.
With this patch, the default handler is overridden to use native Zephyr
assert macros and give useful info when assertion fails.