Description
Describe the bug
My application is running on a Raspberry Pi Pico. It's been operating successfully for about a year now.
Some of the time-critical portions of the code (signal demodulation) are flagged in the CMakeLists.txt
file for relocation to RAM, in order to avoid XIP overhead and the resulting performance loss:
zephyr_code_relocate(FILES src/demod.c LOCATION RAM NOKEEP)
zephyr_code_relocate(FILES src/biquad.c LOCATION RAM NOKEEP)
zephyr_code_relocate(FILES src/hdlc.c LOCATION RAM NOKEEP)
zephyr_code_relocate(FILES ../../drivers/dma/dma_rpi_pico.c LOCATION RAM NOKEEP)
I recently added some new code to these routines. After one quite
ordinary addition, the app stopped starting up at all: even main() was never
called. I experimented with alternatives, and found that the trigger didn't seem to be
the content of the code, only the amount of it.
When I attached a debugger to the Pi (a BlackMagic Debug probe) and fired up GDB, I
found that the application was stuck in a memmove() in the early stages of kernel startup.
Upon examination, GDB showed me what appears to be a memory clobber: a number of
instructions in memmove() and the routines that follow it read back as zeros
(which disassemble as "movs r0,r0"). With its code "gutted" in this way, memmove() hangs.
A GDB memory dump seems to show that almost everything from 0x10000000 onwards is
reading as zero... as if the flash has stopped responding.
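
For readers following the addresses in this report, a minimal sketch of an address classifier may help. It is not a diagnosis. The region boundaries below are assumptions taken from a reading of the RP2040 address map (XIP flash window at 0x10000000, XIP SSI registers at 0x18000000, 264 KB of SRAM at 0x20000000, APB peripherals at 0x40000000) and should be checked against the datasheet. The names `REGIONS` and `classify` are hypothetical.

```python
from typing import List, Tuple

# (start, end_exclusive, label). Boundaries are assumptions based on the
# RP2040 address map; verify them against the datasheet before relying on them.
REGIONS: List[Tuple[int, int, str]] = [
    (0x10000000, 0x11000000, "XIP flash"),
    (0x18000000, 0x18001000, "XIP SSI"),
    (0x20000000, 0x20042000, "SRAM"),
    (0x40000000, 0x50000000, "APB peripherals"),
]


def classify(address: int) -> str:
    """Return the label of the region containing address, or "unknown"."""
    for start, end, label in REGIONS:
        if start <= address < end:
            return label
    return "unknown"


if __name__ == "__main__":
    # Values copied from the "info registers" output in this report.
    for name, value in [("r0", 0x100001A8), ("r1", 0x100001A8),
                        ("r6", 0x18000000), ("sp", 0x20012DE8)]:
        print(f"{name} = {value:#010x}: {classify(value)}")
```

Applied to the register dump further down, this places both r0 and r1 (0x100001a8) inside the XIP flash window and sp inside SRAM.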
If I reset the board into the bootloader, and use "picotool" to read back the entire contents
of the flash, it's a byte-for-byte exact match with the "zephyr.bin" created by the build...
so, as far as I can tell, memmove() and the other code are intact in flash. They are just not being
read properly during the early memmove() for some reason: apparently both XIP
instruction fetches and GDB memory-dump accesses get zeros for all of these
locations.
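
The readback comparison described above can be scripted. The following is a minimal sketch, assuming the flash contents have already been saved to a file and that offset 0 of both files refers to the same location. The function names and the file paths in the usage note are hypothetical.

```python
from pathlib import Path
from typing import Optional


def first_mismatch(image: bytes, readback: bytes) -> Optional[int]:
    """Return the offset of the first difference, or None if they match.

    Only the first len(image) bytes of readback are compared, because a
    dump of the whole flash is normally longer than the built image.
    """
    for offset, (expected, actual) in enumerate(zip(image, readback)):
        if expected != actual:
            return offset
    if len(readback) < len(image):
        # The readback ends before the image does.
        return len(readback)
    return None


def compare_files(image_path: str, readback_path: str) -> Optional[int]:
    """Compare two files on disk using first_mismatch()."""
    image = Path(image_path).read_bytes()
    readback = Path(readback_path).read_bytes()
    return first_mismatch(image, readback)
```

Calling `compare_files("build/zephyr/zephyr.bin", "readback.bin")` returns `None` when the dump matches the image byte for byte, which is the result reported here.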
I've seen this behavior on two different boards (both with vanilla Raspberry Pi Pico modules in
good working condition). I've seen it after two different "download to flash" techniques
(copying the .UF2 file to the "flash disk", or using picotool to talk directly to the bootloader).
Oddly, if I use GDB to run the image (downloading it directly) it reliably starts up OK, and
the code is readable out of flash without error.
Expected behavior
Code is successfully relocated from flash to RAM, and executes properly.
Impact
Showstopper for continuing to add code which must run from RAM for performance reasons.
Possible workaround is to restructure my code to minimize the number of routines
marked as "must be relocated to RAM".
Logs and console output
$ arm-none-eabi-gdb build/zephyr/zephyr.elf
GNU gdb (Debian 13.1-3) 13.1
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from build/zephyr/zephyr.elf...
Available Targets:
No. Att Driver
1 Raspberry RP2040 M0+
2 Raspberry RP2040 M0+
3 Raspberry RP2040 Rescue (Attach to reset!)
0x10016216 in memmove ()
(gdb) bt
#0 0x10016216 in memmove ()
#1 0x10015a0c in z_early_memcpy (dst=<optimized out>, src=<optimized out>,
n=<optimized out>) at /home/dplatt/zephyrproject/zephyr/kernel/init.c:210
#2 0x10012124 in data_copy_xip_relocation ()
at /home/dplatt/zephyrproject/zephyr/apps/tnc/build/zephyr/code_relocation.c:23
#3 0x10011c02 in z_data_copy ()
at /home/dplatt/zephyrproject/zephyr/kernel/xip.c:49
#4 0x1000e8fc in z_prep_c ()
at /home/dplatt/zephyrproject/zephyr/arch/arm/core/cortex_m/prep_c.c:195
#5 0x1000e750 in z_arm_reset ()
at /home/dplatt/zephyrproject/zephyr/arch/arm/core/cortex_m/reset.S:169
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) disassemble
Dump of assembler code for function memmove:
0x1001620e <+0>: push {r4, lr}
0x10016210 <+2>: cmp r0, r1
0x10016212 <+4>: bhi.n 0x10016222 <memmove+20>
0x10016214 <+6>: movs r3, #0
=> 0x10016216 <+8>: cmp r2, r3
0x10016218 <+10>: beq.n 0x1001622c <memmove+30>
0x1001621a <+12>: ldrb r4, [r1, r3]
0x1001621c <+14>: strb r4, [r0, r3]
0x1001621e <+16>: adds r3, #1
0x10016220 <+18>: b.n 0x10016216 <memmove+8>
0x10016222 <+20>: adds r3, r1, r2
0x10016224 <+22>: movs r0, r0
0x10016226 <+24>: movs r0, r0
0x10016228 <+26>: subs r2, #1
0x1001622a <+28>: bcs.n 0x1001622e <memmove+32>
0x1001622c <+30>: movs r0, r0
0x1001622e <+32>: movs r0, r0
0x10016230 <+34>: movs r0, r0
0x10016232 <+36>: movs r0, r0
End of assembler dump.
(gdb) info registers
r0 0x100001a8 268435880
r1 0x100001a8 268435880
r2 0xa0 160
r3 0x23 35
r4 0x3a 58
r5 0x20041f01 537140993
r6 0x18000000 402653184
r7 0x0 0
r8 0xffffffff -1
r9 0xffffffff -1
r10 0xffffffff -1
r11 0xffffffff -1
r12 0x4001801c 1073840156
sp 0x20012de8 0x20012de8 <z_interrupt_stacks+2008>
lr 0x10015a0d 0x10015a0d <z_early_memcpy+6>
pc 0x10016216 0x10016216 <memmove+8>
xpsr 0x1000000 16777216
msp 0x20013f10 0x20013f10 <sys_work_q_stack>
psp 0x20012de8 0x20012de8 <z_interrupt_stacks+2008>
primask 0x1 1 '\001'
basepri 0x0 0 '\000'
faultmask 0x0 0 '\000'
control 0x2 2 '\002'
(gdb) x/64xh memmove
0x1001620e <memmove>: 0xb510 0x4288 0xd806 0x2300 0x429a 0xd008 0x5ccc 0x54c4
0x1001621e <memmove+16>: 0x3301 0xe7f9 0x188b 0x0000 0x0000 0x3a01 0xd200 0x0000
0x1001622e <memmove+32>: 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000
0x1001623e <memset+10>: 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000
0x1001624e <__file_str_put+10>: 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000
0x1001625e <putc+10>: 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000
0x1001626e <putc+26>: 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000
0x1001627e <__math_uflowf+2>: 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000
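
To locate the zeroed halfwords in a dump like the one above without reading it by eye, a small parser can help. This is a minimal sketch that assumes the line layout shown in this log (address, optional symbol in angle brackets, colon, then 0x-prefixed halfwords). The names `parse_halfwords` and `zero_addresses` are hypothetical.

```python
import re
from typing import Dict, List

# One line of "x/<n>xh" output: address, optional "<symbol+off>", colon, values.
LINE_RE = re.compile(r"^\s*(0x[0-9a-fA-F]+)(?:\s+<[^>]*>)?:\s*(.*)$")

# Two lines copied from the dump in this report.
SAMPLE = """\
0x1001620e <memmove>: 0xb510 0x4288 0xd806 0x2300 0x429a 0xd008 0x5ccc 0x54c4
0x1001621e <memmove+16>: 0x3301 0xe7f9 0x188b 0x0000 0x0000 0x3a01 0xd200 0x0000
"""


def parse_halfwords(dump: str) -> Dict[int, int]:
    """Map each halfword address in the dump to its 16-bit value."""
    values: Dict[int, int] = {}
    for line in dump.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip prompts and other non-dump lines
        base = int(match.group(1), 16)
        for index, token in enumerate(match.group(2).split()):
            values[base + 2 * index] = int(token, 16)
    return values


def zero_addresses(dump: str) -> List[int]:
    """Return the sorted addresses whose halfword reads as zero."""
    return sorted(addr for addr, value in parse_halfwords(dump).items()
                  if value == 0)


if __name__ == "__main__":
    print([hex(addr) for addr in zero_addresses(SAMPLE)])
```

For the two sample lines, the zero halfwords fall at 0x10016224, 0x10016226 and 0x1001622c, which are the addresses shown as "movs r0, r0" in the disassembly.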
Environment (please complete the following information):
- OS: Debian Linux
- Toolchain: zephyr-sdk-0.16.8
- Commit SHA or Version used: commit 8469084 (HEAD -> v4.0.0, tag: v4.0.0, origin/v4.0-branch)