Skip to content

Conversation

@yonsch
Copy link
Contributor

@yonsch yonsch commented Dec 8, 2022

The RP2040's BOOTROM contains ABI functions for working with floating
point numbers and large integers. When using the implementations that
are already in the ROM, there is no need to link these to the user's
binary, thus reducing the binary size.
Moreover, these implementations take advantage of a hardware divider,
thus improving the performance of float division.

Proof of it working
Compiling the following app (with no optimizations):

void main(void)
{
	float a = 3.0f;
	float b = 4.0f;
	double c = 5.0f;
	double d = 6.0f;

	printk("a + b = %f\n", a + b);
	printk("c + d = %f\n", c + d);
}

Yields the following call:

f004 fc80       bl      10004dac <__wrap___aeabi_fadd>

Which is implemented as:

10004dac <__wrap___aeabi_fadd>:
10004dac:       2300            movs    r3, #0
10004dae:       469c            mov     ip, r3
10004db0:       4b0e            ldr     r3, [pc, #56]   ; (10004dec <__wrap___aeabi_fmul+0xa>)
10004db2:       681b            ldr     r3, [r3, #0]
10004db4:       4718            bx      r3

R3 is loaded with a pointer to a lookup table that points to the implementation in ROM. It then loads the address of __aeabi_fadd in ROM, and jumps.

Some pico-sdk drivers call a panic function, originally implemented
as part of the Pico's C runtime. This commit adds a Zephyr compatible
implementation of panic, so that those drivers could be compiled with
Zephyr.

Signed-off-by: Yonatan Schachter <[email protected]>
The RP2040's BOOTROM contains ABI functions for working with floating
point numbers and large integers. When using the implementations that
are already in the ROM, there is no need to link these to the user's
binary, thus reducing the binary size.
Moreover, these implementations take advantage of a hardware divider,
thus improving the performance of float division.

Signed-off-by: Yonatan Schachter <[email protected]>
@yonsch yonsch requested a review from nashif as a code owner December 8, 2022 13:45
@yonsch yonsch added the platform: Raspberry Pi Pico Raspberry Pi Pico (RPi Pico) label Dec 8, 2022
@yonsch yonsch requested a review from tejlmand December 8, 2022 13:48
@pdgendt pdgendt removed their request for review December 21, 2022 15:54
Copy link
Member

@soburi soburi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR mostly good.
But this PR changes the basic calculation implementations.
I think adding a unit test for float calculations would be better.

{
va_list args;

va_start(args, fmt);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Referencing to original PICO-SDK implementation,
It would be better to check null-pointer here.

return 0;
}

/*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part should be better to separate commit.

@cesarvandevelde
Copy link

As you mentioned, the bootrom float/double libraries rely on the hardware divider to speed up operations. However, the divider state must be backed up and restored. The FreeRTOS port implements this here. The Pico SDK appears to implement this behavior correctly, but should be verified. Was this tested in a high-preemption environment?

@yonsch
Copy link
Contributor Author

yonsch commented Dec 28, 2022

@cesarvandevelde Thanks for pointing it out, you're absolutely right. Unfortunately, this would require changing the PendSV handler, which is a bit complicated. I'll draft this for now, until a solution is found.

@cesarvandevelde
Copy link

More bad news: Zephyr's linker configuration interferes with float_aeabi.S in such a way that the application fails to link:

modules/hal_rpi_pico/libmodules__hal_rpi_pico.a(float_aeabi.S.obj): in function `__wrap___aeabi_ui2f':
/Users/cesar/git-repos/zephyrproject/modules/hal/rpi_pico/src/rp2_common/pico_float/float_aeabi.S:371:(.text.__wrap___aeabi_ui2f+0x4): relocation truncated to fit: R_ARM_THM_JUMP8 against `.text.__wrap___aeabi_i2f'
collect2: error: ld returned 1 exit status

The zephyr_pre0.map build artefact shows that the two functions are placed too far apart:

 .text.__wrap___aeabi_ui2f
                0x000000001000f5c4        0xa modules/hal_rpi_pico/libmodules__hal_rpi_pico.a(float_aeabi.S.obj)
                0x000000001000f5c4                __wrap___aeabi_ui2f

 .text.__wrap___aeabi_i2f
                0x000000001000864c       0x44 modules/hal_rpi_pico/libmodules__hal_rpi_pico.a(float_aeabi.S.obj)
                0x000000001000864c                __wrap___aeabi_i2f

Commenting the line below works:

pico_wrap_function(__aeabi_ui2f)

Of course, this is not a real solution. There are other instances of cross-referencing in float_aeabi.S. Ideally, the whole file should be placed within its own section, but my linker knowledge does not stretch that far.

In any case, the performance gains are noticable and worth pursuing. Once I got the branch to work, I tried it out on a 128 x 32 px Mandelbrot. Render time decreased from 14.537s to 9.782s (33% faster).

@cesarvandevelde
Copy link

@cesarvandevelde Thanks for pointing it out, you're absolutely right. Unfortunately, this would require changing the PendSV handler, which is a bit complicated. I'll draft this for now, until a solution is found.

I don't think this is necessary. The FreeRTOS port only backs up the divider upon context switch if PICO_DIVIDER_DISABLE_INTERRUPTS is set, which is not the default behavior.

This alternate behavior uses a critical section to guard against concurrent use. I don't quite understand why divider state must be stored in that case though. The critical section would also guard against RTOS task switching while the divider is in use...

https://github.com/FreeRTOS/FreeRTOS-Kernel/blob/6d65558ba01141d7c50ff6f93a4054cc5f16940e/portable/ThirdParty/GCC/RP2040/include/portmacro.h#L79-L84

The default behavior in the Pico SDK is to always save/restore divider state after each use:

https://github.com/raspberrypi/pico-sdk/blob/2e6142b15b8a75c1227dd3edbe839193b2bf9041/src/rp2_common/pico_divider/divider.S#L52-L93

https://github.com/raspberrypi/pico-sdk/blob/2e6142b15b8a75c1227dd3edbe839193b2bf9041/src/rp2_common/pico_float/float_aeabi.S#L150-L173

That should make it safe to use in a pre-emptive environment. However:

  1. It's probably best to put a compile-time assert around PICO_DIVIDER_DISABLE_INTERRUPTS.
  2. If there are concurrency issues, they would be sporadic and would be a nightmare to debug. For that reason, a test suite would be valuable.

@yonsch yonsch marked this pull request as draft January 3, 2023 09:38
@github-actions
Copy link

github-actions bot commented Mar 5, 2023

This pull request has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this pull request will automatically be closed in 14 days. Note, that you can always re-open a closed pull request at any time.

@github-actions github-actions bot added the Stale label Mar 5, 2023
@yonsch yonsch removed the Stale label Mar 5, 2023
@github-actions
Copy link

github-actions bot commented May 5, 2023

This pull request has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this pull request will automatically be closed in 14 days. Note, that you can always re-open a closed pull request at any time.

@github-actions github-actions bot added the Stale label May 5, 2023
@github-actions github-actions bot closed this May 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants