sys: time_units: Increase range of z_tmcvt #41602

nordic-krch · 2022-01-05T14:11:13Z

Avoid result overflow caused by intermediate product overflow.
The algorithm multiplied the input value by the target frequency
before dividing it by the source frequency. If the target frequency was
high (e.g. conversion to nanoseconds), the product could easily
overflow even though the final result would not. Adjust the
algorithm to avoid that.
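
To make the failure mode concrete, here is a minimal sketch of the two orderings. It ignores the rounding offset and the 32-bit code paths, and the names `cvt_mul_first` and `cvt_split` are hypothetical, not the ones used in time_units.h. It assumes `from_hz` is non-zero.

```c
#include <stdint.h>

/* Multiply first: t * to_hz wraps modulo 2^64 once t > UINT64_MAX / to_hz,
 * even when the final quotient would fit in 64 bits.
 */
static inline uint64_t cvt_mul_first(uint64_t t, uint32_t from_hz, uint32_t to_hz)
{
	return (t * to_hz) / from_hz;
}

/* Split t into whole seconds and a sub-second remainder. The remainder is
 * smaller than from_hz (a 32-bit value), so remainder * to_hz always fits
 * in 64 bits. Only the final result can overflow.
 */
static inline uint64_t cvt_split(uint64_t t, uint32_t from_hz, uint32_t to_hz)
{
	uint64_t whole = (t / from_hz) * to_hz;
	uint64_t frac = ((t % from_hz) * to_hz) / from_hz;

	return whole + frac;
}
```

For example, converting 10^10 seconds worth of 32768 Hz ticks to nanoseconds gives a result of 10^19, which fits in 64 bits, but the intermediate product in `cvt_mul_first` is about 3.3 * 10^23 and wraps.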

Note that this code is typically resolved at compile time, so
it will not impact performance as long as it can be resolved.

Fixes #41111.

Signed-off-by: Krzysztof Chruscinski [email protected]

andyross

Seems not unreasonable. Though regarding the linked bug... honestly doing math in terms of system uptime measured in (!) gigahertz is just going to lead to tears one way or another, whether or not the API does it optimally. Dance too close to a precision limit and you're going to be burned. The intended application of these utilities was for constructing delays to feed to kernel timeouts.

FWIW: the selection of the algorithm happens at compile time, but the operations still have to happen at runtime as "t" is input. So this is adding three operations[1] and effectively cutting performance in half in practice as I read the assembly output.

[1] Though libgcc has a divmod routine (and the x86 DIV instruction returns the quotient and remainder together) so in practice the worst penalty comes "for free" along with the already-existing quotient.

nordic-krch · 2022-01-11T06:49:36Z

@andyross what about limiting this approach to cases where to_hz exceeds an arbitrary threshold, BIT_MASK(x) (e.g. x = 20 bits, so 1 MHz fits in)? That would reduce computation for typical conversions in the us/ms range, where overflow is unlikely.
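
A minimal sketch of that selection, assuming a 20-bit threshold. The macro and function names are hypothetical; `TMCVT_FAST_LIMIT` has the same value as Zephyr's `BIT_MASK(20)`, spelled out here so the snippet builds on its own.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical threshold width: 20 bits, so that 1 MHz still fits. */
#define TMCVT_FAST_BITS 20U
/* Same value as BIT_MASK(20), i.e. 0xFFFFF. */
#define TMCVT_FAST_LIMIT ((UINT32_C(1) << TMCVT_FAST_BITS) - 1U)

/* True when the target frequency is low enough that the plain
 * multiply-then-divide form is considered acceptable.
 */
static inline bool tmcvt_use_fast_path(uint32_t to_hz)
{
	return to_hz <= TMCVT_FAST_LIMIT;
}
```

Because `to_hz` is normally a compile-time constant, a check of this shape can be folded away by the compiler, so only one of the two algorithms ends up in the generated code.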

andyross · 2022-01-11T20:02:31Z

I think that's reasonable too, you could even (with a little care) express that mask in units of real time via a kconfig, so you'd get a guarantee like "we use the fast form as long as conversions smaller than CONFIG_MAX_TIMEOUT_DAYS are representable without precision loss".

Though it's worth remembering that anyone who selected CONFIG_TIMEOUT_64BIT is already willing to pay some overhead. People who really want to microoptimize timeouts and are willing to deal with precision and rollover at the app level already have a much-faster-still 32 bit representation. With care, you can even use that with GHz values (though not values representing system uptime, heh).

nordic-krch · 2022-01-12T09:42:24Z

Added a Kconfig option that sets the target frequency threshold. I think that using 20 bits ensures that milli/microsecond conversions will use the fast algorithm, and it's unlikely that anyone will need to touch it (unless someone converts 48h represented in nanoseconds).

I would like to hide this somewhere but haven't found any suitable kernel section in menuconfig. Looking at menuconfig, it might be worth creating a "System clock options" menu and moving all the clock-related options from the main menu into it.

andyross · 2022-01-12T16:57:41Z

Still looks great to me (though I still think doing this as a "maximum time that can be up-converted to the fastest defined clock" instead of a bitmask would be clever).

nordic-krch · 2022-01-13T08:35:48Z

@andyross

I still think doing this as a "maximum time that can be up-converted to the fastest defined clock" instead of a bitmask would be clever

But that would compare against the value t and not to_hz. In that case both algorithms would be compiled in when t is not constant. I don't fully understand what the "fastest defined clock" is here.

andyross · 2022-01-13T18:29:04Z

You hit the overflow in the expression from "t * to_hz", the to_hz value is one of the list of specified clocks in the system. So it only overflows if t > 2^64/to_hz. And t is a unit of time in some other clock.

So if your highest clock (cycles) is e.g. 24 MHz, and your lowest clock (ticks) is 10 kHz, then the largest delay reliably convertible is 2^64/CONFIG_SYS_CLOCK_HW_CYCLES_PER_SEC/CONFIG_SYS_CLOCK_TICKS_PER_SEC ~= 77 Msec, or 889.6 days.

So I'm proposing that we have a CONFIG_SYS_CLOCK_MAX_TIMEOUT_DAYS value representing the maximum timeout value the application wants to compute without overflow. If (on this particular example system) this is <= 889, then on that system it's permissible to use the fast version of the algorithm.

The math isn't trivial, but it's all computable at compile time. And the advantage is that the app gets a really obvious tunable to control this instead of a somewhat opaque bitmask.
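
The arithmetic in this proposal can be sketched as follows. This is an illustration only, not the code that was merged: `MAX_TIMEOUT_DAYS` stands in for the proposed Kconfig value, the function names are hypothetical, and both frequencies are assumed to be non-zero.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the proposed CONFIG_SYS_CLOCK_MAX_TIMEOUT_DAYS. */
#define MAX_TIMEOUT_DAYS 365ULL
#define SECONDS_PER_DAY (24ULL * 3600ULL)

/* A delay of D seconds is D * from_hz input units, and the fast form
 * multiplies that by to_hz, so D * from_hz * to_hz must stay below 2^64.
 */
static inline uint64_t max_safe_seconds(uint32_t from_hz, uint32_t to_hz)
{
	return UINT64_MAX / from_hz / to_hz;
}

/* The fast form is acceptable if every delay up to the configured
 * maximum converts without intermediate overflow.
 */
static inline bool fast_path_ok(uint32_t from_hz, uint32_t to_hz)
{
	return max_safe_seconds(from_hz, to_hz) >= MAX_TIMEOUT_DAYS * SECONDS_PER_DAY;
}
```

With the 24 MHz and 10 kHz clocks from the example, `max_safe_seconds()` returns 76861433, which is 889 whole days, so a 365-day limit selects the fast form. Converting 32768 Hz ticks to nanoseconds leaves only about 6.5 days of headroom, so the same limit selects the slower form.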

But again I don't see that this should block merge. We can always come back and optimize later.

andyross · 2022-01-13T18:42:39Z

and your lowest clock (ticks) is

That was poorly phrased. Not the lowest clock in the time_units.h header (which would probably be ms, not ticks), but the lower of the two clocks in the specific conversion being generated. You apply that math in every conversion, so if you want to do "ms to ticks" you get the fast version every time and only pay the cost when you're trying to convert GHz cycles.

nordic-krch · 2022-01-14T08:36:25Z

OK, I think I got that. Applied. Can you check? I used 365 days as the default, but if #41814 gets in I guess that it can be reduced (to 30 days?).

andyross · 2022-01-15T05:05:43Z

include/sys/time_units.h

Think you want a ULL suffix on those constants, otherwise the (kconfig * 24 * 3600) is computed in 32 bit precision and can overflow (for somewhat pathological, but still permissible, values of the tunable) before being widened by the multiplication by the (64 bit) from_hz.
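
A small sketch of the difference, under stated assumptions: the function names are hypothetical, and the narrow variant uses unsigned 32-bit arithmetic so that the wrap is well defined in this illustration (with a plain `int` Kconfig constant the overflow would be undefined behaviour instead).

```c
#include <stdint.h>

/* Seconds are computed in 32 bits and only widened afterwards, so the
 * value wraps once days exceeds 49710 (2^32 / 86400).
 */
static inline uint64_t max_units_narrow(uint32_t days, uint32_t from_hz)
{
	uint32_t seconds = days * 24U * 3600U;

	return (uint64_t)seconds * from_hz;
}

/* The ULL suffixes promote the whole expression to 64 bits before any
 * multiplication takes place.
 */
static inline uint64_t max_units_wide(uint32_t days, uint32_t from_hz)
{
	return days * 24ULL * 3600ULL * from_hz;
}
```

For 49711 days and a 1 Hz source clock, the wide form yields 4295030400 while the narrow form wraps to 63104.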

Add maximum timeout used for conversion to Kconfig. Option is used to determine which conversion algorithm to use: faster but overflowing earlier or slower without early overflow. Signed-off-by: Krzysztof Chruscinski <[email protected]>

andyross · 2022-01-18T01:02:24Z

Reup the +1 for the third time. Honestly I thought it was fine as originally submitted, but I like this version much better.

cfriedt · 2022-12-07T13:32:35Z

Next time, let's get a unit test as well.

Prior to zephyrproject-rtos#41602, due to the ordering of operations (first mul, then div), an intermediate value would overflow, resulting in a time non-linearity. This test ensures that time rolls-over properly. Signed-off-by: Chris Friedt <[email protected]>
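
The kind of check described here can be sketched as below. This is not the test that was added to the tree: all names are hypothetical, the rounding offset is ignored, and `to_hz` is assumed to be greater than 1 so that the boundary value plus one does not itself wrap.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t (*cvt_fn)(uint64_t t, uint32_t from_hz, uint32_t to_hz);

/* Ordering used before the fix: the product can wrap modulo 2^64. */
static uint64_t cvt_mul_then_div(uint64_t t, uint32_t from_hz, uint32_t to_hz)
{
	return (t * to_hz) / from_hz;
}

/* Overflow-avoiding ordering: whole seconds and the sub-second
 * remainder are converted separately.
 */
static uint64_t cvt_div_then_mul(uint64_t t, uint32_t from_hz, uint32_t to_hz)
{
	return (t / from_hz) * to_hz + ((t % from_hz) * to_hz) / from_hz;
}

/* True if the converted value does not go backwards when stepping just
 * past the largest t for which t * to_hz still fits in 64 bits.
 */
static bool monotonic_at_boundary(cvt_fn cvt, uint32_t from_hz, uint32_t to_hz)
{
	uint64_t last_safe = UINT64_MAX / to_hz;

	return cvt(last_safe + 1U, from_hz, to_hz) >= cvt(last_safe, from_hz, to_hz);
}
```

For a 32768 Hz to nanoseconds conversion, the multiply-first ordering fails this check at t = 18446744074, while the split ordering passes it.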

Prior to #41602, due to the ordering of operations (first mul, then div), an intermediate value would overflow, resulting in a time non-linearity. This test ensures that time rolls-over properly. Signed-off-by: Chris Friedt <[email protected]> (cherry picked from commit 74c9c0e)

nordic-krch requested review from MaureenHelm, carlescufi, galak and nashif as code owners January 5, 2022 14:11

github-actions bot added the area: API Changes to public APIs label Jan 5, 2022

nordic-krch force-pushed the fix_z_tmcvt branch from 3e382e3 to 946709b Compare January 5, 2022 14:18

zephyrbot added the area: Base OS Base OS Library (lib/os) label Jan 5, 2022

zephyrbot requested review from andyross and dcpleung January 5, 2022 14:22

zephyrbot assigned andyross Jan 5, 2022

andyross approved these changes Jan 11, 2022

github-actions bot added the area: Kernel label Jan 12, 2022

nordic-krch force-pushed the fix_z_tmcvt branch 2 times, most recently from b643210 to 78b59d6 Compare January 12, 2022 11:01

nordic-krch force-pushed the fix_z_tmcvt branch from 78b59d6 to 8f4c696 Compare January 13, 2022 08:53

github-actions bot added the area: Test Framework Issues related not to a particular test, but to the framework instead label Jan 13, 2022

nordic-krch force-pushed the fix_z_tmcvt branch from 8f4c696 to 5eb522f Compare January 14, 2022 07:46

nordic-krch mentioned this pull request Jan 14, 2022

lib: posix: clock: Prevent early overflows #41814

Merged

andyross reviewed Jan 15, 2022

nordic-krch added 2 commits January 17, 2022 18:56

nordic-krch force-pushed the fix_z_tmcvt branch from 5eb522f to 3fc4dd3 Compare January 17, 2022 17:56

dcpleung approved these changes Jan 18, 2022

nashif merged commit 50c7c7b into zephyrproject-rtos:main Jan 18, 2022

AndreyDodonov-EH added a commit to endresshauser-lp/sdk-zephyr that referenced this pull request Nov 22, 2022

Fix overflow when converting to nanoseconds. Part of the changes from z…

5cd6f42

…ephyrproject-rtos/zephyr#41602

AndreyDodonov-EH mentioned this pull request Nov 22, 2022

Hotfix/897 wrong timestamps endresshauser-lp/sdk-zephyr#10

Merged

cfriedt added the backport v2.7-branch label Dec 6, 2022

zephyrbot mentioned this pull request Dec 6, 2022

[Backport v2.7-branch] sys: time_units: Increase range of z_tmcvt #52832

Closed

cfriedt mentioned this pull request Dec 9, 2022

tests: time_units: check for overflow in z_tmcvt intermediate #52936

Merged

cfriedt mentioned this pull request Dec 12, 2022

posix: clock: current method of capturing elapsed time leads to loss in seconds #52975

Closed
