Conversation

Member

@cfriedt cfriedt commented Dec 12, 2022

The previous method used to calculate seconds in clock_gettime() seemed to have an inaccuracy that grew with time, causing the seconds to be off by an order of magnitude when ticks would roll over.

Edit: technically, this is likely a bug in z_tmcvt().

Fixes #52975

@zephyrbot zephyrbot added area: Kernel area: POSIX POSIX API Library labels Dec 12, 2022
@cfriedt cfriedt force-pushed the add-test-for-clock-gettime-overflows branch 2 times, most recently from add4518 to 1e4c037 Compare December 12, 2022 14:38
@cfriedt cfriedt requested a review from nordic-krch December 12, 2022 14:40
Member Author

cfriedt commented Dec 12, 2022

cc @AndreyDodonov-EH, @too1

@cfriedt cfriedt force-pushed the add-test-for-clock-gettime-overflows branch 3 times, most recently from a449e12 to f49e7bc Compare December 12, 2022 17:40
Member Author

cfriedt commented Dec 12, 2022

  • Whoops: I turned off automatic code formatting to reduce the scope of the diff, but then left a dangling `{`. It's fixed now.

@cfriedt cfriedt force-pushed the add-test-for-clock-gettime-overflows branch from f49e7bc to ae2ddbc Compare December 15, 2022 19:00
@zephyrbot zephyrbot added the area: Testsuite Testsuite label Dec 15, 2022
@cfriedt cfriedt force-pushed the add-test-for-clock-gettime-overflows branch 2 times, most recently from 1976dc3 to 9fb9849 Compare December 15, 2022 19:15
@cfriedt cfriedt force-pushed the add-test-for-clock-gettime-overflows branch from 9fb9849 to 5a43db6 Compare December 16, 2022 05:03
Contributor


This doesn't look right. The conversion function should be precise and reduce to the division already on all systems where that's possible. The bug is in the second line I think, where it feeds the result back into the reverse conversion; that's going to lose bits in configurations where the tick rate isn't divisible by ms. Can you check that just the second line fixes the issue?

Also... I haven't pulled up the whole file to check, but I have a hard time believing that that nremainder value is anything but a footgun in an implementation of clock_gettime(). There shouldn't be any state held, and if we want higher precision values we should start there instead of back-converting from coarser units like seconds.

My notion of an idealized implementation here should be as simple as uint64_t ns = k_ticks_to_us_floor64(k_uptime_ticks()), and then just doing division/modulus ops to extract tv_nsec and tv_sec.

Member Author

@cfriedt cfriedt Dec 17, 2022


@andyross - with just the second line, we get the failure below, which is off by an order of magnitude (on qemu_x86_64). Perhaps that's an indicator that z_tmcvt() needs a testsuite (Edit: I added a minimal one under tests/unit/time_units).

START - test_clock_gettime_rollover
CONFIG_SYS_CLOCK_TICKS_PER_SEC: 100
rollover_s: 184467440737095516
t-1: {18446744073709551, 596000000}

To me, straight division to get elapsed_secs makes sense.

The ticks value must always be >= elapsed_secs * CONFIG_SYS_CLOCK_TICKS_PER_SEC, so it looks like the nremainder calculation is a (potentially) cheaper way to get ticks % CONFIG_SYS_CLOCK_TICKS_PER_SEC.

Directly below, we use k_ticks_to_ns_floor32() to convert nremainder to nanoseconds, which seems ok to me.

Can you explain your issue in a bit more detail? And actually, this is the sort of thing that we should probably have a test for, so I'll implement a test for the scenario you outline if you can provide some additional detail.

Contributor


IMHO, the routines in time_units.h should include a set of k_ticks_to_sec_*() functions. There are milli, micro, and nano variants, but not full seconds. I guess the thought was that conversion to seconds is simple enough that one should write out the division, as is done here. But as soon as you toss in rates determined at run time, or non-integral clock speeds, it would be nice to not make everyone write their own code for it.

Contributor


My notion of an idealized implementation here should be as simple as uint64_t ns = k_ticks_to_us_floor64(k_uptime_ticks()),

You meant _to_ns of course. This expression does overflow before a 64-bit cycle counter overflows, for clock rates < 1 GHz. It does take some 584 years to do this. So realistically, it ought to be good enough for any reasonable value of a timeout or time delta or timestamp.

However, @cfriedt's test looks at values around where the cycle counter overflows, which is large enough to trigger the overflow.

@cfriedt cfriedt force-pushed the add-test-for-clock-gettime-overflows branch 3 times, most recently from 32c85b6 to fc4e2e0 Compare December 17, 2022 22:17
Member Author

cfriedt commented Dec 17, 2022

@cfriedt cfriedt force-pushed the add-test-for-clock-gettime-overflows branch from fc4e2e0 to 4cad493 Compare December 17, 2022 22:42
@cfriedt cfriedt requested a review from andyross December 18, 2022 17:06
The previous method used to calculate seconds in `clock_gettime()`
seemed to have an inaccuracy that grew with time, causing the
seconds to be off by an order of magnitude when ticks would roll
over.

This change fixes the method used to calculate seconds.

Signed-off-by: Chris Friedt <[email protected]>
@cfriedt cfriedt force-pushed the add-test-for-clock-gettime-overflows branch from 4cad493 to 53d9b05 Compare December 19, 2022 12:38
Member Author

cfriedt commented Dec 19, 2022

Member Author

cfriedt commented Dec 20, 2022

@andyross - if there is a doubt in your mind, this should probably help. Almost looks like a Christmas tree, so Merry Christmas (if you celebrate)! 🎅 ❄️ ☃️ 🎄
[attachment: Screen Shot 2022-12-20 at 9.54.01 AM]
[attachment: generator.cpp.txt]

That's with CONFIG_SYS_CLOCK_TICKS_PER_SEC=100. The error factor vanishes with CONFIG_SYS_CLOCK_TICKS_PER_SEC=1000 or CONFIG_SYS_CLOCK_TICKS_PER_SEC=10000. However, it is exacerbated when CONFIG_SYS_CLOCK_TICKS_PER_SEC=10.

So with ticks/s = 100, the error factor is 10, but when ticks/s = 10, the error factor is 100. There is no error when using the method in this PR.

I'm kind of concerned about where the order of magnitude difference comes from in the old method.

cc @AndreyDodonov-EH, @too1

@cfriedt cfriedt added bug The issue is a bug, or the PR is fixing a bug priority: medium Medium impact/importance bug labels Dec 20, 2022
Chris Friedt added 2 commits December 26, 2022 07:20
Accurate timekeeping is something that is often taken for granted.

However, reliability of timekeeping code is critical for most core
and subsystem code. Furthermore, many higher-level timekeeping
utilities in Zephyr work off of ticks, but there is no way to modify
ticks directly, which forces either unnecessary delays in test code
or non-ideal compromises in test coverage.

Since timekeeping is so critical, there should be as few barriers
to testing timekeeping code as possible, while preserving
integrity of the kernel's public interface.

With this, we expose `sys_clock_tick_set()` as a system call only
when `CONFIG_ZTEST` is set, declared within the ztest framework.

Signed-off-by: Chris Friedt <[email protected]>
Add a test to verify previous fixes when calculating time in
seconds for `clock_gettime()`.

Signed-off-by: Chris Friedt <[email protected]>
@cfriedt cfriedt force-pushed the add-test-for-clock-gettime-overflows branch from 53d9b05 to 662f37d Compare December 26, 2022 12:21
Member Author

cfriedt commented Dec 26, 2022

  • use uint64_t instead of int64_t in sys_clock_tick_set() - no reason it should be signed

Contributor

xyzzy42 commented Dec 26, 2022

That's with CONFIG_SYS_CLOCK_TICKS_PER_SEC=100. The error factor vanishes with CONFIG_SYS_CLOCK_TICKS_PER_SEC=1000 or CONFIG_SYS_CLOCK_TICKS_PER_SEC=10000. However, it is exacerbated when CONFIG_SYS_CLOCK_TICKS_PER_SEC=10.

I don't get how those numbers came from CONFIG_SYS_CLOCK_TICKS_PER_SEC=100?

    const uint64_t rollover_s = UINT64_MAX / CONFIG_SYS_CLOCK_TICKS_PER_SEC;
    …
    curr_tick = rollover_s * CONFIG_SYS_CLOCK_TICKS_PER_SEC;
    cout << curr_tick << "\t" << method1() << "\t" << method2() << endl;

Shouldn't the final value of curr_tick be 18446744073709551600 and not 18446744073709500000?

Anyway, the existing code has an obvious bug if CONFIG_SYS_CLOCK_TICKS_PER_SEC is less than 1000. Converting to ms as an (unnecessary) intermediate step to seconds overflows, even though neither the tick count nor the time in seconds will overflow. This is because ms will be a larger number than ticks.

Another bug is in the test code. It finds the second in which the 64-bit tick counter overflows. But the result will be stored in a timespec. This is allowed to have a 32-bit signed seconds field. This will overflow before the tick counter unless CONFIG_SYS_CLOCK_TICKS_PER_SEC is > 2,000,000,000. In other words, rollover_s can overflow ts->tv_sec.

Member Author

cfriedt commented Dec 27, 2022

I don't get how those numbers came from CONFIG_SYS_CLOCK_TICKS_PER_SEC=100?

    const uint64_t rollover_s = UINT64_MAX / CONFIG_SYS_CLOCK_TICKS_PER_SEC;
    …
    curr_tick = rollover_s * CONFIG_SYS_CLOCK_TICKS_PER_SEC;
    cout << curr_tick << "\t" << method1() << "\t" << method2() << endl;

Shouldn't the final value of curr_tick be 18446744073709551600 and not 18446744073709500000?

That's only to get an arbitrary curr_tick less than rollover_s (with up to 1s of ticks to spare).

However, I realized that my for-loop boundary conditions were not in correct units. Still illustrates the issue in any case.

for(curr_tick = 0; curr_tick < rollover_s

Anyway, the existing code has an obvious bug if CONFIG_SYS_CLOCK_TICKS_PER_SEC is less than 1000. Converting to ms as an (unnecessary) intermediate step to seconds overflows, even though neither the tick count nor the time in seconds will overflow. This is because ms will be a larger number than ticks.

👍

Another bug is in the test code. It finds the second in which the 64-bit tick counter overflows. But the result will be stored in a timespec. This is allowed to have a 32-bit signed seconds field. This will overflow

Given that Zephyr will own the POSIX headers, that's a good point. We should enforce that it's 64-bit (a recommendation for avoiding the year-2038 problem). It's not something that will be fixed here.

https://pubs.opengroup.org/onlinepubs/9699919799/functions/time.html

Thanks for the review.

@carlescufi carlescufi merged commit 8e4137f into zephyrproject-rtos:main Jan 4, 2023
Contributor

xyzzy42 commented Jan 4, 2023

The overflow test doesn't respect CONFIG_SYS_CLOCK_MAX_TIMEOUT_DAYS. It assumes the 64-bit tick counter is the only possible overflow, but I don't think that is guaranteed.

@cfriedt cfriedt deleted the add-test-for-clock-gettime-overflows branch January 24, 2023 14:46
Successfully merging this pull request may close these issues:

posix: clock: current method of capturing elapsed time leads to loss in seconds