The time.perf_counter() function is implemented by calling QueryPerformanceCounter() and QueryPerformanceFrequency(). It computes QueryPerformanceCounter() * SEC_TO_NS / QueryPerformanceFrequency() using int64_t integers. The problem is that SEC_TO_NS is big (10^9), which puts the intermediate product at risk of overflowing int64_t.
QueryPerformanceFrequency() usually reports a frequency of 10 MHz (10_000_000) on Windows 10 and newer. The fraction SEC_TO_NS / frequency = 1_000_000_000 / 10_000_000 can be simplified to 100 / 1.
I propose using a fraction internally to convert the QueryPerformanceCounter() value to a number of seconds, and simplifying the fraction using the Greatest Common Divisor (GCD).
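To illustrate the idea, here is a minimal Python sketch of the reduction. It is not the CPython implementation: the helper names simplify_fraction() and ticks_to_ns() are hypothetical, the 10 MHz frequency is an assumed typical value, and the "naive" bound models a plain counter * SEC_TO_NS multiplication in int64_t.

```python
import math

SEC_TO_NS = 10**9          # nanoseconds per second
INT64_MAX = 2**63 - 1      # largest value an int64_t can hold


def simplify_fraction(numer, denom):
    """Reduce numer/denom by their greatest common divisor (hypothetical helper)."""
    divisor = math.gcd(numer, denom)
    return numer // divisor, denom // divisor


def ticks_to_ns(ticks, numer, denom):
    """Convert a tick count to nanoseconds as ticks * numer / denom (hypothetical helper)."""
    return ticks * numer // denom


# Assumed frequency: 10 MHz, the usual value on Windows 10 and newer.
frequency = 10_000_000
numer, denom = simplify_fraction(SEC_TO_NS, frequency)  # (100, 1)

# Largest tick count whose intermediate product still fits in int64_t,
# with the unreduced multiplier and with the reduced one.
max_ticks_naive = INT64_MAX // SEC_TO_NS
max_ticks_reduced = INT64_MAX // numer
```

With the reduced fraction the multiplier drops from 10^9 to 100, so the range of counter values that can be multiplied without exceeding int64_t grows by a factor of 10^7 in this example. The reduction only needs to be computed once, since the frequency does not change while the system is running.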
There are multiple functions using a fraction:
_PyTime_GetClockWithInfo() for clock() -- time.process_time() in Python
_PyTime_GetProcessTimeWithInfo() for times() -- time.process_time() in Python
py_get_monotonic_clock() for mach_absolute_time() on macOS -- time.monotonic() in Python
py_get_win_perf_counter() for QueryPerformanceCounter() on Windows -- time.perf_counter() in Python
Linked PRs