time encoding fails for subdaily frequencies and days since #8271

@larsbuntemeyer

What happened?

The following example has not worked since v2023.09.0:

import numpy as np
import pandas as pd
import xarray as xr

time = pd.date_range("1970-01-01", "1970-01-31", freq="6h")
ds = xr.Dataset(coords=dict(time=time))

units = "days since 1960-01-01 00:00:00"
calendar = "gregorian"
encoding = dict(time=dict(units=units, calendar=calendar, dtype=np.dtype("float64")))

ds.to_netcdf("test.nc", encoding=encoding)

! ncdump -v time test.nc

This gives the following output:

netcdf test {
dimensions:
	time = 121 ;
variables:
	double time(time) ;
		time:_FillValue = NaN ;
		time:units = "days since 1960-01-01" ;
		time:calendar = "gregorian" ;
data:

 time = 3653, 3653, 3653, 3653, 3654, 3654, 3654, 3654, 3655, 3655, 3655, 
    3655, 3656, 3656, 3656, 3656, 3657, 3657, 3657, 3657, 3658, 3658, 3658, 
    3658, 3659, 3659, 3659, 3659, 3660, 3660, 3660, 3660, 3661, 3661, 3661, 
    3661, 3662, 3662, 3662, 3662, 3663, 3663, 3663, 3663, 3664, 3664, 3664, 
    3664, 3665, 3665, 3665, 3665, 3666, 3666, 3666, 3666, 3667, 3667, 3667, 
    3667, 3668, 3668, 3668, 3668, 3669, 3669, 3669, 3669, 3670, 3670, 3670, 
    3670, 3671, 3671, 3671, 3671, 3672, 3672, 3672, 3672, 3673, 3673, 3673, 
    3673, 3674, 3674, 3674, 3674, 3675, 3675, 3675, 3675, 3676, 3676, 3676, 
    3676, 3677, 3677, 3677, 3677, 3678, 3678, 3678, 3678, 3679, 3679, 3679, 
    3679, 3680, 3680, 3680, 3680, 3681, 3681, 3681, 3681, 3682, 3682, 3682, 
    3682, 3683 ;
}

It seems that the sub-daily fraction is truncated. Note that this does not happen if I set the units to the start of the time range, e.g. units = "days since 1970-01-01 00:00:00". This correctly results in

netcdf test {
dimensions:
	time = 121 ;
variables:
	double time(time) ;
		time:_FillValue = NaN ;
		time:units = "days since 1970-01-01" ;
		time:calendar = "gregorian" ;
data:

 time = 0, 0.25, 0.5, 0.75, 1, 1.25, 1.5, 1.75, 2, 2.25, 2.5, 2.75, 3, 3.25, 
    3.5, 3.75, 4, 4.25, 4.5, 4.75, 5, 5.25, 5.5, 5.75, 6, 6.25, 6.5, 6.75, 7, 
    7.25, 7.5, 7.75, 8, 8.25, 8.5, 8.75, 9, 9.25, 9.5, 9.75, 10, 10.25, 10.5, 
    10.75, 11, 11.25, 11.5, 11.75, 12, 12.25, 12.5, 12.75, 13, 13.25, 13.5, 
    13.75, 14, 14.25, 14.5, 14.75, 15, 15.25, 15.5, 15.75, 16, 16.25, 16.5, 
    16.75, 17, 17.25, 17.5, 17.75, 18, 18.25, 18.5, 18.75, 19, 19.25, 19.5, 
    19.75, 20, 20.25, 20.5, 20.75, 21, 21.25, 21.5, 21.75, 22, 22.25, 22.5, 
    22.75, 23, 23.25, 23.5, 23.75, 24, 24.25, 24.5, 24.75, 25, 25.25, 25.5, 
    25.75, 26, 26.25, 26.5, 26.75, 27, 27.25, 27.5, 27.75, 28, 28.25, 28.5, 
    28.75, 29, 29.25, 29.5, 29.75, 30 ;
}
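
For reference, the two outputs for the 1960 reference date can be reproduced with plain standard-library arithmetic. This is only a sketch to illustrate the symptom, not a claim about what xarray does internally: true division of the time deltas by one day gives the fractional values I expect, while floor division gives exactly the truncated values shown in the first dump. Note that floor division alone would not explain why the 1970 reference date still works, so the actual cause is presumably more subtle. The variable names below are my own.

```python
from datetime import datetime, timedelta

# Reference date taken from the units string "days since 1960-01-01 00:00:00"
ref = datetime(1960, 1, 1)

# Same time axis as in the example: 1970-01-01 to 1970-01-31, every 6 hours
start = datetime(1970, 1, 1)
step = timedelta(hours=6)
times = [start + i * step for i in range(121)]

one_day = timedelta(days=1)

# True division keeps the sub-daily fraction (float days)
exact = [(t - ref) / one_day for t in times]

# Floor division drops it (integer days), matching the broken output
floored = [(t - ref) // one_day for t in times]

print(exact[:5])    # [3653.0, 3653.25, 3653.5, 3653.75, 3654.0]
print(floored[:5])  # [3653, 3653, 3653, 3653, 3654]
```

The offset of 3653 is the number of days between the two dates: ten years of 365 days plus the three leap days of 1960, 1964 and 1968.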

What did you expect to happen?

I expect sub-daily frequencies to be encoded correctly even if the reference date in the units differs from the start date of the time axis. For example, v2023.08.0 correctly preserves the fractions:

netcdf test {
dimensions:
	time = 121 ;
variables:
	double time(time) ;
		time:_FillValue = NaN ;
		time:units = "days since 1960-01-01" ;
		time:calendar = "gregorian" ;
data:

 time = 3653, 3653.25, 3653.5, 3653.75, 3654, 3654.25, 3654.5, 3654.75, 3655, 
    3655.25, 3655.5, 3655.75, 3656, 3656.25, 3656.5, 3656.75, 3657, 3657.25, 
    3657.5, 3657.75, 3658, 3658.25, 3658.5, 3658.75, 3659, 3659.25, 3659.5, 
    3659.75, 3660, 3660.25, 3660.5, 3660.75, 3661, 3661.25, 3661.5, 3661.75, 
    3662, 3662.25, 3662.5, 3662.75, 3663, 3663.25, 3663.5, 3663.75, 3664, 
    3664.25, 3664.5, 3664.75, 3665, 3665.25, 3665.5, 3665.75, 3666, 3666.25, 
    3666.5, 3666.75, 3667, 3667.25, 3667.5, 3667.75, 3668, 3668.25, 3668.5, 
    3668.75, 3669, 3669.25, 3669.5, 3669.75, 3670, 3670.25, 3670.5, 3670.75, 
    3671, 3671.25, 3671.5, 3671.75, 3672, 3672.25, 3672.5, 3672.75, 3673, 
    3673.25, 3673.5, 3673.75, 3674, 3674.25, 3674.5, 3674.75, 3675, 3675.25, 
    3675.5, 3675.75, 3676, 3676.25, 3676.5, 3676.75, 3677, 3677.25, 3677.5, 
    3677.75, 3678, 3678.25, 3678.5, 3678.75, 3679, 3679.25, 3679.5, 3679.75, 
    3680, 3680.25, 3680.5, 3680.75, 3681, 3681.25, 3681.5, 3681.75, 3682, 
    3682.25, 3682.5, 3682.75, 3683 ;
}

Minimal Complete Verifiable Example

import numpy as np
import pandas as pd
import xarray as xr

time = pd.date_range("1970-01-01", "1970-01-31", freq="6h")
ds = xr.Dataset(coords=dict(time=time))

units = "days since 1960-01-01 00:00:00"
calendar = "gregorian"
encoding = dict(time=dict(units=units, calendar=calendar, dtype=np.dtype("float64")))

ds.to_netcdf("test.nc", encoding=encoding)

! ncdump -v time test.nc
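
Instead of reading the ncdump output by eye, the encoded values can also be checked programmatically. The helper below is a hypothetical sketch (the name lost_subdaily_fractions is mine, not part of xarray): it flags a sequence whose consecutive values do not advance by the expected step of 0.25 days. I assume the raw values would be read back with something like xr.open_dataset("test.nc", decode_times=False), but the check itself only needs a list of numbers.

```python
def lost_subdaily_fractions(values, step_days=0.25, tol=1e-9):
    """Return True if consecutive encoded values do not all advance by step_days."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    return any(abs(d - step_days) > tol for d in diffs)


# First five values from the two ncdump outputs in this report.
# With xarray, the full array could be obtained (assumption) via
# xr.open_dataset("test.nc", decode_times=False)["time"].values
broken = [3653, 3653, 3653, 3653, 3654]
expected = [3653, 3653.25, 3653.5, 3653.75, 3654]

print(lost_subdaily_fractions(broken))    # True
print(lost_subdaily_fractions(expected))  # False
```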

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

No response

Anything else we need to know?

This is still an issue in the current main.

Environment

INSTALLED VERSIONS

commit: None
python: 3.10.12 | packaged by conda-forge | (main, Jun 23 2023, 22:39:40) [Clang 15.0.7 ]
python-bits: 64
OS: Darwin
OS-release: 22.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: (None, 'UTF-8')
libhdf5: 1.14.2
libnetcdf: 4.9.2

xarray: 2023.9.1.dev12+gd5f17858
pandas: 2.1.1
numpy: 1.24.4
scipy: 1.11.3
netCDF4: 1.6.4
pydap: installed
h5netcdf: 1.2.0
h5py: 3.9.0
Nio: None
zarr: 2.16.1
cftime: 1.6.2
nc_time_axis: 1.4.1
PseudoNetCDF: 3.2.2
iris: 3.7.0
bottleneck: 1.3.7
dask: 2023.9.3
distributed: 2023.9.3
matplotlib: 3.8.0
cartopy: 0.22.0
seaborn: 0.13.0
numbagg: 0.2.2
fsspec: 2023.9.2
cupy: None
pint: 0.20.1
sparse: 0.14.0
flox: 0.7.2
numpy_groupies: 0.10.2
setuptools: 68.2.2
pip: 23.2.1
conda: None
pytest: 7.4.2
mypy: None
IPython: 8.16.1
sphinx: None
