Description
What happened?
This is my example, which has not worked since v2023.09.0:
import numpy as np
import pandas as pd
import xarray as xr
time = pd.date_range("1970-01-01", "1970-01-31", freq="6h")
ds = xr.Dataset(coords=dict(time=time))
units = "days since 1960-01-01 00:00:00"
calendar = "gregorian"
encoding = dict(time=dict(units=units, calendar=calendar, dtype=np.dtype("float64")))
ds.to_netcdf("test.nc", encoding=encoding)
! ncdump -v time test.nc
This gives the following output:
netcdf test {
dimensions:
time = 121 ;
variables:
double time(time) ;
time:_FillValue = NaN ;
time:units = "days since 1960-01-01" ;
time:calendar = "gregorian" ;
data:
time = 3653, 3653, 3653, 3653, 3654, 3654, 3654, 3654, 3655, 3655, 3655,
3655, 3656, 3656, 3656, 3656, 3657, 3657, 3657, 3657, 3658, 3658, 3658,
3658, 3659, 3659, 3659, 3659, 3660, 3660, 3660, 3660, 3661, 3661, 3661,
3661, 3662, 3662, 3662, 3662, 3663, 3663, 3663, 3663, 3664, 3664, 3664,
3664, 3665, 3665, 3665, 3665, 3666, 3666, 3666, 3666, 3667, 3667, 3667,
3667, 3668, 3668, 3668, 3668, 3669, 3669, 3669, 3669, 3670, 3670, 3670,
3670, 3671, 3671, 3671, 3671, 3672, 3672, 3672, 3672, 3673, 3673, 3673,
3673, 3674, 3674, 3674, 3674, 3675, 3675, 3675, 3675, 3676, 3676, 3676,
3676, 3677, 3677, 3677, 3677, 3678, 3678, 3678, 3678, 3679, 3679, 3679,
3679, 3680, 3680, 3680, 3680, 3681, 3681, 3681, 3681, 3682, 3682, 3682,
3682, 3683 ;
}
It seems that the subdaily fraction is truncated. Note that this does not happen if I set the units to the start of the time range, e.g., units = "days since 1970-01-01 00:00:00". This correctly results in:
netcdf test {
dimensions:
time = 121 ;
variables:
double time(time) ;
time:_FillValue = NaN ;
time:units = "days since 1970-01-01" ;
time:calendar = "gregorian" ;
data:
time = 0, 0.25, 0.5, 0.75, 1, 1.25, 1.5, 1.75, 2, 2.25, 2.5, 2.75, 3, 3.25,
3.5, 3.75, 4, 4.25, 4.5, 4.75, 5, 5.25, 5.5, 5.75, 6, 6.25, 6.5, 6.75, 7,
7.25, 7.5, 7.75, 8, 8.25, 8.5, 8.75, 9, 9.25, 9.5, 9.75, 10, 10.25, 10.5,
10.75, 11, 11.25, 11.5, 11.75, 12, 12.25, 12.5, 12.75, 13, 13.25, 13.5,
13.75, 14, 14.25, 14.5, 14.75, 15, 15.25, 15.5, 15.75, 16, 16.25, 16.5,
16.75, 17, 17.25, 17.5, 17.75, 18, 18.25, 18.5, 18.75, 19, 19.25, 19.5,
19.75, 20, 20.25, 20.5, 20.75, 21, 21.25, 21.5, 21.75, 22, 22.25, 22.5,
22.75, 23, 23.25, 23.5, 23.75, 24, 24.25, 24.5, 24.75, 25, 25.25, 25.5,
25.75, 26, 26.25, 26.5, 26.75, 27, 27.25, 27.5, 27.75, 28, 28.25, 28.5,
28.75, 29, 29.25, 29.5, 29.75, 30 ;
}
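For reference, the values one would expect for the 1960 reference date can be computed independently of xarray's encoding path. This is a minimal sketch using only pandas and NumPy; the names `reference` and `expected_days` are illustrative and not part of any xarray API.

```python
import numpy as np
import pandas as pd

# Same time axis as in the example above.
time = pd.date_range("1970-01-01", "1970-01-31", freq="6h")

# Reference date taken from units = "days since 1960-01-01 00:00:00".
reference = pd.Timestamp("1960-01-01")

# Fractional days since the reference date, as float64.
expected_days = np.asarray((time - reference) / pd.Timedelta(days=1), dtype="float64")

print(expected_days[:5])
```

The first values should be 3653, 3653.25, 3653.5, 3653.75, i.e. the 6-hourly fractions are exactly representable as float64, so the encoding does not need to lose them.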
What did you expect to happen?
I expect subdaily frequencies to be encoded correctly even if the reference date in the units differs from the start date of the time axis. For example, v2023.08.0 correctly preserves the fractions:
netcdf test {
dimensions:
time = 121 ;
variables:
double time(time) ;
time:_FillValue = NaN ;
time:units = "days since 1960-01-01" ;
time:calendar = "gregorian" ;
data:
time = 3653, 3653.25, 3653.5, 3653.75, 3654, 3654.25, 3654.5, 3654.75, 3655,
3655.25, 3655.5, 3655.75, 3656, 3656.25, 3656.5, 3656.75, 3657, 3657.25,
3657.5, 3657.75, 3658, 3658.25, 3658.5, 3658.75, 3659, 3659.25, 3659.5,
3659.75, 3660, 3660.25, 3660.5, 3660.75, 3661, 3661.25, 3661.5, 3661.75,
3662, 3662.25, 3662.5, 3662.75, 3663, 3663.25, 3663.5, 3663.75, 3664,
3664.25, 3664.5, 3664.75, 3665, 3665.25, 3665.5, 3665.75, 3666, 3666.25,
3666.5, 3666.75, 3667, 3667.25, 3667.5, 3667.75, 3668, 3668.25, 3668.5,
3668.75, 3669, 3669.25, 3669.5, 3669.75, 3670, 3670.25, 3670.5, 3670.75,
3671, 3671.25, 3671.5, 3671.75, 3672, 3672.25, 3672.5, 3672.75, 3673,
3673.25, 3673.5, 3673.75, 3674, 3674.25, 3674.5, 3674.75, 3675, 3675.25,
3675.5, 3675.75, 3676, 3676.25, 3676.5, 3676.75, 3677, 3677.25, 3677.5,
3677.75, 3678, 3678.25, 3678.5, 3678.75, 3679, 3679.25, 3679.5, 3679.75,
3680, 3680.25, 3680.5, 3680.75, 3681, 3681.25, 3681.5, 3681.75, 3682,
3682.25, 3682.5, 3682.75, 3683 ;
}
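The expectation above can also be checked without ncdump by writing the file and reading it back. The following is a rough sketch, assuming a netCDF backend (e.g. netCDF4) is installed; the helper name `max_roundtrip_error` and the temporary file name are hypothetical and only used for illustration.

```python
import os
import tempfile

import numpy as np
import pandas as pd
import xarray as xr


def max_roundtrip_error():
    """Write the time axis with the 1960 reference date, read it back,
    and return the largest absolute difference to the original times."""
    time = pd.date_range("1970-01-01", "1970-01-31", freq="6h")
    ds = xr.Dataset(coords=dict(time=time))
    encoding = dict(
        time=dict(
            units="days since 1960-01-01 00:00:00",
            calendar="gregorian",
            dtype=np.dtype("float64"),
        )
    )
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "roundtrip_test.nc")  # hypothetical file name
        ds.to_netcdf(path, encoding=encoding)
        with xr.open_dataset(path) as decoded:
            # .values loads the decoded times before the file is closed.
            diff = decoded["time"].values - time.values
    return pd.Timedelta(np.abs(diff).max())


err = max_roundtrip_error()
print("largest roundtrip error:", err)
```

With the behaviour I expect, the reported error should be zero. Judging from the truncated output shown earlier, an affected version would presumably report an error of up to 18 hours.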
Minimal Complete Verifiable Example
import numpy as np
import pandas as pd
import xarray as xr
time = pd.date_range("1970-01-01", "1970-01-31", freq="6h")
ds = xr.Dataset(coords=dict(time=time))
units = "days since 1960-01-01 00:00:00"
calendar = "gregorian"
encoding = dict(time=dict(units=units, calendar=calendar, dtype=np.dtype("float64")))
ds.to_netcdf("test.nc", encoding=encoding)
! ncdump -v time test.nc
MVCE confirmation
- Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- Complete example — the example is self-contained, including all data and the text of any traceback.
- Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
- New issue — a search of GitHub Issues suggests this is not a duplicate.
Relevant log output
No response
Anything else we need to know?
This is still an issue in the current main.
Environment
INSTALLED VERSIONS
commit: None
python: 3.10.12 | packaged by conda-forge | (main, Jun 23 2023, 22:39:40) [Clang 15.0.7 ]
python-bits: 64
OS: Darwin
OS-release: 22.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: (None, 'UTF-8')
libhdf5: 1.14.2
libnetcdf: 4.9.2
xarray: 2023.9.1.dev12+gd5f17858
pandas: 2.1.1
numpy: 1.24.4
scipy: 1.11.3
netCDF4: 1.6.4
pydap: installed
h5netcdf: 1.2.0
h5py: 3.9.0
Nio: None
zarr: 2.16.1
cftime: 1.6.2
nc_time_axis: 1.4.1
PseudoNetCDF: 3.2.2
iris: 3.7.0
bottleneck: 1.3.7
dask: 2023.9.3
distributed: 2023.9.3
matplotlib: 3.8.0
cartopy: 0.22.0
seaborn: 0.13.0
numbagg: 0.2.2
fsspec: 2023.9.2
cupy: None
pint: 0.20.1
sparse: 0.14.0
flox: 0.7.2
numpy_groupies: 0.10.2
setuptools: 68.2.2
pip: 23.2.1
conda: None
pytest: 7.4.2
mypy: None
IPython: 8.16.1
sphinx: None