to_netcdf with decoded time can create file with inconsistent time:units and time_bounds:units #2921

@klindsay28

Code Sample, a copy-pastable example if possible

import numpy as np
import xarray as xr

# create time and time_bounds DataArrays for Jan-1850 and Feb-1850
time_bounds_vals = np.array([[0.0, 31.0], [31.0, 59.0]])
time_vals = time_bounds_vals.mean(axis=1)

time_var = xr.DataArray(time_vals, dims='time',
                        coords={'time':time_vals})
time_bounds_var = xr.DataArray(time_bounds_vals, dims=('time', 'd2'),
                               coords={'time':time_vals})

# create Dataset of time and time_bounds
ds = xr.Dataset(coords={'time':time_var}, data_vars={'time_bounds':time_bounds_var})
ds.time.attrs = {'bounds':'time_bounds', 'calendar':'noleap',
                 'units':'days since 1850-01-01'} 

# write Jan-1850 values to file 
ds.isel(time=slice(0,1)).to_netcdf('Jan-1850.nc', unlimited_dims='time')

# write Feb-1850 values to file
ds.isel(time=slice(1,2)).to_netcdf('Feb-1850.nc', unlimited_dims='time')

# use open_mfdataset to read in files, combining into 1 Dataset
decode_times = True
decode_cf = True
ds = xr.open_mfdataset(['Jan-1850.nc', 'Feb-1850.nc'],
                       decode_cf=decode_cf, decode_times=decode_times)

# write combined Dataset out
ds.to_netcdf('JanFeb-1850.nc', unlimited_dims='time')

Problem description

The code above first creates two netCDF files, for Jan-1850 and Feb-1850, each containing the variables time and time_bounds, with time:bounds='time_bounds'. It then reads the two files back in as a single Dataset using open_mfdataset and writes that Dataset out to a new netCDF file. Running ncdump on this final file gives:

netcdf JanFeb-1850 {
dimensions:
	time = UNLIMITED ; // (2 currently)
	d2 = 2 ;
variables:
	int64 time(time) ;
		time:bounds = "time_bounds" ;
		time:units = "hours since 1850-01-16 12:00:00.000000" ;
		time:calendar = "noleap" ;
	double time_bounds(time, d2) ;
		time_bounds:_FillValue = NaN ;
		time_bounds:units = "days since 1850-01-01" ;
		time_bounds:calendar = "noleap" ;
data:

 time = 0, 708 ;

 time_bounds =
  0, 31,
  31, 59 ;
}

The problem is that the units attributes of time and time_bounds differ in this file, contrary to what the CF conventions require.
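As a rough illustration of the mismatch, the sketch below compares the attributes of a coordinate with those of its bounds variable. It works on plain dictionaries copied from the ncdump output above, so it needs neither xarray nor the file; the helper name find_bounds_mismatches is hypothetical and not part of any library. With xarray, a similar mapping could presumably be built as {name: var.attrs for name, var in ds.variables.items()} from a dataset opened with decode_times=False.

```python
def find_bounds_mismatches(attrs_by_var, keys=("units", "calendar")):
    """Return (coord, bounds, key, coord_value, bounds_value) for each mismatch.

    Hypothetical helper for illustration only.
    """
    mismatches = []
    for name, attrs in attrs_by_var.items():
        bounds_name = attrs.get("bounds")
        if bounds_name is None or bounds_name not in attrs_by_var:
            continue
        bounds_attrs = attrs_by_var[bounds_name]
        for key in keys:
            # A bounds variable may omit these attributes; only compare
            # the ones it actually carries.
            if key in bounds_attrs and bounds_attrs[key] != attrs.get(key):
                mismatches.append(
                    (name, bounds_name, key, attrs.get(key), bounds_attrs[key])
                )
    return mismatches


# Attributes copied from the ncdump output of JanFeb-1850.nc
attrs_by_var = {
    "time": {
        "bounds": "time_bounds",
        "units": "hours since 1850-01-16 12:00:00.000000",
        "calendar": "noleap",
    },
    "time_bounds": {
        "units": "days since 1850-01-01",
        "calendar": "noleap",
    },
}

mismatches = find_bounds_mismatches(attrs_by_var)
for coord, bounds, key, coord_value, bounds_value in mismatches:
    print(f"{coord}:{key} = {coord_value!r} but {bounds}:{key} = {bounds_value!r}")
```

Only units is reported here; calendar is "noleap" on both variables.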

The final call to to_netcdf is creating a file where time's units (and type) differ from what they are in the intermediate files. These transformations are not being applied to time_bounds.
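The arithmetic below is a small sanity check, not xarray's actual encoding logic. It shows that the rewritten time values (0 and 708 hours) are the same instants as the original 15.5 and 45.0 days, just measured from a new reference time equal to the first time value. The variable names are illustrative.

```python
HOURS_PER_DAY = 24

# time values in the intermediate files, in "days since 1850-01-01"
# (midpoints of the bounds [0, 31] and [31, 59])
time_days = [15.5, 45.0]

# The rewritten units are "hours since 1850-01-16 12:00:00", i.e. the new
# reference time is 15.5 days after 1850-01-01 00:00.
new_ref_offset_days = 15.5

time_hours = [(t - new_ref_offset_days) * HOURS_PER_DAY for t in time_days]
print(time_hours)  # [0.0, 708.0]
```

The time_bounds values, by contrast, are still 0, 31, 31, 59 in days since 1850-01-01, so the two variables no longer share a reference time or a unit.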

While the change to time's type is not necessarily an issue, I do find it surprising.

This inconsistency goes away if either decode_times or decode_cf is set to False in the Python code above. In that case, the transformations to time's units and type do not happen.

The inconsistency also goes away if open_mfdataset opens a single file. In this scenario too, the transformations to time's units and type do not happen.

I think that the desired behavior is to either not apply the units and type transformations to time, or to also apply them to time_bounds. The first option would be consistent with the current single-file behavior.

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.8 |Anaconda, Inc.| (default, Dec 30 2018, 01:22:34) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 3.12.62-60.64.8-default
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.2

xarray: 0.12.1
pandas: 0.24.2
numpy: 1.16.2
scipy: 1.2.1
netCDF4: 1.4.2
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudonetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 1.1.5
distributed: 1.26.1
matplotlib: 3.0.3
cartopy: None
seaborn: None
setuptools: 40.8.0
pip: 19.0.3
conda: None
pytest: 4.3.1
IPython: 7.4.0
sphinx: None
