Description
I tried to use `apply_ufunc` for a function that takes inputs of unequal length and requires `vectorize=True`, which resulted in a `ValueError`. I think the problem stems from the way `np.vectorize` is called.
MCVE Code Sample
import xarray as xr
import scipy as sp
import scipy.stats
import numpy as np
# create dataarrays of unequal length
ds = xr.tutorial.open_dataset("air_temperature")
da1 = ds.air
da2 = ds.air.isel(time=slice(None, 50))
# function that takes arguments of unequal length and requires vectorizing
def mannwhitneyu(x, y):
    _, p = sp.stats.mannwhitneyu(x, y)
    return p
# test that the function takes arguments of unequal length
mannwhitneyu(da1.isel(lat=0, lon=0), da2.isel(lat=0, lon=0))
xr.apply_ufunc(
    mannwhitneyu,
    da1,
    da2,
    input_core_dims=[["time"], ["time"]],
    exclude_dims=set(["time"]),
    vectorize=True,
)
Returns
ValueError: inconsistent size for core dimension 'n': 50 vs 2920
Note: the error stems from numpy.
Expected Output
A DataArray.
Problem Description
I can reproduce the problem in pure numpy:
vec_wrong = np.vectorize(mannwhitneyu, signature="(n),(n)->()", otypes=[float])
vec_wrong(da1.values.T, da2.values.T)
The correct result is returned when the signature is changed so that each input has its own core dimension name:
vec_correct = np.vectorize(mannwhitneyu, signature="(m),(n)->()", otypes=[float])
vec_correct(da1.values.T, da2.values.T)
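For anyone who wants to check this without downloading the tutorial dataset, here is a minimal sketch with synthetic arrays. The function `mean_difference` is a hypothetical stand-in for `mannwhitneyu` (it only needs to accept samples of unequal length), and the array shapes are arbitrary assumptions.

```python
import numpy as np


def mean_difference(x, y):
    # Hypothetical stand-in for a statistic that accepts samples of unequal length.
    return x.mean() - y.mean()


# Two arrays that share the loop dimensions (4, 3) but differ in the core dimension.
a = np.ones((4, 3, 20))
b = np.zeros((4, 3, 5))

# Same name "n" for both core dimensions: numpy requires the sizes to match.
vec_shared = np.vectorize(mean_difference, signature="(n),(n)->()", otypes=[float])
try:
    vec_shared(a, b)
    shared_error = None
except ValueError as err:
    shared_error = str(err)

# Distinct names "m" and "n": the core dimensions may differ in size.
vec_distinct = np.vectorize(mean_difference, signature="(m),(n)->()", otypes=[float])
result = vec_distinct(a, b)

print(shared_error)
print(result.shape)
```

The first call should fail with the same kind of "inconsistent size for core dimension" message as above, while the second returns one value per `(4, 3)` loop position.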
So I think the signature needs to be changed when `exclude_dims` are present.
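One way this could look is sketched below. This is only an illustration of the idea, not xarray's actual code: `build_signature` is a hypothetical helper, and the naming scheme (appending the input index to an excluded dimension) is an assumption. It also assumes that excluded dimensions do not appear in the output core dimensions.

```python
import numpy as np


def build_signature(input_core_dims, output_core_dims, exclude_dims=frozenset()):
    """Hypothetical helper: build a gufunc signature string.

    Core dimensions listed in ``exclude_dims`` get a unique name per input
    (e.g. "time_0", "time_1") so that np.vectorize does not require them
    to have the same size.
    """
    inputs = []
    for i, dims in enumerate(input_core_dims):
        names = [f"{d}_{i}" if d in exclude_dims else d for d in dims]
        inputs.append("(" + ",".join(names) + ")")
    # Assumption: excluded dimensions are not part of the output core dims.
    outputs = ["(" + ",".join(dims) + ")" for dims in output_core_dims]
    return ",".join(inputs) + "->" + ",".join(outputs)


def mean_difference(x, y):
    # Hypothetical stand-in for a statistic that accepts samples of unequal length.
    return x.mean() - y.mean()


signature = build_signature([["time"], ["time"]], [[]], exclude_dims={"time"})
vec = np.vectorize(mean_difference, signature=signature, otypes=[float])

a = np.ones((3, 4, 10))
b = np.zeros((3, 4, 5))
result = vec(a, b)

print(signature)
print(result.shape)
```

Without `exclude_dims`, the helper keeps the shared name, so the usual size check for matching core dimensions still applies.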
Versions
Output of `xr.show_versions()`
This is my development environment, so I think xarray should be 'master'.
**PNC:/home/mathause/conda/envs/xarray_devel/lib/python3.7/site-packages/PseudoNetCDF/pncwarn.py:24:UserWarning:
pyproj could not be found, so IO/API coordinates cannot be converted to lat/lon; to fix, install pyproj or basemap (e.g., pip install pyproj)
INSTALLED VERSIONS
commit: None
python: 3.7.3 | packaged by conda-forge | (default, Jul 1 2019, 21:52:21)
[GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 4.15.0-91-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.5
libnetcdf: 4.7.1
xarray: 0.11.1+335.gb0c336f6
pandas: 0.25.3
numpy: 1.17.3
scipy: 1.3.1
netCDF4: 1.5.3
pydap: installed
h5netcdf: 0.7.4
h5py: 2.10.0
Nio: None
zarr: 2.3.2
cftime: 1.0.4.2
nc_time_axis: None
PseudoNetCDF: installed
rasterio: 1.1.0
cfgrib: 0.9.5.4
iris: None
bottleneck: 1.2.1
dask: 2.6.0
distributed: 2.6.0
matplotlib: 3.1.2
cartopy: None
seaborn: 0.9.0
numbagg: None
setuptools: 41.6.0.post20191101
pip: 19.3.1
conda: installed
pytest: 5.2.2
IPython: 7.9.0
sphinx: None