Description
This is a question about casting from and to numpy. I asked a similar question for pandas here: pandas-dev/pandas#27211
The question is whether we can rely on zero-copy wrapping and unwrapping of NumPy arrays in a DataArray, i.e. is it future-proof to assume that something like
import xarray as xr
import numpy as np
X = np.random.uniform(size=(10000, 10))
X_xr = xr.DataArray(X)
X_again = np.asarray(X_xr)
print(X.__array_interface__['data'][0] == X_again.__array_interface__['data'][0])
True
will always print True, meaning that no copy is made?
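Comparing the data pointers works for this case, but it only detects arrays that start at the same address. A slightly more general check, sketched below, uses `np.shares_memory` to test whether the round-tripped array overlaps the original buffer. The helper name `roundtrip_shares_memory` is made up for illustration, and the xarray result is whatever the installed version does; it is not a guarantee about future releases.

```python
import numpy as np


def roundtrip_shares_memory(arr, wrap, unwrap=np.asarray):
    """Return True if unwrap(wrap(arr)) overlaps arr's memory, i.e. no copy was made.

    `wrap` and `unwrap` are arbitrary callables (hypothetical helper, not an xarray API).
    """
    return bool(np.shares_memory(arr, unwrap(wrap(arr))))


X = np.random.uniform(size=(10000, 10))

# Sanity checks with plain NumPy: a view shares memory, an explicit copy does not.
view_ok = roundtrip_shares_memory(X, lambda a: a.view())
copy_ok = roundtrip_shares_memory(X, lambda a: a.copy())

# The case from this question. The outcome depends on the installed xarray version.
try:
    import xarray as xr
    xarray_ok = roundtrip_shares_memory(X, xr.DataArray)
except ImportError:
    xarray_ok = None  # xarray not installed; nothing to report

print("view:", view_ok, "copy:", copy_ok, "xarray:", xarray_ok)
```

A check like this could live in our test suite so that a behaviour change in a dependency is noticed immediately rather than silently costing memory.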
Context: We want to attach some metadata to our NumPy arrays; in particular, I'm interested in column names. Pandas is an obvious candidate for that, as we only have 2D arrays most of the time. However, pandas might change its internal structure so that zero-copy wrapping and unwrapping is no longer possible.
Xarray is another candidate, even though it's a bit unnatural given that our data is usually 2d.
This is a design decision that's very hard to undo, so I want to make sure that it's reasonably future-proof if we want to consider using DataArray as a possible output format.