Question: Guaranteed zero-copy round-trip from numpy? #3077

@amueller

This is a question about casting from and to numpy. I asked a similar question for pandas here: pandas-dev/pandas#27211

The question is whether we can rely on zero-copy wrapping and unwrapping of numpy arrays in a DataArray, i.e. is it future-proof to assume that something like

import xarray as xr
import numpy as np

X = np.random.uniform(size=(10000, 10))
X_xr = xr.DataArray(X)
X_again = np.asarray(X_xr)
print(X.__array_interface__['data'][0] == X_again.__array_interface__['data'][0])
True

will always print True, with no copy being made?
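
A slightly more robust way to run the same check is sketched below. It swaps the raw pointer comparison for `np.shares_memory` and adds a write-through test. The helper name `roundtrip_shares_memory` is made up for illustration, and the result only describes the installed versions of numpy and xarray; it is not a guarantee about future releases.

```python
import numpy as np
import xarray as xr


def roundtrip_shares_memory(arr):
    """Wrap arr in a DataArray, unwrap it again, and report whether
    the unwrapped array overlaps the original buffer."""
    wrapped = xr.DataArray(arr)
    unwrapped = np.asarray(wrapped)
    return np.shares_memory(arr, unwrapped)


X = np.random.uniform(size=(10000, 10))
shares = roundtrip_shares_memory(X)

# Write through the unwrapped array: if no copy was made,
# the original array sees the new value.
X_again = np.asarray(xr.DataArray(X))
X_again[0, 0] = -1.0
write_visible = bool(X[0, 0] == -1.0)

print(shares, write_visible)
```

If both values come out as True, the round trip did not copy on this install. A check like this could live in a test suite so that a behavior change in either library is noticed early.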

Context: We want to attach some metadata to our numpy arrays, in particular I'm interested in column names. Pandas is an obvious candidate for doing that, as we only have 2d arrays most of the time. However, pandas might change their internal structure so that we can't do zero-copy wrapping and unwrapping any more.

Xarray is another candidate, even though it's a bit unnatural given that our data is usually 2d.
This is a design decision that's very hard to undo, so I want to make sure that it's reasonably future-proof if we want to consider using DataArray as a possible output format.
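
For concreteness, here is a minimal sketch of what attaching column names to a 2d array could look like with DataArray. The dimension names `sample` and `feature` and the column names are invented for illustration, and the memory check again only reflects the installed xarray version.

```python
import numpy as np
import xarray as xr

X = np.random.uniform(size=(100, 3))
feature_names = ["age", "height", "weight"]  # hypothetical column names

# Label the second axis with the column names; rows stay unlabeled.
X_named = xr.DataArray(
    X,
    dims=("sample", "feature"),
    coords={"feature": feature_names},
)

# Unwrap: get the column names and the plain ndarray back.
recovered_names = X_named.coords["feature"].values.tolist()
X_plain = X_named.values
no_copy = np.shares_memory(X, X_plain)

print(recovered_names, no_copy)
```

The column names live in a coordinate on the second dimension, so the data buffer itself does not need to change to carry them.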
