tomwhite (Contributor) commented Oct 5, 2018

These changes enable Scanpy's pre-processing functions to run on distributed engines including Dask and Spark. The Spark integration itself relies on Zap for a distributed version of NumPy.

The main change is the `materialize_as_ndarray` function, which is called at certain points of the computation to materialize intermediate results (not the full matrix). It is a no-op in the non-distributed case.
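To illustrate the idea only, here is a minimal sketch of a "materialize if lazy, otherwise pass through" helper. It is not the code from this PR: the names `materialize` and `LazySum` are hypothetical, and the sketch assumes that a lazy collection exposes a `.compute()` method, as Dask collections do. Anything without such a method is treated as already in memory.

```python
def materialize(x):
    """Hypothetical stand-in for the idea behind materialize_as_ndarray.

    Assumption: lazy collections (for example Dask arrays) expose a
    .compute() method; objects without one are already in memory.
    """
    compute = getattr(x, "compute", None)
    if callable(compute):
        # Distributed case: evaluate the lazy intermediate result.
        return compute()
    # Non-distributed case: no-op, the input is returned unchanged.
    return x


class LazySum:
    """Toy lazy value, used only to exercise the helper without Dask."""

    def __init__(self, values):
        self.values = values

    def compute(self):
        return sum(self.values)


in_memory = [1.0, 2.0, 3.0]
same_object = materialize(in_memory) is in_memory  # pass-through case
lazy_total = materialize(LazySum(in_memory))       # evaluated case
print(same_object, lazy_total)
```

Calling the helper only on small intermediate results (such as per-gene statistics) rather than on the full matrix is what keeps the large data distributed.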

falexwolf (Member) commented

This is awesome! Already very elegant and a very good start! 😄

falexwolf merged commit 0a6de90 into scverse:master on Oct 5, 2018

tomwhite (Contributor, Author) commented Oct 8, 2018

Thanks @falexwolf!

flying-sheep mentioned this pull request Aug 4, 2023 (21 tasks)