Initial GPU support #1967

Merged
merged 51 commits into from
Aug 30, 2024
22a5807
Initial implementation of a GPU version of Buffer and NDBuffer
akshaysubr Jun 14, 2024
d8cc79f
Adding cupy as an optional dependency
akshaysubr Jun 14, 2024
4d2b8c7
Adding GPU prototype test
akshaysubr Jun 14, 2024
36b1cb2
Adding GPU memory store implementation
akshaysubr Jun 14, 2024
04001b4
Addressing comments
akshaysubr Jun 17, 2024
74a13c4
Making GpuMemoryStore tests conditional on cupy being available
akshaysubr Jun 17, 2024
bdc0a24
Adding test checking that existing host memory codecs use the gpu_buf…
akshaysubr Jun 18, 2024
d900aa3
Reducing code and docs duplication
akshaysubr Jun 28, 2024
0eca795
Formatting
akshaysubr Jun 28, 2024
d9ed6c4
Fixing silent rebase conflicts
akshaysubr Jun 28, 2024
5405e38
Reducing code duplication in GpuMemoryStore
akshaysubr Jun 28, 2024
2858701
Refactoring to an abstract Buffer class and concrete CPU and GPU impl…
akshaysubr Jul 8, 2024
4e18098
Templating store tests on Buffer type
akshaysubr Jul 8, 2024
35948d4
Changing imports to prevent circular dependencies
akshaysubr Jul 8, 2024
bd2a20b
Fixing unsafe calls to Buffer abstract methods in metadata.py and gro…
akshaysubr Jul 15, 2024
828401f
Preventing calls to abstract classmethods of Buffer and NDBuffer
akshaysubr Jul 15, 2024
02a6e9d
Fixing some more unsafe usage of Buffer abstract class
akshaysubr Aug 9, 2024
ff40d3c
Initial testing with cirun based GPU CI
akshaysubr Aug 9, 2024
e5cfd2f
Reverting to basic ubuntu machine image on GCP
akshaysubr Aug 9, 2024
d473a3d
Switching to cuda image from the docker registry
akshaysubr Aug 9, 2024
2a2e399
Revert "Switching to cuda image from the docker registry"
akshaysubr Aug 9, 2024
b89ab9a
Revert "Reverting to basic ubuntu machine image on GCP"
akshaysubr Aug 9, 2024
c5a387d
Revert "Initial testing with cirun based GPU CI"
akshaysubr Aug 9, 2024
72d172d
Adding pytest mark for GPU tests
akshaysubr Aug 9, 2024
3db61bd
Updating GPU memory store test with gpu mark
akshaysubr Aug 9, 2024
425c3f8
Adding GPU workflow that only runs GPU tests
akshaysubr Aug 9, 2024
75b0ad7
First pass at fixing merge conflicts, still many changes needed
akshaysubr Aug 20, 2024
c8c7e6d
Formatting
akshaysubr Aug 21, 2024
25a67ca
Fixing mypy errors in buffer code
akshaysubr Aug 23, 2024
ce7f5e2
Merging again with v3
akshaysubr Aug 23, 2024
ac061d9
Fixing errors in test_buffer.py
akshaysubr Aug 23, 2024
523d8d5
Fixing errors in test_buffer.py
akshaysubr Aug 23, 2024
b559ee4
Fixing store test errors
akshaysubr Aug 23, 2024
26a74f4
Fixing stateful store test
akshaysubr Aug 23, 2024
7307833
Fixing config test
akshaysubr Aug 23, 2024
f6fddd9
Fixing group tests
akshaysubr Aug 23, 2024
2b1fe14
Fixing indexing tests
akshaysubr Aug 23, 2024
abd135f
Manually installing cupy in the GPU workflow
akshaysubr Aug 23, 2024
1db58e7
Ablating GPU test matrix and adding gpu optional dependencies to the …
akshaysubr Aug 24, 2024
296bd02
Adding some more logging to debug GPU test failures
akshaysubr Aug 26, 2024
b33c887
Adding GA step to install the CUDA toolkit
akshaysubr Aug 26, 2024
c894f60
Merging with v3
akshaysubr Aug 26, 2024
e0da0fb
Adding a separate gputest hatch environment to simplify GPU testing
akshaysubr Aug 27, 2024
07277af
Fixing error in cuda-toolkit step
akshaysubr Aug 28, 2024
6e49e85
Downgrading to CUDA 12.4.1 in cuda-toolkit GA
akshaysubr Aug 28, 2024
02c319c
Trying manual install of the CUDA toolkit
akshaysubr Aug 29, 2024
e82ddc1
Updating environment variables with CUDA installation
akshaysubr Aug 29, 2024
7854ce9
Removing PATH env and setting it only through GITHUB_PATH
akshaysubr Aug 29, 2024
9688ad6
Merge branch 'v3' into gpu-buffer-implementation
akshaysubr Aug 29, 2024
3852c9f
Fixing issue from merge conflict
akshaysubr Aug 29, 2024
2e8069c
Merge branch 'v3' into gpu-buffer-implementation
d-v-b Aug 30, 2024
66 changes: 66 additions & 0 deletions .github/workflows/gpu_test.yml
@@ -0,0 +1,66 @@
# This workflow will install Python dependencies, run tests and lint with a variety of Python versions
# For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions

name: GPU Test V3

on:
  push:
    branches: [ v3 ]
  pull_request:
    branches: [ v3 ]
  workflow_dispatch:

env:
  LD_LIBRARY_PATH: /usr/local/cuda/extras/CUPTI/lib64:/usr/local/cuda/lib64

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  test:
    name: py=${{ matrix.python-version }}, np=${{ matrix.numpy-version }}, deps=${{ matrix.dependency-set }}

    runs-on: gpu-runner
    strategy:
      matrix:
        python-version: ['3.11']
        numpy-version: ['2.0']
        dependency-set: ["minimal"]

    steps:
      - uses: actions/checkout@v4
      # - name: cuda-toolkit
      #   uses: Jimver/[email protected]
      #   id: cuda-toolkit
      #   with:
      #     cuda: '12.4.1'
      - name: Set up CUDA
        run: |
          wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-keyring_1.1-1_all.deb
          sudo dpkg -i cuda-keyring_1.1-1_all.deb
          sudo apt-get update
          sudo apt-get -y install cuda-toolkit-12-6
          echo "/usr/local/cuda/bin" >> $GITHUB_PATH
      - name: GPU check
        run: |
          nvidia-smi
          echo $PATH
          echo $LD_LIBRARY_PATH
          nvcc -V
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
          cache: 'pip'
      - name: Install Hatch and CuPy
        run: |
          python -m pip install --upgrade pip
          pip install hatch
      - name: Set Up Hatch Env
        run: |
          hatch env create gputest.py${{ matrix.python-version }}-${{ matrix.numpy-version }}-${{ matrix.dependency-set }}
          hatch env run -e gputest.py${{ matrix.python-version }}-${{ matrix.numpy-version }}-${{ matrix.dependency-set }} list-env
      - name: Run Tests
        run: |
          hatch env run --env gputest.py${{ matrix.python-version }}-${{ matrix.numpy-version }}-${{ matrix.dependency-set }} run-coverage
35 changes: 34 additions & 1 deletion pyproject.toml
@@ -74,6 +74,9 @@ jupyter = [
     'ipytree>=0.2.2',
     'ipywidgets>=8.0.0',
 ]
+gpu = [
+    "cupy-cuda12x",
+]

Review comment (Member) on lines +77 to +79:

Suggested change
-gpu = [
-    "cupy-cuda12x",
-]
+cuda12 = [
+    "cupy-cuda12x",
+]

 docs = [
     'sphinx',
     'sphinx-autobuild>=2021.3.14',
@@ -120,7 +123,7 @@ build.hooks.vcs.version-file = "src/zarr/_version.py"
 [tool.hatch.envs.test]
 dependencies = [
     "numpy~={matrix:numpy}",
-    "universal_pathlib"
+    "universal_pathlib",
 ]
 features = ["test", "extra"]

@@ -134,8 +137,34 @@ python = ["3.10", "3.11", "3.12"]
 numpy = ["1.24", "1.26", "2.0"]
 features = ["optional"]

+[[tool.hatch.envs.test.matrix]]
+python = ["3.10", "3.11", "3.12"]
+numpy = ["1.24", "1.26", "2.0"]
+features = ["gpu"]

Review comment (Member) on `features = ["gpu"]`:

Suggested change
-features = ["gpu"]
+features = ["cuda12"]

+
 [tool.hatch.envs.test.scripts]
 run-coverage = "pytest --cov-config=pyproject.toml --cov=pkg --cov=tests"
+run-coverage-gpu = "pip install cupy-cuda12x && pytest -m gpu --cov-config=pyproject.toml --cov=pkg --cov=tests"
 run = "run-coverage --no-cov"
 run-verbose = "run-coverage --verbose"
 run-mypy = "mypy src"
+run-hypothesis = "pytest --hypothesis-profile ci tests/v3/test_properties.py tests/v3/test_store/test_stateful*"
+list-env = "pip list"
+
+[tool.hatch.envs.gputest]
+dependencies = [
+    "numpy~={matrix:numpy}",
+    "universal_pathlib",
+]
+features = ["test", "extra", "gpu"]
+
+[[tool.hatch.envs.gputest.matrix]]
+python = ["3.10", "3.11", "3.12"]
+numpy = ["1.24", "1.26", "2.0"]
+version = ["minimal"]
+
+[tool.hatch.envs.gputest.scripts]
+run-coverage = "pytest -m gpu --cov-config=pyproject.toml --cov=pkg --cov=tests"
+run = "run-coverage --no-cov"
+run-verbose = "run-coverage --verbose"
+run-mypy = "mypy src"
@@ -223,4 +252,8 @@ filterwarnings = [
     "error:::zarr.*",
     "ignore:PY_SSIZE_T_CLEAN will be required.*:DeprecationWarning",
     "ignore:The loop argument is deprecated since Python 3.8.*:DeprecationWarning",
+    "ignore:Creating a zarr.buffer.gpu.*:UserWarning",
 ]
+markers = [
+    "gpu: mark a test as requiring CuPy and GPU"

Review comment (Member) on the `gpu` marker line:

Maybe it is worth using cuda here?

Reply (Member):

You're thinking, @jakirkham, that then there could be a multiplicity of these.

+]
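The `gpu` marker registered above is what `pytest -m gpu` selects in the `gputest` scripts. A minimal sketch of how a test might opt in is shown below; the test name, its body, and the `HAS_CUPY` helper are hypothetical and not taken from this PR, and the skip condition is only one possible way to make a test conditional on CuPy being available.

```python
import importlib.util

import pytest

# True only when CuPy can be found in the current environment (hypothetical helper).
HAS_CUPY = importlib.util.find_spec("cupy") is not None


@pytest.mark.gpu  # selected by `pytest -m gpu`
@pytest.mark.skipif(not HAS_CUPY, reason="CuPy is not installed")
def test_gpu_roundtrip() -> None:
    # Imported lazily so that collecting this module never requires CuPy.
    import cupy as cp

    data = cp.arange(8, dtype="uint8")
    # Copy device memory back to the host and compare with the expected values.
    assert cp.asnumpy(data).tolist() == list(range(8))
```

With this layout, a plain `pytest` run on a machine without CuPy reports the test as skipped, while the GPU workflow runs only the marked tests.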
3 changes: 2 additions & 1 deletion src/zarr/codecs/blosc.py
@@ -10,7 +10,8 @@

 from zarr.abc.codec import BytesBytesCodec
 from zarr.core.array_spec import ArraySpec
-from zarr.core.buffer import Buffer, as_numpy_array_wrapper
+from zarr.core.buffer import Buffer
+from zarr.core.buffer.cpu import as_numpy_array_wrapper
 from zarr.core.common import JSON, parse_enum, parse_named_configuration, to_thread
 from zarr.registry import register_codec

3 changes: 2 additions & 1 deletion src/zarr/codecs/gzip.py
@@ -7,7 +7,8 @@

 from zarr.abc.codec import BytesBytesCodec
 from zarr.core.array_spec import ArraySpec
-from zarr.core.buffer import Buffer, as_numpy_array_wrapper
+from zarr.core.buffer import Buffer
+from zarr.core.buffer.cpu import as_numpy_array_wrapper
 from zarr.core.common import JSON, parse_named_configuration, to_thread
 from zarr.registry import register_codec

3 changes: 2 additions & 1 deletion src/zarr/codecs/zstd.py
@@ -9,7 +9,8 @@

 from zarr.abc.codec import BytesBytesCodec
 from zarr.core.array_spec import ArraySpec
-from zarr.core.buffer import Buffer, as_numpy_array_wrapper
+from zarr.core.buffer import Buffer
+from zarr.core.buffer.cpu import as_numpy_array_wrapper
 from zarr.core.common import JSON, parse_named_configuration, to_thread
 from zarr.registry import register_codec

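The three codec hunks above only move the `as_numpy_array_wrapper` import into `zarr.core.buffer.cpu`, which suggests the helper is specific to host memory. The sketch below illustrates the general idea of such a wrapper under that assumption: a codec function that works on NumPy data is applied to a buffer and its output is wrapped in a new buffer. `HostBuffer` and `numpy_array_wrapper` are hypothetical stand-ins, not the zarr classes or their signatures.

```python
import gzip
from typing import Callable

import numpy as np


class HostBuffer:
    """Hypothetical CPU buffer: a flat uint8 NumPy array."""

    def __init__(self, array: np.ndarray) -> None:
        self._array = array

    @classmethod
    def from_bytes(cls, raw: bytes) -> "HostBuffer":
        return cls(np.frombuffer(raw, dtype="B"))

    def as_numpy_array(self) -> np.ndarray:
        return self._array

    def to_bytes(self) -> bytes:
        return self._array.tobytes()


def numpy_array_wrapper(
    func: Callable[[np.ndarray], bytes], buf: HostBuffer
) -> HostBuffer:
    """Run a host-only codec function on a buffer and wrap its output in a new buffer."""
    return HostBuffer.from_bytes(func(buf.as_numpy_array()))


raw = HostBuffer.from_bytes(b"zarr " * 100)
compressed = numpy_array_wrapper(lambda a: gzip.compress(a.tobytes()), raw)
restored = numpy_array_wrapper(lambda a: gzip.decompress(a.tobytes()), compressed)
```

A GPU-backed buffer would first need a device-to-host copy before a function like this could run, which is one reason to keep the helper in a CPU-specific module.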
13 changes: 11 additions & 2 deletions src/zarr/core/array.py
Original file line number Diff line number Diff line change
Expand Up @@ -512,15 +512,24 @@ async def _set_selection(

# check value shape
if np.isscalar(value):
value = np.asanyarray(value, dtype=self.metadata.dtype)
array_like = prototype.buffer.create_zero_length().as_array_like()
if isinstance(array_like, np._typing._SupportsArrayFunc):
# TODO: need to handle array types that don't support __array_function__
# like PyTorch and JAX
array_like_ = cast(np._typing._SupportsArrayFunc, array_like)
value = np.asanyarray(value, dtype=self.metadata.dtype, like=array_like_)
else:
if not hasattr(value, "shape"):
value = np.asarray(value, self.metadata.dtype)
# assert (
# value.shape == indexer.shape
# ), f"shape of value doesn't match indexer shape. Expected {indexer.shape}, got {value.shape}"
if not hasattr(value, "dtype") or value.dtype.name != self.metadata.dtype.name:
value = np.array(value, dtype=self.metadata.dtype, order="A")
if hasattr(value, "astype"):
# Handle things that are already NDArrayLike more efficiently
value = value.astype(dtype=self.metadata.dtype, order="A")
else:
value = np.array(value, dtype=self.metadata.dtype, order="A")
value = cast(NDArrayLike, value)
# We accept any ndarray like object from the user and convert it
# to a NDBuffer (or subclass). From this point onwards, we only pass
Expand Down
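The hunk above relies on NumPy's `like=` argument so that a scalar is materialised as the same kind of array the buffer prototype uses, and on `astype` so that values which are already array-like are converted by their own library. The standalone sketch below mirrors that logic with NumPy only; `coerce_value` is a hypothetical helper, and the fallback branch for array types without `__array_function__` is an assumption, since the PR leaves that case as a TODO.

```python
import numpy as np


def coerce_value(value, dtype, like):
    """Coerce `value` to `dtype`, following the array type of `like` where possible."""
    dtype = np.dtype(dtype)
    if np.isscalar(value):
        if hasattr(like, "__array_function__"):
            # `like=` lets NumPy dispatch array creation to the library that
            # owns the reference array, so the scalar ends up as that array type.
            return np.asanyarray(value, dtype=dtype, like=like)
        # Assumed fallback for array types without __array_function__.
        return np.asanyarray(value, dtype=dtype)
    if not hasattr(value, "shape"):
        # Plain Python sequences are converted to a NumPy array first.
        value = np.asarray(value, dtype)
    if not hasattr(value, "dtype") or value.dtype.name != dtype.name:
        if hasattr(value, "astype"):
            # Already array-like: let its own astype handle the conversion.
            return value.astype(dtype, order="A")
        return np.array(value, dtype=dtype, order="A")
    return value


# Stands in for `prototype.buffer.create_zero_length().as_array_like()`.
host_like = np.empty(0, dtype="uint8")
scalar = coerce_value(3, "float32", host_like)
converted = coerce_value(np.arange(4, dtype="int64"), "float32", host_like)
```

With a NumPy reference array the result is an ordinary `numpy.ndarray`; passing a CuPy array as `like` would be expected to produce a CuPy array instead, which is not exercised here.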
19 changes: 19 additions & 0 deletions src/zarr/core/buffer/__init__.py
@@ -0,0 +1,19 @@
from zarr.core.buffer.core import (
    ArrayLike,
    Buffer,
    BufferPrototype,
    NDArrayLike,
    NDBuffer,
    default_buffer_prototype,
)
from zarr.core.buffer.cpu import numpy_buffer_prototype

__all__ = [
    "ArrayLike",
    "Buffer",
    "NDArrayLike",
    "NDBuffer",
    "BufferPrototype",
    "default_buffer_prototype",
    "numpy_buffer_prototype",
]
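The package layout above separates the abstract definitions in `zarr.core.buffer.core` from the NumPy-backed ones in `zarr.core.buffer.cpu`, matching the commit "Refactoring to an abstract Buffer class and concrete CPU and GPU impl…". The sketch below is a heavily simplified illustration of that split, not the zarr implementation: the method set, the `CpuBuffer` name, and the single-field prototype are assumptions made for the example.

```python
from abc import ABC, abstractmethod
from typing import NamedTuple

import numpy as np


class Buffer(ABC):
    """Abstract flat byte buffer; subclasses decide where the memory lives."""

    def __init__(self, data) -> None:
        self._data = data

    @classmethod
    @abstractmethod
    def create_zero_length(cls) -> "Buffer": ...

    @classmethod
    @abstractmethod
    def from_bytes(cls, raw: bytes) -> "Buffer": ...

    @abstractmethod
    def to_bytes(self) -> bytes: ...

    def as_array_like(self):
        # Expose the underlying array without copying it.
        return self._data


class CpuBuffer(Buffer):
    """Concrete buffer backed by a NumPy uint8 array in host memory."""

    @classmethod
    def create_zero_length(cls) -> "CpuBuffer":
        return cls(np.empty(0, dtype="B"))

    @classmethod
    def from_bytes(cls, raw: bytes) -> "CpuBuffer":
        return cls(np.frombuffer(raw, dtype="B"))

    def to_bytes(self) -> bytes:
        return self._data.tobytes()


class BufferPrototype(NamedTuple):
    """Bundles the concrete buffer class that callers should construct."""

    buffer: type[Buffer]


cpu_prototype = BufferPrototype(buffer=CpuBuffer)
empty = cpu_prototype.buffer.create_zero_length()
payload = cpu_prototype.buffer.from_bytes(b"zarr")
```

Code that receives a prototype never names a concrete class, so a GPU-backed subclass could be swapped in by passing a different prototype. Instantiating the abstract `Buffer` directly raises `TypeError`, which is the kind of unsafe usage several commits in this PR remove.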