Nightly docker-CUDA (3.9, 1.11) fails due to apex installation failure #12365

@akihironitta

🐛 Bug

This is not blocking the current unavailability of GPU testing reported in #12314.

@Borda might already be aware of this, but I'm creating this issue for tracking.

#13 5.357   Running setup.py install for apex: finished with status 'error'
#13 5.360 error: legacy-install-failure
#13 5.360 
#13 5.360 × Encountered error while trying to install package.
#13 5.360 ╰─> apex
#13 5.360 
#13 5.360 note: This is an issue with the package mentioned above, not pip.
#13 5.360 hint: See above for output from the failure.
------
dockers/base-cuda/Dockerfile:129
--------------------
 128 |     
 129 | >>> RUN \
 130 | >>>     # install NVIDIA apex
 131 | >>>     pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" https://github.com/NVIDIA/apex/archive/refs/heads/master.zip && \
 132 | >>>     python -c "from apex import amp"
 133 |     
--------------------
error: failed to solve: process "/bin/bash -c pip install -v --disable-pip-version-check --no-cache-dir --global-option=\"--cpp_ext\" --global-option=\"--cuda_ext\" https://github.com/NVIDIA/apex/archive/refs/heads/master.zip &&     python -c \"from apex import amp\"" did not complete successfully: exit code: 1
Error: buildx failed with: error: failed to solve: process "/bin/bash -c pip install -v --disable-pip-version-check --no-cache-dir --global-option=\"--cpp_ext\" --global-option=\"--cuda_ext\" https://github.com/NVIDIA/apex/archive/refs/heads/master.zip &&     python -c \"from apex import amp\"" did not complete successfully: exit code: 1

To Reproduce

See the scheduled workflow failures: https://github.com/PyTorchLightning/pytorch-lightning/actions/workflows/events-nightly.yml
For example: https://github.com/PyTorchLightning/pytorch-lightning/runs/5562728423?check_suite_focus=true
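
The failing Dockerfile step ends with a smoke test, `python -c "from apex import amp"`. To tell a failed build apart from a build that succeeds but produces a broken package, that check can be run on its own. Below is a minimal sketch of such a check; the helper name `can_import` is hypothetical and not part of the repository or of apex.

```python
import importlib


def can_import(module_name: str) -> bool:
    """Return True if `module_name` imports cleanly, False otherwise.

    `can_import` is a hypothetical helper for illustration only.
    """
    try:
        importlib.import_module(module_name)
    except Exception as err:
        # Covers a missing package (ImportError) as well as errors raised
        # while loading a broken compiled extension.
        print(f"import of {module_name!r} failed: {err!r}")
        return False
    return True
```

Calling `can_import("apex.amp")` inside the image would mirror the Dockerfile's own check. In the log above the step already fails during `pip install`, so the import is never reached.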

Expected behavior

Environment

  • PyTorch Lightning Version (e.g., 1.5.0):
  • PyTorch Version (e.g., 1.10):
  • Python version (e.g., 3.9):
  • OS (e.g., Linux):
  • CUDA/cuDNN version:
  • GPU models and configuration:
  • How you installed PyTorch (conda, pip, source):
  • If compiling from source, the output of torch.__config__.show():
  • Any other relevant information:

Additional context

cc @carmocca @akihironitta @Borda @tchaton @rohitgr7

Labels

bug (Something isn't working), ci (Continuous Integration), priority: 1 (Medium priority task)
