Fitting hangs at "cleaning up ddp environment..." when tpu_cores=8 #6390

@KijitoraButi

Description

🐛 Bug

When the Trainer is created with tpu_cores=8, Trainer.fit hangs at "cleaning up ddp environment...".

Please reproduce using the BoringModel

https://colab.research.google.com/drive/1tJswNaT0I-GrGsi6ngwwRDUmeFY1pFr3?usp=sharing

To Reproduce

Run the notebook linked above.

Expected behavior

Trainer.fit finishes normally.

Environment

  • PyTorch Version (e.g., 1.0): 1.7.0a0+7e71a98
  • OS (e.g., Linux): Linux
  • How you installed PyTorch (conda, pip, source): pip
  • Build command you used (if compiling from source): N/A
  • Python version: 3.7.10
  • CUDA/cuDNN version: N/A
  • GPU models and configuration: N/A
  • Any other relevant information: I tried the notebook on Google Colab with TPU.

Metadata

Labels

  • accelerator: tpu (Tensor Processing Unit)
  • bug (Something isn't working)
  • help wanted (Open to be worked on)
  • priority: 0 (High priority task)
