v1.5.0 breaks wandb hyperparameter sweeps in Colab #10336

@garrett361

🐛 Bug

After upgrading to v1.5.0, wandb hyperparameter sweeps run in Colab notebooks fail, emitting a UserWarning and raising a ValueError:

UserWarning: There is a wandb run already in progress and newly created instances of `WandbLogger` will reuse this run. If this is not desired, call `wandb.finish()` before instantiating `WandbLogger`.
ValueError('signal only works in main thread')

The wandb sweep then terminates without performing any training runs. Downgrading to pytorch-lightning v1.4.9 resolves the issue.
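
For context on the second error: this is the message Python's standard library raises when `signal.signal` is called from any thread other than the main one. A plausible reading, stated here as an assumption rather than something confirmed in this report, is that the sweep runs the training function in a worker thread and that something in the v1.5.0 training path tries to register a signal handler there. The sketch below reproduces only the standard-library behavior, with no wandb or Lightning code involved; `try_register_handler` is a hypothetical helper name.

```python
import signal
import threading


def try_register_handler(results):
    """Try to install a SIGTERM handler and record what happened."""
    try:
        previous = signal.signal(signal.SIGTERM, signal.SIG_DFL)
        # Put back whatever handler was installed before, if Python knows it.
        if previous is not None:
            signal.signal(signal.SIGTERM, previous)
        results.append("ok")
    except ValueError as err:
        # Raised when the call is made outside the main thread.
        results.append(str(err))


# Called directly: succeeds when this script runs in the main thread.
main_results = []
try_register_handler(main_results)

# Called from a worker thread: signal.signal refuses with a ValueError.
worker_results = []
worker = threading.Thread(target=try_register_handler, args=(worker_results,))
worker.start()
worker.join()

print("direct call:", main_results)
print("worker thread:", worker_results)
```

If the assumption holds, the message recorded for the worker thread should match the ValueError quoted above, which would point at signal registration inside a threaded sweep run as the place to look.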

To Reproduce

Reproduced at the end of this minimally modified BoringModel Colab notebook:
https://colab.research.google.com/drive/16nylTt8jGAbiSfq7zr7YlLOohfIbFqHq?usp=sharing

Expected behavior

wandb sweep should run training loops without terminating due to these errors.

Environment

  • CUDA:
    • GPU:
      • Tesla V100-SXM2-16GB
    • available: True
    • version: 11.1
  • Packages:
    • numpy: 1.19.5
    • pyTorch_debug: False
    • pyTorch_version: 1.9.0+cu111
    • pytorch-lightning: 1.5.0
    • tqdm: 4.62.3
  • System:
    • OS: Linux
    • architecture:
      • 64bit
    • processor: x86_64
    • python: 3.7.12
    • version: #1 SMP Sat Jun 5 09:50:34 PDT 2021

Labels

bug, help wanted, logger: wandb, priority: 0
