Using log_gpu_memory with MLflow logger causes an exception. #4411

@dscarmo

Description

🐛 Bug

Using log_gpu_memory with the MLflow logger causes an error. It appears the metric name produced for GPU memory is not accepted by MLflow.

MlflowException: Invalid metric name: 'gpu_id: 0/memory.used (MB)'. Names may only contain alphanumerics, underscores (_), dashes (-), periods (.), spaces ( ), and slashes (/).

To Reproduce

I reproduced the bug with the BoringModel in the Colab notebook linked below:
https://colab.research.google.com/drive/1P8uhSfjvYhKPMyRZH-QmfbOUOfnePy6G?usp=sharing
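
The Colab defines a BoringModel inline; the following is a minimal sketch along the same lines (the model and experiment name here are illustrative, not copied from the notebook), assuming a CUDA GPU and pytorch-lightning 1.0.x:

```python
import torch
from torch.utils.data import DataLoader, Dataset
import pytorch_lightning as pl
from pytorch_lightning.loggers import MLFlowLogger


class RandomDataset(Dataset):
    """Random tensors, just enough to drive a training loop."""
    def __init__(self, size, length):
        self.data = torch.randn(length, size)

    def __getitem__(self, index):
        return self.data[index]

    def __len__(self):
        return len(self.data)


class BoringModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)

    def forward(self, x):
        return self.layer(x)

    def training_step(self, batch, batch_idx):
        return self(batch).sum()

    def configure_optimizers(self):
        return torch.optim.SGD(self.layer.parameters(), lr=0.1)


if __name__ == "__main__":
    model = BoringModel()
    train_loader = DataLoader(RandomDataset(32, 64), batch_size=2)
    trainer = pl.Trainer(
        max_epochs=1,
        gpus=1,
        log_gpu_memory="all",  # produces metric names like 'gpu_id: 0/memory.used (MB)'
        logger=MLFlowLogger(experiment_name="gpu-mem-repro"),
    )
    # Raises MlflowException once the GPU-memory metric is forwarded to MLflow.
    trainer.fit(model, train_loader)
```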

Expected behavior

log_gpu_memory should log GPU memory correctly when using an MLflow logger.
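
As a stopgap (not a proposed upstream fix), one could subclass MLFlowLogger and strip the characters that MLflow rejects before the metrics are forwarded. The class name and regex below are illustrative; it assumes the log_metrics(metrics, step) signature of pytorch-lightning 1.0.x:

```python
import re

from pytorch_lightning.loggers import MLFlowLogger


class SanitizingMLFlowLogger(MLFlowLogger):
    def log_metrics(self, metrics, step=None):
        # MLflow only allows alphanumerics, '_', '-', '.', ' ', and '/';
        # replace anything else (e.g. ':', '(', ')') with an underscore.
        safe = {re.sub(r"[^a-zA-Z0-9_\-. /]", "_", name): value
                for name, value in metrics.items()}
        super().log_metrics(safe, step)
```

With this, 'gpu_id: 0/memory.used (MB)' would be logged as 'gpu_id_ 0/memory.used _MB_' instead of raising.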

Environment

Colab environment:

  • CUDA:
    • GPU:
      • Tesla T4
    • available: True
    • version: 10.1
  • Packages:
    • numpy: 1.18.5
    • pyTorch_debug: False
    • pyTorch_version: 1.6.0+cu101
    • pytorch-lightning: 1.0.3
    • tqdm: 4.41.1
  • System:
    • OS: Linux
    • architecture:
      • 64bit
    • processor: x86_64
    • python: 3.6.9
      • version: #1 SMP Thu Jul 23 08:00:38 PDT 2020


    Labels

    3rd party (Related to a 3rd-party), bug (Something isn't working), help wanted (Open to be worked on)
