
WandB dropping items when logging LR or val_loss with accumulate_grad_batches > 1 #5469

@tadejsv


🐛 Bug

As you can see in the BoringModel colab, I get the following warnings from the WandB logger:

wandb: WARNING Step must only increase in log calls.  Step 49 < 98; dropping {'lr-SGD': 0.1}.
wandb: WARNING Step must only increase in log calls.  Step 99 < 198; dropping {'lr-SGD': 0.1}.
wandb: WARNING Step must only increase in log calls.  Step 149 < 199; dropping {'lr-SGD': 0.1}.
wandb: WARNING Step must only increase in log calls.  Step 149 < 298; dropping {'lr-SGD': 0.1}.
wandb: WARNING Step must only increase in log calls.  Step 156 < 299; dropping {'val_loss': 3.9880808209860966e-14, 'epoch': 0}.
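For context, WandB requires the step passed to its log call to increase monotonically; anything logged with a smaller step than the last one is dropped, which is exactly what the warnings above report. A minimal sketch of that behavior (the project name is made up for illustration):

import wandb

run = wandb.init(project="step-demo")
run.log({"train_loss": 1.0}, step=98)   # accepted: step increased
run.log({"lr-SGD": 0.1}, step=49)       # step went backwards -> warning, entry dropped
run.finish()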

This occurs when I add the following to the basic BoringModel:

  • log train loss with self.log()
  • add WandB logger
  • add accumulate_grad_batches > 1
  • either add the LearningRateMonitor callback or log the validation loss with self.log() (or both, as in the colab)

If any one of these is removed, the warnings do not appear.
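A minimal reproduction sketch, assuming pytorch_lightning and wandb are installed (the dataset, project name, and hyperparameters below are illustrative, not copied from the original colab):

import torch
from torch.utils.data import DataLoader, Dataset
import pytorch_lightning as pl
from pytorch_lightning.loggers import WandbLogger
from pytorch_lightning.callbacks import LearningRateMonitor


class RandomDataset(Dataset):
    def __len__(self):
        return 64

    def __getitem__(self, idx):
        return torch.randn(32)


class BoringModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)

    def training_step(self, batch, batch_idx):
        loss = self.layer(batch).sum()
        self.log("train_loss", loss)        # 1) log train loss
        return loss

    def validation_step(self, batch, batch_idx):
        loss = self.layer(batch).sum()
        self.log("val_loss", loss)          # 4) log validation loss
        return loss

    def configure_optimizers(self):
        return torch.optim.SGD(self.layer.parameters(), lr=0.1)


trainer = pl.Trainer(
    max_epochs=2,
    logger=WandbLogger(project="repro"),    # 2) WandB logger
    accumulate_grad_batches=2,              # 3) gradient accumulation
    callbacks=[LearningRateMonitor()],      # 4) LR monitor callback
)
trainer.fit(
    BoringModel(),
    DataLoader(RandomDataset(), batch_size=8),
    DataLoader(RandomDataset(), batch_size=8),
)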

The end result is that the LR metrics are not logged at all. Worse than that, the validation loss (and any other metrics logged at those steps) does not get logged either, making the logger effectively useless.

Labels: bug, help wanted, logger, priority: 1, won't fix
