Closed
Labels
bug (Something isn't working), feature (Is an improvement or enhancement), help wanted (Open to be worked on), priority: 1 (Medium priority task)
Description
🐛 Bug
ModelCheckpoint cannot save checkpoints whose filename template references a metric with a slash in its name. I use grouped metrics for TensorBoard and would like the checkpoint filename to include my loss, val/loss. However, ModelCheckpoint uses os.path.split, which splits the template in the middle of the metric name: https://github.com/PyTorchLightning/pytorch-lightning/blob/6ac0958166c66ed599c96737b587232b7a33d89e/pytorch_lightning/callbacks/model_checkpoint.py#L258
If I try to use:
ModelCheckpoint("root/dir/{epoch}_{val/loss:.5f}")
The above evaluates to:
self.dirpath = "root/dir/{epoch}_{val"
self.filename = "loss:.5f}"
This inevitably causes a failure when attempting to format the output path.
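The split can be reproduced without Lightning. The snippet below is a minimal sketch that calls os.path.split directly on the same template string; the `template` variable and the use of plain `str.format` on the fragment are illustrative assumptions, not the library's actual formatting code.

```python
import os.path

# Template string taken from the report above.
template = "root/dir/{epoch}_{val/loss:.5f}"

# os.path.split cuts at the last path separator, even when that
# separator sits inside a "{...}" format field.
dirpath, filename = os.path.split(template)
print(dirpath)   # root/dir/{epoch}_{val
print(filename)  # loss:.5f}

# The filename fragment now has an unmatched "}", so formatting it
# with str.format (used here only for illustration) raises ValueError.
try:
    filename.format(epoch=3)
except ValueError as err:
    print(f"format failed: {err}")
```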
To Reproduce
As above, log a metric with a slash in its name, then reference it in the ModelCheckpoint filename template.
Code sample
import pytorch_lightning as pl
from pytorch_lightning.callbacks import ModelCheckpoint

class Module(pl.LightningModule):
    ...
    def validation_step(self, batch, batch_idx):
        x, y = batch
        logits = self.forward(x)
        loss = self.loss_fn(logits, y)
        # The slash groups the metric under "val" in TensorBoard.
        self.log('val/loss', loss, on_epoch=True)
        return loss
    ...

def main():
    trainer = pl.Trainer(checkpoint_callback=ModelCheckpoint("{epoch}_{val/loss:.5f}"))
Expected behavior
Split only along file path boundaries, ignoring variable names yet-to-be-formatted.
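One possible way to get this behavior is to track brace depth and only treat separators outside `{...}` fields as path boundaries. The helper below, `split_template`, is a hypothetical name used for illustration and is not part of pytorch-lightning; it is a sketch that ignores escaped braces (`{{`, `}}`) and does not special-case a leading root separator.

```python
import os.path


def split_template(path):
    """Split path into (dirpath, filename), ignoring separators inside {...}.

    Hypothetical helper for illustration only.
    """
    depth = 0       # how many "{" are currently open
    last_sep = -1   # index of the last separator seen outside braces
    for i, ch in enumerate(path):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth = max(depth - 1, 0)
        elif ch in (os.sep, "/") and depth == 0:
            last_sep = i
    if last_sep < 0:
        # No directory component at all.
        return "", path
    return path[:last_sep], path[last_sep + 1:]


dirpath, filename = split_template("root/dir/{epoch}_{val/loss:.5f}")
print(dirpath)   # root/dir
print(filename)  # {epoch}_{val/loss:.5f}
```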
Per the previous example, we'd expect:
self.dirpath = "root/dir"
self.filename = "{epoch}_{val/loss:.5f}"
Environment
- CUDA:
    - GPU:
        - Tesla V100-SXM2-16GB
        - Tesla V100-SXM2-16GB
        - Tesla V100-SXM2-16GB
        - Tesla V100-SXM2-16GB
    - available: True
    - version: 10.2
- Packages:
    - numpy: 1.19.1
    - pyTorch_debug: False
    - pyTorch_version: 1.6.0
    - pytorch-lightning: 0.10.0
    - tqdm: 4.50.0
- System:
    - OS: Linux
    - architecture:
        - 64bit
        - ELF
    - processor: x86_64
    - python: 3.8.5
    - version: #1 SMP Fri Sep 4 14:19:36 UTC 2020