Closed
Labels
bug: Something isn't working
Description
🐛 Bug
Hello guys,
Does anybody have any idea why early stopping was triggered? I checked the value of val_loss, and it is not equal to 0. I found that issues #490 and #492 solved a similar problem, but that was for pl 0.5, and the version I am using is 1.4.5. I also tried 1.5.8 before, and the result was the same.
To Reproduce
def validation_step(self, batch, batch_idx):
    print('batch size:', len(batch['pose_body']))
    drec = self(batch['pose_body'].view(-1, 36))
    loss = self._compute_loss(batch, drec)
    print(loss)
    val_loss = loss['unweighted_loss']['loss_total']
    print('val_loss', val_loss)
    #if self.renderer is not None and self.global_rank == 0 and batch_idx % 500==0 and np.random.rand()>0.5:
    #    out_fname = makepath(self.work_dir, 'renders/vald_rec_E{:03d}_It{:04d}_val_loss_{:.2f}.png'.format(self.current_epoch, batch_idx, val_loss.item()), isfile=True)
    #    self.renderer([batch, drec], out_fname = out_fname)
    #    dgen = self.vp_model.sample_poses(self.vp_ps.logging.num_bodies_to_display)
    #    out_fname = makepath(self.work_dir, 'renders/vald_gen_E{:03d}_I{:04d}.png'.format(self.current_epoch, batch_idx), isfile=True)
    #    self.renderer([dgen], out_fname = out_fname)
    progress_bar = {'v2v': val_loss}
    return {'val_loss': c2c(val_loss), 'progress_bar': progress_bar, 'log': progress_bar}

def validation_epoch_end(self, outputs):
    metrics = {'val_loss': np.nanmean(np.concatenate([v['val_loss'] for v in outputs])) }
    print('metrice:', metrics)
    print('output:' , outputs)
    if self.global_rank == 0:
        self.text_logger('Epoch {}: {}'.format(self.current_epoch, ', '.join('{}:{:.2f}'.format(k, v) for k, v in metrics.items())))
        self.text_logger('lr is {}'.format([pg['lr'] for opt in self.trainer.optimizers for pg in opt.param_groups]))
    metrics = {k: torch.as_tensor(v) for k, v in metrics.items()}
    progress_bar = {'val_loss': metrics['val_loss']}
    return {'val_loss': metrics['val_loss'], 'progress_bar': progress_bar, 'log': metrics}
early_stopping:
  monitor: val_loss
  min_delta: 0.0
  patience: 100
  verbose: True
  mode: min
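
As a reading aid, the settings above can be illustrated with a minimal pure-Python sketch of patience-based early stopping in `min` mode. This is not PyTorch Lightning's implementation: the class name `PatienceStopper` and its method names are hypothetical, and the sketch assumes one monitored value is supplied per validation check.

```python
import math


class PatienceStopper:
    """Hypothetical stand-in for patience-based early stopping (not Lightning's code)."""

    def __init__(self, min_delta=0.0, patience=100, mode="min"):
        if mode not in ("min", "max"):
            raise ValueError("mode must be 'min' or 'max'")
        self.min_delta = min_delta
        self.patience = patience
        self.mode = mode
        # Start from the worst possible value so the first finite value improves on it.
        self.best = math.inf if mode == "min" else -math.inf
        self.wait = 0  # consecutive checks without improvement

    def _improved(self, value):
        # An improvement must beat the best value by more than min_delta.
        if self.mode == "min":
            return value < self.best - self.min_delta
        return value > self.best + self.min_delta

    def should_stop(self, value):
        """Record one monitored value; return True once patience is exhausted."""
        if self._improved(value):
            self.best = value
            self.wait = 0
        else:
            self.wait += 1
        return self.wait >= self.patience
```

Under this reading, `patience: 100` with `min_delta: 0.0` means stopping only after 100 consecutive validation checks in which `val_loss` fails to drop below its best value so far. In this sketch a NaN value never counts as an improvement, because comparisons with NaN are false.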
### Environment
- PyTorch Lightning Version (e.g., 1.4.5):
- PyTorch Version (e.g., 1.7.1):
- Python version (e.g., 3.7):
- OS (e.g., Linux):
- CUDA/cuDNN version: 10.1
- How you installed PyTorch (`conda`,):