Description
🐛 Bug
When resume_from_checkpoint is passed to Trainer and training is then run (e.g. a call to trainer.fit()), the state used by trainer.test() is always the checkpoint initially given to resume_from_checkpoint, never the newer, better checkpoint produced during that run.
trainer = Trainer(resume_from_checkpoint="path_to_ckpt")  # pass ckpt to Trainer for resuming
trainer.fit(model)  # do some fine-tuning / resume training
trainer.test()  # should use the "best" checkpoint, but instead uses the ckpt passed to resume_from_checkpoint
Please reproduce using the BoringModel and post here
https://colab.research.google.com/drive/1ABXnUP10QUqHeUQmFy-FX26cV2w1JILA?usp=sharing
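For reference, a self-contained reproduction sketch in the spirit of the linked Colab is included below. It assumes PyTorch Lightning 1.1.x APIs; the RandomDataset and BoringModel classes here are illustrative stand-ins for the BoringModel used in the notebook, and the checkpoint resumed from is produced by the first run.

# Reproduction sketch (assumes pytorch-lightning ~1.1); model/dataset are stand-ins for BoringModel.
import torch
from torch.utils.data import DataLoader, Dataset
import pytorch_lightning as pl
from pytorch_lightning.callbacks import ModelCheckpoint


class RandomDataset(Dataset):
    def __init__(self, size=64, length=256):
        self.data = torch.randn(length, size)

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        return self.data[idx]


class BoringModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(64, 2)

    def forward(self, x):
        return self.layer(x)

    def training_step(self, batch, batch_idx):
        loss = self(batch).sum()
        self.log("train_loss", loss)
        return loss

    def validation_step(self, batch, batch_idx):
        self.log("val_loss", self(batch).sum())

    def test_step(self, batch, batch_idx):
        self.log("test_loss", self(batch).sum())

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.01)


train, val, test = (DataLoader(RandomDataset(), batch_size=32) for _ in range(3))
model = BoringModel()

# 1) Train once to produce a checkpoint to resume from.
checkpoint_cb = ModelCheckpoint(monitor="val_loss")
trainer = pl.Trainer(max_epochs=1, callbacks=[checkpoint_cb])
trainer.fit(model, train, val)
first_ckpt = checkpoint_cb.best_model_path

# 2) Resume from that checkpoint, train further, then test.
trainer = pl.Trainer(max_epochs=3, resume_from_checkpoint=first_ckpt,
                     callbacks=[ModelCheckpoint(monitor="val_loss")])
trainer.fit(model, train, val)
# Expected: test() restores the best checkpoint found during the resumed run.
# Observed: test() restores first_ckpt again, i.e. the resume_from_checkpoint path.
trainer.test(test_dataloaders=test)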
Expected behavior
After fine-tuning, the best model checkpoint should be looked up internally (as introduced by #2190) before evaluating on the test dataset.
Environment
- CUDA:
    - GPU:
        - Tesla T4
    - available: True
    - version: 10.1
- Packages:
    - numpy: 1.18.5
    - pyTorch_debug: True
    - pyTorch_version: 1.7.0+cu101
    - pytorch-lightning: 1.1.0
    - tqdm: 4.41.1
- System:
    - OS: Linux
    - architecture:
        - 64bit
    - processor: x86_64
    - python: 3.6.9
    - version: #1 SMP Thu Jul 23 08:00:38 PDT 2020
Additional context
A hotfix is to manually set trainer.resume_from_checkpoint = None between calls to trainer.fit() and trainer.test().
trainer = Trainer(resume_from_checkpoint="path_to_ckpt")  # pass ckpt to Trainer for resuming
trainer.fit(model)
trainer.resume_from_checkpoint = None  # drop the resume ckpt so test() can pick up the best one
trainer.test()
The cause of the issue is that Trainer.test() is implemented internally via a call to Trainer.fit(), so the checkpoint-restore logic runs again regardless of configuration.
Long term, the checkpoint passed via resume_from_checkpoint should most likely be consumed internally (i.e. reset to None) once the state has been restored. Alternatively, the Trainer.testing attribute could be used to restrict CheckpointConnector's use of Trainer.resume_from_checkpoint to the training stage only.
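A rough sketch of both options, for illustration only: the method and attribute names below (CheckpointConnector.restore_weights, trainer.resume_from_checkpoint, trainer.testing) mirror the 1.1 internals but this is not a verified patch against the real connector.

# Illustrative sketch only -- not a verified patch against the real CheckpointConnector.
class CheckpointConnector:
    def __init__(self, trainer):
        self.trainer = trainer

    def restore_weights(self, model):
        ckpt_path = self.trainer.resume_from_checkpoint
        # Option 2: skip the resume checkpoint entirely while testing.
        if ckpt_path is not None and not self.trainer.testing:
            self.restore(ckpt_path, on_gpu=self.trainer.on_gpu)
            # Option 1: consume the path once restored, so a later trainer.test()
            # falls back to the best checkpoint instead of restoring this one again.
            self.trainer.resume_from_checkpoint = None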