Load ckpt_path when given to test/validate/predict #8347

@SeanNaren

🚀 Feature

When ckpt_path is passed to the Trainer's test/validate/predict functions, they should load the weights from that checkpoint even if a model is provided.

Motivation

I noticed that one of our DeepSpeed tests was incorrect (see here). resume_from_checkpoint does not re-load the weights for test/validate/predict, which is probably the right thing to do. However, when I modified the test to pass ckpt_path to the test function, I noticed the weights are still not loaded, which is the current default behaviour.

As described by @carmocca, I suggest we change the behaviour as follows:

BEFORE

trainer.test(model, ckpt_path=None) # use provided model
trainer.test(model, ckpt_path='best') # use provided model, ignore ckpt_path
trainer.test(model, ckpt_path='my_path') # use provided model, ignore ckpt_path

trainer.fit(model)
# then
trainer.test(ckpt_path=None) # use latest model
trainer.test(ckpt_path='my_path') # load path

AFTER

trainer.test(model, ckpt_path=None) # use provided model
trainer.test(model, ckpt_path='best') # load best model
trainer.test(model, ckpt_path='my_path') # load path

trainer.fit(model)
# then
trainer.test(ckpt_path=None) # load best model
trainer.test(ckpt_path='my_path') # load path

This imo brings the behaviour in line with what's expected, and it allows DeepSpeed to be used as an engine in cases where inference cannot happen without the Trainer (for example, when there is sharding orchestration).
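To make the AFTER rules concrete, here is a minimal sketch of the decision they describe, written as a standalone function. The name `resolve_weights_source` and its parameters `model_provided` and `best_model_path` are hypothetical and only illustrate the proposal; this is not the Trainer's actual implementation. Raising an error when no best checkpoint is known is also an assumption on my part.

```python
from typing import Optional


def resolve_weights_source(
    model_provided: bool,
    ckpt_path: Optional[str],
    best_model_path: str = "",
) -> Optional[str]:
    """Return the checkpoint path to load, or None to keep the provided model's weights.

    Hypothetical helper mirroring the AFTER table; not part of any library.
    """
    if ckpt_path is None:
        if model_provided:
            # trainer.test(model, ckpt_path=None): use the provided model as-is
            return None
        # trainer.test(ckpt_path=None) after fit: fall back to the best checkpoint
        ckpt_path = "best"

    if ckpt_path == "best":
        # Assumption: fail loudly if no best checkpoint has been recorded
        if not best_model_path:
            raise ValueError("ckpt_path='best' was requested but no best checkpoint is known")
        return best_model_path

    # An explicit path is always loaded, whether or not a model was passed
    return ckpt_path
```

The key difference from the BEFORE behaviour is that `model_provided` only matters when `ckpt_path` is None; any explicit `ckpt_path` wins over the in-memory weights.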

Labels

feature (Is an improvement or enhancement), help wanted (Open to be worked on)
