ModelCheckpoint does not save checkpoint on training end #8126

@GuillaumeTong

🚀 Feature

See title: ModelCheckpoint should save a checkpoint when training ends.

Motivation

When training ends, whether through a keyboard interrupt, an unexpected error, reaching the end of the intended training period, or any other means, it is very desirable to keep a checkpoint of the most recent training state.

Pitch

Imagine you need to interrupt the current training run, but the last checkpoint was saved hours ago and you cannot wait another 3000 steps for the next one. The system should save a checkpoint for you when you stop training.

Alternatives

Users could subclass ModelCheckpoint and override its on_fit_end hook to save a final checkpoint themselves.

Labels: feature (is an improvement or enhancement), help wanted (open to be worked on), priority: 2 (low priority task), waiting on author (waiting on user action, correction, or update)