Description
When calling _stop against a running transform, there should be an option to wait_for_checkpoint. This will allow for better data consistency.
The overall requirements for this flag should probably be as follows:
wait_for_checkpoint should be its own flag, separate from force and wait_for_completion.
wait_for_completion: can be used with any other flag and indicates whether we wait for the task to go away before returning to the listener. Essentially sync vs async.
force: has to be used if the task has failed.
wait_for_checkpoint: cannot be used on a failed task. Since the indexer cannot continue, this flag makes no sense for a failed task, and its value should simply be ignored.
As for the default value for each of them, I think the following makes sense:
force: false
wait_for_completion: false
wait_for_checkpoint: true
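The flag rules and defaults above could be sketched roughly as follows. This is only an illustration: `StopOptions`, `StopAction`, and `StopPlanner` are hypothetical names, not existing Elasticsearch classes, and the sketch assumes that `force` has no effect on a task that has not failed, which the proposal does not spell out.

```java
// Hypothetical sketch of the proposed flag semantics; none of these
// types exist in Elasticsearch.
final class StopOptions {
    final boolean force;
    final boolean waitForCompletion; // sync vs async only; does not change the plan
    final boolean waitForCheckpoint;

    StopOptions(boolean force, boolean waitForCompletion, boolean waitForCheckpoint) {
        this.force = force;
        this.waitForCompletion = waitForCompletion;
        this.waitForCheckpoint = waitForCheckpoint;
    }

    // Proposed defaults: force=false, wait_for_completion=false, wait_for_checkpoint=true
    static StopOptions defaults() {
        return new StopOptions(false, false, true);
    }
}

enum StopAction { REJECT_NEEDS_FORCE, STOP_IMMEDIATELY, STOP_AT_NEXT_CHECKPOINT }

final class StopPlanner {
    static StopAction plan(boolean taskFailed, StopOptions opts) {
        if (taskFailed) {
            // A failed task can only be stopped with force;
            // wait_for_checkpoint is ignored because the indexer cannot continue.
            return opts.force ? StopAction.STOP_IMMEDIATELY : StopAction.REJECT_NEEDS_FORCE;
        }
        // Assumption: on a healthy task, only wait_for_checkpoint decides the behaviour.
        return opts.waitForCheckpoint
            ? StopAction.STOP_AT_NEXT_CHECKPOINT
            : StopAction.STOP_IMMEDIATELY;
    }
}
```

Keeping `wait_for_completion` out of the decision reflects the requirement that it can be combined with any other flag.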
This means that if a user wants to stop a transform, but has noticed that it has stayed in the STOPPING state for a long time, they can use _stop?wait_for_checkpoint=false to make it stop immediately.
This will most likely require a new DataFrameTransformTaskState state of STOPPING so that the transform can signal a stop when ClientDataFrameIndexer#onFinish is called.
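As a rough illustration of how a STOPPING state might behave, here is a minimal sketch. `SketchTaskState` and `SketchTransformTask` are hypothetical stand-ins, not the real DataFrameTransformTaskState or task classes, and `onCheckpointFinished` only plays the role that ClientDataFrameIndexer#onFinish would have in the proposal.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for the task state, with the proposed STOPPING value.
enum SketchTaskState { STARTED, STOPPING, STOPPED, FAILED }

// Hypothetical stand-in for the transform task.
final class SketchTransformTask {
    private final AtomicReference<SketchTaskState> state =
        new AtomicReference<>(SketchTaskState.STARTED);

    SketchTaskState state() {
        return state.get();
    }

    void requestStop(boolean waitForCheckpoint) {
        if (waitForCheckpoint) {
            // Only mark the task; the actual stop happens when the checkpoint completes.
            state.compareAndSet(SketchTaskState.STARTED, SketchTaskState.STOPPING);
        } else {
            // Stop right away, including a task already waiting in STOPPING.
            state.updateAndGet(s ->
                (s == SketchTaskState.STARTED || s == SketchTaskState.STOPPING)
                    ? SketchTaskState.STOPPED
                    : s);
        }
    }

    // Called when the indexer finishes a checkpoint.
    void onCheckpointFinished() {
        // A pending stop request is honoured here; otherwise the task keeps running.
        state.compareAndSet(SketchTaskState.STOPPING, SketchTaskState.STOPPED);
    }
}
```

In this sketch, a second call with `waitForCheckpoint=false` moves a task that is stuck in STOPPING straight to STOPPED, which matches the _stop?wait_for_checkpoint=false escape hatch described above.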