Clarify async search REST parameters

When playing with the new async search API I noticed a couple of inconsistencies and potential naming problems that I would like to discuss. Note that it's important to address these now, as the API hasn't been released yet but it is already declared stable in our REST spec.

I asked @karmi for his input to validate my concerns and come up with a proposal. The following are the problems and the changes that we are proposing:

- `wait_for_completion`: it indicates how long you are willing to block and wait for results when submitting an async search, effectively turning the async search into a sync one.
`wait_for_completion` is used in other APIs but with type `boolean`, while in submit async search it is exposed as a `number`, effectively a timeout. This introduces an inconsistency in our REST API, and it will cause issues for some of the language clients.
**Proposal**: rename it to `wait_for_results_timeout`: this way we include the timeout terminology and we don't reuse the existing `wait_for_completion`. Also `results` better explains what it is that users are waiting for compared to `completion`.

- `keep_alive`: it indicates how long the async search is available within the cluster. That means that when this timeout expires, the search will be stopped if still running, or its results will be purged if it has already completed.
The `keep_alive` naming comes from HTTP terminology, where it has to do with connections, while here the semantics are about how long state will be available/stored in the cluster. This could lead to misinterpreting what the parameter does.
**Proposal**: rename it to `keep_results_timeout`: this way we move away from reusing HTTP terminology, and we make it clear that it's also a timeout around how long results will be available. What may not be super clear about this name is that the counting starts when the async search is submitted, not when it is completed. Suggestions are welcome.

- `clean_on_completion`: it indicates whether results should not be stored once they are returned within the above described timeout. 
There is some double negation in its description that makes it hard to understand. Also, the notion of `completion` can be confusing, as it's not about whether the search was completed but whether the results were returned within the provided (currently `wait_for_completion`) timeout. By default, results are not stored when they are returned directly by submit async search. Since it is a `boolean`, it may make users think that they can disable storing results at all times, but storing results cannot be disabled, rightly so, when submit async search did not return them within the timeout.
I considered removing this parameter, because when results have been returned, they could be stored externally. It turns out though that this parameter is useful to make testing deterministic and it makes sense to keep it.
**Proposal**: rename it to `keep_results` and make it an `enum` rather than a `boolean`, with two possible values: `auto` (the default behaviour: store results unless submit async search returned them within `wait_for_results_timeout`) and `always` (store results for later retrieval even if they have been returned by submit async search within the provided timeout). I find that this better reflects the behaviour of the API and aligns well with the above proposed rename of `keep_alive` to `keep_results_timeout`, as they are somewhat related. Note that `always` does not mean forever: the results will still be cleaned up when their validity expires.

- The rename of `wait_for_completion` to `wait_for_results_timeout` should also be applied to the get async search API, but maybe we should consider whether this parameter is useful when retrieving results? Users that are calling get async search are taking advantage of the async nature of async search, hence while I see why one would block and wait when submitting, I don't see why one would block and wait when retrieving results. Is avoiding an additional call when the search is almost complete a good enough reason to expose this parameter?
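
To make the combined semantics of the proposed parameters easier to discuss, here is a minimal Python sketch of how I read them. It is only an illustration, not how the server implements anything: the class and function names are hypothetical, timeouts are assumed to be plain seconds, and `auto` is read as "store unless the results came back within the wait timeout", following the description of `clean_on_completion` above.

```python
from dataclasses import dataclass
from enum import Enum


class KeepResults(Enum):
    """Proposed enum replacing the boolean `clean_on_completion`."""
    AUTO = "auto"      # store only if submit did not return the results itself
    ALWAYS = "always"  # store even if submit already returned the results


@dataclass(frozen=True)
class SubmitParams:
    """Hypothetical container for the proposed submit parameters (seconds assumed)."""
    wait_for_results_timeout: float  # how long submit blocks waiting for results
    keep_results_timeout: float      # validity window, counted from submission
    keep_results: KeepResults = KeepResults.AUTO


def results_stored(params: SubmitParams, search_duration: float) -> bool:
    """Return True if results would be kept in the cluster for later retrieval."""
    # Assumption: a search finishing within the wait timeout is returned by submit.
    returned_on_submit = search_duration <= params.wait_for_results_timeout
    if params.keep_results is KeepResults.ALWAYS:
        return True
    # AUTO: storing cannot be disabled when submit did not return the results.
    return not returned_on_submit


def expires_at(params: SubmitParams, submitted_at: float) -> float:
    """Validity ends relative to submission time, not completion time."""
    return submitted_at + params.keep_results_timeout
```

With `SubmitParams(1.0, 60.0)`, a search that takes 0.5 seconds is returned directly and not stored, one that takes 5 seconds is stored, and in both cases validity would end 60 seconds after submission. Switching to `KeepResults.ALWAYS` stores the fast search too, which is the deterministic behaviour that makes testing easier.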

Clarify async search REST parameters #54069