Description
Running a transform can increase load on the cluster. In batch mode, and during the first checkpoint of a continuous transform, this can have bad side effects on the cluster.
Transform should support throttling similar to reindex. Throttling is configured in reindex with the setting requests_per_second.
Because transform works as a reducer (reindex is conceptually a mapper), transform should throttle differently:
- we base the calculation on the number of input documents
- most runtime is spent on search/aggregations
- the number of output documents can differ, even within one transform due to data and configuration
For transform we decided to go for docs_per_second.
Setting docs_per_second to null disables throttling (-1 can be used instead of null). Setting docs_per_second to 0 is disallowed.
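The rules above can be sketched in Python. This is only an illustration, not the transform implementation: the function names are hypothetical, rejecting negative values other than -1 is an assumption, and the delay formula (wait until the processed input documents fit the configured rate) is an assumed, reindex-like approach.

```python
from typing import Optional


def normalize_docs_per_second(value: Optional[float]) -> Optional[float]:
    """Return None when throttling is disabled, otherwise the positive rate."""
    # null (None) and -1 both disable throttling, as described in the issue.
    if value is None or value == -1:
        return None
    # 0 is disallowed by the issue; rejecting other negatives is an assumption.
    if value <= 0:
        raise ValueError(
            "docs_per_second must be positive, or null/-1 to disable throttling"
        )
    return float(value)


def throttle_delay(
    input_docs: int, elapsed_seconds: float, docs_per_second: Optional[float]
) -> float:
    """Seconds to wait before the next search (hypothetical helper).

    The calculation is based on the number of input documents, not on the
    number of output documents.
    """
    rate = normalize_docs_per_second(docs_per_second)
    if rate is None:
        return 0.0
    # Time the processed input documents should have taken at the target rate.
    target_seconds = input_docs / rate
    return max(0.0, target_seconds - elapsed_seconds)
```

For example, with docs_per_second set to 500, a page that read 1000 input documents in 0.5 seconds would wait a further 1.5 seconds under this sketch.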
Configuration:
{
  "source": {...},
  "dest": {...},
  "pivot": {
    "group_by": {...},
    "aggregations": {...}
  },
  "settings": {
    "max_page_search_size": XX,
    "docs_per_second": XX
  }
}
docs_per_second is located under a new object: settings.
The setting max_page_search_size, which is currently part of pivot, will be moved there too (the old location is deprecated but kept for BWC). If both are specified, the one under settings wins, because the value under pivot could be a leftover from updating an old configuration.
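The precedence rule for max_page_search_size could look like the following Python sketch. The function name is hypothetical and the configuration is modelled as a plain dictionary, which is an assumption made for illustration only.

```python
from typing import Any, Dict, Optional


def resolve_max_page_search_size(config: Dict[str, Any]) -> Optional[int]:
    """Pick the effective max_page_search_size (hypothetical helper)."""
    settings = config.get("settings") or {}
    pivot = config.get("pivot") or {}
    # The value under settings wins when both locations are specified.
    if settings.get("max_page_search_size") is not None:
        return settings["max_page_search_size"]
    # Fall back to the deprecated location under pivot, kept for BWC.
    return pivot.get("max_page_search_size")
```

A configuration that only sets the value under pivot keeps working, while a configuration that sets both uses the value under settings.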
The whole settings object will be updatable with the update API.
Updates will take effect:
- immediate for settings
- delayed until the next checkpoint for the rest
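The two update timings can be illustrated with a small Python model. This is a sketch under stated assumptions, not the actual implementation: the class and method names are hypothetical, and merging the updated settings into the existing settings (rather than replacing the whole object) is an assumption.

```python
from typing import Any, Dict


class TransformState:
    """Hypothetical model of a running transform and its configuration."""

    def __init__(self, config: Dict[str, Any]) -> None:
        self.active: Dict[str, Any] = dict(config)  # config currently in use
        self.pending: Dict[str, Any] = {}           # waits for next checkpoint

    def apply_update(self, update: Dict[str, Any]) -> None:
        for key, value in update.items():
            if key == "settings":
                # Settings take effect immediately (merge is an assumption).
                merged = dict(self.active.get("settings", {}))
                merged.update(value)
                self.active["settings"] = merged
            else:
                # Everything else is delayed until the next checkpoint.
                self.pending[key] = value

    def on_checkpoint(self) -> None:
        # Promote the delayed changes when a new checkpoint starts.
        self.active.update(self.pending)
        self.pending = {}
```

In this model, an update that changes both docs_per_second and the source is visible for docs_per_second right away, while the new source is only used after on_checkpoint() runs.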