change default indices.lifecycle.poll_interval to something sane #32521

talevy · 2018-07-31T22:31:24Z

This was originally set to a few seconds while prototyping things.
This interval is for the scheduled trigger of policies. Policies
have this extra trigger beyond just on cluster-state changes because
cluster-state changes may not be happening in a cluster for
whatever reason, and we need to continue making progress. Updating
this value to be larger is reasonable since not all operations
are expected to be completed in the span of seconds, but instead in
minutes and hours.

elasticmachine · 2018-07-31T22:31:26Z

Pinging @elastic/es-core-infra

dakrone · 2018-07-31T23:21:07Z

I take it this is the issue to kick off the discussion on what the interval should be? For what it's worth, I vote 5 minutes :)

talevy · 2018-07-31T23:28:36Z

@dakrone yup! let the voting begin! because I sure do not have a clue what the "right" interval is. Anywhere between 5min and 15min sounds reasonable to me. The 15min that was seeded in the PR was from one of my conversations with @colings86.

colings86 · 2018-08-01T08:36:35Z

@dakrone could you explain why you think 5 minutes is a good value here?

The poll interval is used for two purposes: 1) as a fallback to make sure we make progress in the event of a cluster change not happening or a client async listener not firing, and 2) to trigger a check on the rollover conditions, since an index breaking through these conditions does not cause a cluster state change.
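To make the second purpose concrete, here is a minimal sketch of how long a crossed rollover condition can go unnoticed. It assumes polls fire on a fixed grid (minute 0, then every `poll_interval_minutes`), which is a simplification for illustration and not a description of the actual ILM scheduler; the function names are hypothetical.

```python
def next_poll_at_or_after(event_minute: int, poll_interval_minutes: int) -> int:
    """First poll time at or after event_minute, assuming polls at 0, p, 2p, ...

    Uses integer ceiling division so no floating-point rounding is involved.
    """
    ticks = -(-event_minute // poll_interval_minutes)  # ceil(event / interval)
    return ticks * poll_interval_minutes


def detection_delay(event_minute: int, poll_interval_minutes: int) -> int:
    """Minutes between a rollover condition being crossed and the next poll."""
    return next_poll_at_or_after(event_minute, poll_interval_minutes) - event_minute


def worst_case_overshoot_fraction(poll_interval_minutes: int, rollover_period_minutes: int) -> float:
    """Worst-case delay as a fraction of the expected time between rollovers."""
    return poll_interval_minutes / rollover_period_minutes
```

Under these assumptions the delay is always less than one poll interval. For an index expected to roll over every three days (4320 minutes), a 15 minute interval is about 0.35% of the rollover period; for an hourly index it is 25%, which is why high-throughput users may want a shorter interval.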

Reasons I think 15 minutes is a good value:

- We expect typical deployments to take some time to perform actions, so we don't want to fire too often since that's wasteful.
- 15 minutes is a small time compared to the time we are likely to be in a phase anyway (we expect to be in a phase for days generally, although the hot phase might be shorter than this in some cases).
- If we are expecting indexes to roll over in the order of a few days, then checking for rollover conditions every fifteen minutes feels to me like we are being responsive without checking too much, since the index will not have overflowed the criteria by much in fifteen minutes.
- Users who have very high throughput (mainly thinking of those currently on hourly indexes) can change the poll interval to fit their use case easily by updating the cluster setting.
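As an illustration of that last point, the sketch below builds (but does not send) a `PUT _cluster/settings` request that overrides `indices.lifecycle.poll_interval`. The base URL, the `"1m"` value, and the helper name are assumptions for the example; authentication and TLS are left out.

```python
import json
import urllib.request


def build_poll_interval_request(base_url: str, interval: str) -> urllib.request.Request:
    """Build a cluster-settings update for the ILM poll interval.

    base_url is a hypothetical cluster address such as http://localhost:9200;
    interval is an Elasticsearch time value such as "1m" or "10m".
    """
    body = json.dumps(
        {"persistent": {"indices.lifecycle.poll_interval": interval}}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/_cluster/settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Example: a high-throughput cluster on hourly indexes might poll every minute.
request = build_poll_interval_request("http://localhost:9200", "1m")
# Sending it would be: urllib.request.urlopen(request)
```

The request is only constructed here so the example can be read and checked without a running cluster.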

dakrone · 2018-08-02T20:26:48Z

@colings86 it's purely a gut feeling based on what I think a "medium but not long period of time" is.

It's like the definition of "a few": to me, "a few" is 3 to 8, and I'd like ILM to check every few minutes, so that led to the 5 minute interval.

This also includes a big caveat that I'll be completely happy if we go with 15 minutes, just wanted to explain my reasoning :)

talevy · 2018-08-02T20:37:19Z

Since many of these arguments feel like they could work with a variety of other minute values, I will play in the middle: two minutes above "few", and an average of both suggestions... 10 minutes?

colings86 · 2018-08-03T08:15:44Z

@dakrone thanks for explaining your reasoning. Personally I would be comfortable with anything from 5 minutes to 15 minutes. I think anything shorter than 5 minutes would be too often and anything longer than 15 minutes might not be responsive enough for rollover. So I'm happy with the 10 minutes that @talevy proposed 😄

dakrone · 2018-08-03T14:10:49Z

10 minutes it is!

This was originally set to a few seconds while prototyping things. This interval is for the scheduled trigger of policies. Policies have this extra trigger beyond just on cluster-state changes because cluster-state changes may not be happening in a cluster for whatever reason, and we need to continue making progress. Updating this value to be larger is reasonable since not all operations are expected to be completed in the span of seconds, but instead in minutes and hours. 10 minutes is sane.

dakrone

LGTM

talevy · 2018-08-06T21:41:23Z

thanks @dakrone!

talevy added >non-issue :Data Management/ILM+SLM Index and Snapshot lifecycle management labels Jul 31, 2018

talevy requested review from colings86 and dakrone July 31, 2018 22:31

elasticmachine mentioned this pull request Jul 31, 2018

[meta] Index Lifecycle Management Plan #29823

Closed

talevy force-pushed the ilm-poll-interval branch from 18d8cf4 to d3a5e89 Compare August 3, 2018 18:57

talevy force-pushed the ilm-poll-interval branch from d3a5e89 to 22e2bb7 Compare August 6, 2018 16:35

talevy requested review from colings86 and dakrone and removed request for colings86 and dakrone August 6, 2018 21:22

dakrone approved these changes Aug 6, 2018

talevy merged commit 0ad252d into elastic:index-lifecycle Aug 6, 2018

talevy deleted the ilm-poll-interval branch August 6, 2018 21:41
