@gwbrown gwbrown commented Oct 29, 2018

Previously, if ClusterStateActionSteps or ClusterStateWaitSteps threw an
exception while executing, the exception would only be caught and logged by
the generic ClusterStateUpdateTask machinery and the index would become
stuck on that step.

Now, exceptions thrown in these steps will be caught and the index will
be moved to the Error step.
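
The behaviour described above can be illustrated with a minimal sketch. It assumes a simplified model: `IndexState`, `Step`, `execute`, and `moveToErrorStep` are hypothetical stand-ins written for this example, not the actual Elasticsearch classes or signatures. The idea is only that a step which throws no longer leaves the index stuck; the failure is recorded and the index is parked on an error step.

```java
import java.util.Objects;

// Hypothetical, simplified model; none of these types are the real Elasticsearch classes.
final class ClusterStateStepSketch {

    /** Minimal stand-in for the per-index lifecycle state. */
    static final class IndexState {
        final String step;        // step the index is currently on
        final String failedStep;  // step that threw, or null if none
        final String failure;     // message of the exception, or null if none

        IndexState(String step, String failedStep, String failure) {
            this.step = Objects.requireNonNull(step);
            this.failedStep = failedStep;
            this.failure = failure;
        }
    }

    /** Stand-in for a cluster state step; it may throw while computing the next state. */
    interface Step {
        IndexState perform(IndexState current) throws Exception;
    }

    static final String ERROR_STEP = "ERROR";

    /** Runs one step; on failure, returns a state on the ERROR step instead of rethrowing. */
    static IndexState execute(IndexState current, Step step) {
        try {
            return step.perform(current);
        } catch (Exception e) {
            return moveToErrorStep(current, e);
        }
    }

    /** Records which step failed and why, then moves the index to the ERROR step. */
    static IndexState moveToErrorStep(IndexState current, Exception cause) {
        return new IndexState(ERROR_STEP, current.step, String.valueOf(cause.getMessage()));
    }

    public static void main(String[] args) {
        IndexState start = new IndexState("rollover", null, null);
        IndexState ok = execute(start, s -> new IndexState("shrink", null, null));
        IndexState failed = execute(start, s -> { throw new IllegalStateException("boom"); });
        System.out.println(ok.step + " / " + failed.step + " (failed on " + failed.failedStep + ")");
    }
}
```

In this sketch the failing step's name is kept in `failedStep` so that the failure can be inspected later; whether the real implementation stores it this way is not stated in this thread.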

Credit to @talevy for the test cases; it turns out we both worked on this separately and he got to them first.

@gwbrown gwbrown added the :Data Management/ILM+SLM Index and Snapshot lifecycle management label Oct 29, 2018
@elasticmachine
Pinging @elastic/es-core-infra

@talevy talevy left a comment

Left some comments that I think should be addressed.

"policy [" + policy + "] for index [" + index.getName() + "] failed on step [" + startStep.getKey() + "].", e);
}

private ClusterState moveToErrorStep(final ClusterState state, Step.StepKey currentStepKey, Exception cause) throws IOException {

this should probably be RuntimeException here?

@dakrone dakrone left a comment

I left a question

state = ((ClusterStateActionStep) currentStep).performAction(index, state);
try {
state = ((ClusterStateActionStep) currentStep).performAction(index, state);
} catch (RuntimeException exception) {

Is there a reason we can't catch Exception here?

@gwbrown (Contributor, Author) replied:

This was left over from an earlier draft that had more in the try block, and I wasn't sure whether we wanted to include IOException in this catch. I've changed it.
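
For readers following the `RuntimeException` versus `Exception` question, here is a small hedged sketch of the difference. `CatchScopeSketch`, `Action`, `narrowCatch`, and `broadCatch` are hypothetical names invented for this illustration and do not appear in the PR. It only shows the general Java behaviour: a `catch (RuntimeException e)` clause lets a checked `IOException` propagate, while `catch (Exception e)` handles both.

```java
import java.io.IOException;

// Hypothetical illustration; these names are not part of the PR.
final class CatchScopeSketch {

    /** An action that may fail with a checked IOException or any unchecked exception. */
    interface Action {
        String run() throws IOException;
    }

    /** Catches only unchecked exceptions; a checked IOException escapes to the caller. */
    static String narrowCatch(Action action) throws IOException {
        try {
            return action.run();
        } catch (RuntimeException e) {
            return "handled: " + e.getMessage();
        }
    }

    /** Catches checked and unchecked exceptions alike. */
    static String broadCatch(Action action) {
        try {
            return action.run();
        } catch (Exception e) {
            return "handled: " + e.getMessage();
        }
    }
}
```

With the narrow catch, the caller still has to declare or handle `IOException`; with the broad catch, every failure of the action takes the same handling path.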

}

private ClusterState moveToErrorStep(final ClusterState state, Step.StepKey currentStepKey, RuntimeException cause) throws IOException {
logger.error("policy [{}] for index [{}] failed on step [{}]. Moving to ERROR step", policy, index.getName(), currentStepKey);

s/step/cluster state step/

@gwbrown gwbrown merged commit 6ecb8ff into elastic:index-lifecycle Oct 30, 2018
gwbrown added a commit that referenced this pull request Oct 30, 2018
@gwbrown gwbrown deleted the ilm/clusterstate-step-exceptions branch December 7, 2018 04:57
Labels

blocker >bug :Data Management/ILM+SLM Index and Snapshot lifecycle management
