@dakrone dakrone commented Apr 1, 2019

Previously, we set the latch with `nextStep.setLatch` only after the
cluster state change had already been counted down. By that point, execution
could already have started, so the latch could be missed when the
`MockAsyncActionStep` was executed.

This change moves the `nextStep.setLatch` call before the call to
`runPolicyAfterStateChange`, so the latch is always available when the
`MockAsyncActionStep` is executed.
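
To illustrate the ordering issue, here is a minimal standalone sketch. It is not the actual test code: `LatchOrderingSketch`, `MockStep`, `latchSetBeforeRun`, and `latchSetAfterRun` are hypothetical names, and a single-thread executor stands in for whatever thread runs the step after the state change. The second method forces the bad interleaving deterministically by waiting for the step to finish before setting the latch; in the real test this interleaving only happened occasionally.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class LatchOrderingSketch {

    /** Hypothetical stand-in for MockAsyncActionStep: counts down its latch when executed. */
    static class MockStep {
        private volatile CountDownLatch latch; // null until setLatch is called

        void setLatch(CountDownLatch latch) {
            this.latch = latch;
        }

        void execute() {
            CountDownLatch current = latch;
            if (current != null) {
                current.countDown();
            }
            // If no latch has been set yet, the signal is silently lost.
        }
    }

    /** Fixed ordering: the latch is set before the step can possibly run. */
    static boolean latchSetBeforeRun() {
        MockStep step = new MockStep();
        CountDownLatch latch = new CountDownLatch(1);
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            step.setLatch(latch);            // latch is in place first
            executor.execute(step::execute); // step runs asynchronously afterwards
            return latch.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            executor.shutdownNow();
        }
    }

    /** Racy ordering, forced to its worst case: the step finishes before the latch is set. */
    static boolean latchSetAfterRun() {
        MockStep step = new MockStep();
        CountDownLatch latch = new CountDownLatch(1);
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            executor.submit(step::execute).get(); // step has already executed
            step.setLatch(latch);                 // too late: nothing will count this down
            return latch.await(200, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("latch set before run, counted down: " + latchSetBeforeRun()); // true
        System.out.println("latch set after run, counted down: " + latchSetAfterRun());   // false
    }
}
```

In the sketch, setting the latch before handing the step to the executor removes the window in which the step can run without a latch, which is the same idea as moving `nextStep.setLatch` ahead of `runPolicyAfterStateChange`.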

Before this change, I was able to reproduce the failure about once every 30-40
runs. With this change, the test passes across 2000+ runs.

Resolves #40018

@dakrone dakrone added >test Issues or PRs that are addressing/adding tests v7.0.0 :Data Management/ILM+SLM Index and Snapshot lifecycle management v8.0.0 v7.2.0 v6.6.3 v6.7.2 labels Apr 1, 2019
@dakrone dakrone requested a review from gwbrown April 1, 2019 21:18
@elasticmachine
Pinging @elastic/es-core-features

@gwbrown gwbrown left a comment

LGTM, nice find!

@dakrone dakrone merged commit 9fa74a1 into elastic:master Apr 2, 2019
dakrone added a commit that referenced this pull request Apr 2, 2019
…40707)

dakrone added a commit that referenced this pull request Apr 2, 2019
…40707)

dakrone added a commit that referenced this pull request Apr 2, 2019
…40707)

dakrone added a commit that referenced this pull request Apr 2, 2019
…40707)

jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Apr 2, 2019
* master:
  add reason to DataFrameTransformState and add hlrc protocol tests (elastic#40736)
  Remove timezone validation on rollup range queries (elastic#40647)
  Fix testRunStateChangePolicyWithAsyncActionNextStep race condition (elastic#40707)
  Don't mark shard as refreshPending on stats fetching (elastic#40458)
  Name Snapshot Data Blobs by UUID (elastic#40652)
  SQL: [TEST] Mute TIME related failing tests
  [TEST] RecoveryWithConcurrentIndexing test (elastic#40733)
@colings86 colings86 added v6.7.1 and removed v6.7.2 labels Apr 3, 2019
gurkankaymak pushed a commit to gurkankaymak/elasticsearch that referenced this pull request May 27, 2019
…lastic#40707)

Labels

:Data Management/ILM+SLM Index and Snapshot lifecycle management >test Issues or PRs that are addressing/adding tests v6.6.3 v6.7.1 v7.0.0-rc2 v7.2.0 v8.0.0-alpha1

Development

Successfully merging this pull request may close these issues.

[CI] IndexLifecycleRunnerTests.testRunStateChangePolicyWithAsyncActionNextStep failed
