@jimczi (Contributor) commented Mar 10, 2020

This change fixes a race condition in shard group failure callbacks and ensures that we set the correct flag on initial stored responses.

Relates #49931
Closes #53360

jimczi added 4 commits March 10, 2020 23:12
Shard group failure callbacks should be executed before incrementing
the total operations. This is required to ensure that we don't notify
a shard group failure **after** the completion callback.
This change ensures that we set the isRunning flag to `false`
when storing the initial response of an async search request.
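
The ordering rule in the first commit message can be illustrated with a minimal sketch. It is not the Elasticsearch implementation: the class `ShardGroupOrderingSketch`, its methods, and the event strings are hypothetical names chosen for illustration. It assumes completion is triggered by whichever thread brings an atomic counter up to the expected number of operations, and shows why notifying the failure before incrementing that counter keeps the completion event last.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the ordering rule; not the Elasticsearch implementation.
public class ShardGroupOrderingSketch {
    private final int expectedOps;
    private final AtomicInteger totalOps = new AtomicInteger();
    private final List<String> events = Collections.synchronizedList(new ArrayList<>());

    public ShardGroupOrderingSketch(int expectedOps) {
        this.expectedOps = expectedOps;
    }

    // Invoked when every copy of one shard group has failed.
    public void onShardGroupFailure(int shardIndex) {
        // Notify first, so the failure is recorded before this operation is counted.
        events.add("failure:" + shardIndex);
        // Count the operation last; the thread that reaches expectedOps completes the request.
        if (totalOps.incrementAndGet() == expectedOps) {
            events.add("completed");
        }
    }

    // Runs one failing shard group per thread and returns the recorded event order.
    public static List<String> runConcurrently(int shardGroups) {
        ShardGroupOrderingSketch sketch = new ShardGroupOrderingSketch(shardGroups);
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < shardGroups; i++) {
            final int shardIndex = i;
            Thread t = new Thread(() -> sketch.onShardGroupFailure(shardIndex));
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for shard threads", e);
            }
        }
        return new ArrayList<>(sketch.events);
    }

    public static void main(String[] args) {
        List<String> events = runConcurrently(4);
        // Print the last recorded event.
        System.out.println(events.get(events.size() - 1));
    }
}
```

If the two steps in `onShardGroupFailure` were swapped, another thread could observe the final count and record the completion while this thread had not yet reported its failure, which is the kind of late notification the commit message describes.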
@jimczi added the >non-issue, :Search/Search (Search-related issues that do not fall into other categories), and >test-failure (Triaged test failures from CI) labels Mar 10, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-search (:Search/Search)

@javanna (Member) left a comment

I left two questions

final String docId = searchTask.getSearchId().getDocId();
store.storeInitialResponse(docId, searchTask.getOriginHeaders(), searchResponse,
// creates the fallback response if the node crashes/restarts in the middle of the request
// TODO: store intermediate results ?

Member

Can you elaborate on this TODO? Does it revolve around resiliency?

Contributor Author

It does, yes. That's one of the follow-up questions we have in the meta issue.

Member

I thought so. I wonder if we need the TODO in the code then, since we are tracking this elsewhere anyway.

@jimczi jimczi merged commit ab66529 into elastic:master Mar 11, 2020
@jimczi jimczi deleted the async_search_action_tests branch March 11, 2020 16:14

Development

Successfully merging this pull request may close these issues.

[CI] AsyncSearchActionTests fails unpredictably

3 participants