Queries using allowPartialSearchResults=false involving only successful retries fail with status 503 #40743

@amirhadadi

Elasticsearch version: 6.3.2

Plugins installed: []

JVM version: 1.8.144

OS version: Linux 3.13.0-88-generic #135-Ubuntu SMP x86_64 x86_64 x86_64 GNU/Linux

In AbstractSearchAsyncAction::executeNextPhase there's the following code:
if (allowPartialResults == false && shardFailures.get() != null )

This check treats a non-null shardFailures.get() as proof that at least one shard failed.
However, a failed shard request can be retried, and AbstractSearchAsyncAction::onShardSuccess then nulls out that shard's entry. As a result, shardFailures.get() can return a non-null array that contains only null ShardSearchFailure entries. When that happens, executeNextPhase fails the request with "Partial shards failure" even though every shard ultimately succeeded.
In addition, the response status code in this case is 503.
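To make the failure mode concrete, here is a minimal standalone sketch. It does not use Elasticsearch classes: `ShardFailureCheck`, `hasFailuresNaive`, and `countFailures` are hypothetical names, and a plain `AtomicReferenceArray<String>` stands in for the shard failure array. It only contrasts a null check on the array with a count of its non-null entries, and is not meant as the actual fix.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical stand-in for the shard failure bookkeeping; not Elasticsearch code.
final class ShardFailureCheck {

    // Mirrors the problematic condition: any allocated array counts as "failures exist".
    static boolean hasFailuresNaive(AtomicReferenceArray<String> failures) {
        return failures != null;
    }

    // Counts only the slots that still hold a failure.
    static int countFailures(AtomicReferenceArray<String> failures) {
        if (failures == null) {
            return 0;
        }
        int count = 0;
        for (int i = 0; i < failures.length(); i++) {
            if (failures.get(i) != null) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Two shards; the array is allocated when the first failure is recorded.
        AtomicReferenceArray<String> failures = new AtomicReferenceArray<>(2);
        failures.set(1, "QueryPhaseExecutionException"); // first attempt on shard 1 fails
        failures.set(1, null);                           // retry succeeds, slot is cleared

        System.out.println(hasFailuresNaive(failures)); // prints "true"
        System.out.println(countFailures(failures));    // prints "0"
    }
}
```

Under these assumptions, a condition based on the number of non-null entries would let the request proceed once all retries have succeeded, while the null check on the array alone still reports a failure.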

This is our query configuration:

SearchRequest{searchType=QUERY_THEN_FETCH, indices=[index], indicesOptions=IndicesOptions[id=38, ignore_unavailable=false, allow_no_indices=true, expand_wildcards_open=true, expand_wildcards_closed=false, allow_aliases_to_multiple_indices=true, forbid_closed_indices=true, ignore_aliases=false], types=[], routing='null', preference='_local', requestCache=null, scroll=null, maxConcurrentShardRequests=30, batchedReduceSize=512, preFilterShardSize=128, allowPartialSearchResults=false

Steps to reproduce:

Provide logs (if relevant):
After a query fails (due to an NPE in a custom Java search script we use) with
org.elasticsearch.search.query.QueryPhaseExecutionException: Query Failed [Failed to execute main query]
and is then retried on a different node and succeeds, the following appears in the log:

[2019-04-02T09:30:24,986][TRACE][o.e.a.s.TransportSearchAction] [esrec11d-10001-prod-nydc1.nydc1] got first-phase result from [t7GpBRj0TUCXKaYYwiMJVA][index][1]

[2019-04-02T09:30:24,990][TRACE][o.e.a.s.TransportSearchAction] [esrec11d-10001-prod-nydc1.nydc1] got first-phase result from [QBZwuD5MSLyDMD56SoZpWg][index][0]

[2019-04-02T09:30:24,990][DEBUG][o.e.a.s.TransportSearchAction] [esrec11d-10001-prod-nydc1.nydc1] 0 shards failed for phase: [query]

Labels: :Search/Search, >bug