Description
Elasticsearch version 6.3.2
Plugins installed: []
JVM version: 1.8.144
OS version: Linux 3.13.0-88-generic #135-Ubuntu SMP x86_64 x86_64 x86_64 GNU/Linux
In AbstractSearchAsyncAction::executeNextPhase there's the following code:
if (allowPartialResults == false && shardFailures.get() != null )
This check treats a non-null shardFailures.get() as proof that at least one shard failed.
However, a failed shard can be retried, and when the retry succeeds AbstractSearchAsyncAction::onShardSuccess nulls out that shard's entry. It is therefore possible for shardFailures.get() to be non-null while containing only null ShardSearchFailure entries. When that happens, executeNextPhase fails with "Partial shards failure" even though no shard failure remains.
In addition, the status code in this case is 503.
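The scenario above can be modelled with a small standalone sketch. It is not the Elasticsearch source: the class name ShardFailureSketch, its method names, and the use of AtomicReference, AtomicReferenceArray, and String in place of the real holder and ShardSearchFailure types are all assumptions made for illustration. It only shows why a null check on the array differs from checking for non-null entries after a retried shard succeeds.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical model of the failure bookkeeping described in this issue.
// String stands in for ShardSearchFailure; nothing here is Elasticsearch API.
public class ShardFailureSketch {

    // A shard attempt failed: lazily create the array, then record the failure.
    static void onShardFailure(AtomicReference<AtomicReferenceArray<String>> holder,
                               int shardIndex, int numShards, String reason) {
        holder.compareAndSet(null, new AtomicReferenceArray<>(numShards));
        holder.get().set(shardIndex, reason);
    }

    // A shard (possibly a retry) succeeded: clear its entry, but keep the array.
    static void onShardSuccess(AtomicReference<AtomicReferenceArray<String>> holder,
                               int shardIndex) {
        AtomicReferenceArray<String> failures = holder.get();
        if (failures != null) {
            failures.set(shardIndex, null);
        }
    }

    // The check quoted in this issue: only asks whether the array exists.
    static boolean hasFailuresByNullCheck(AtomicReference<AtomicReferenceArray<String>> holder) {
        return holder.get() != null;
    }

    // One possible alternative (an assumption, not a confirmed fix):
    // look for at least one non-null entry.
    static boolean hasFailuresByEntries(AtomicReference<AtomicReferenceArray<String>> holder) {
        AtomicReferenceArray<String> failures = holder.get();
        if (failures == null) {
            return false;
        }
        for (int i = 0; i < failures.length(); i++) {
            if (failures.get(i) != null) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        AtomicReference<AtomicReferenceArray<String>> holder = new AtomicReference<>();
        // Shard 0 fails once, is retried elsewhere and succeeds; shard 1 succeeds.
        onShardFailure(holder, 0, 2, "Query Failed [Failed to execute main query]");
        onShardSuccess(holder, 0);
        onShardSuccess(holder, 1);
        System.out.println(hasFailuresByNullCheck(holder)); // prints "true"
        System.out.println(hasFailuresByEntries(holder));   // prints "false"
    }
}
```

In this model the array outlives the failure it was created for, so the null check still reports a failure after every shard has succeeded, which would match the "0 shards failed" log line combined with a "Partial shards failure" error.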
This is our query configuration:
SearchRequest{searchType=QUERY_THEN_FETCH, indices=[index], indicesOptions=IndicesOptions[id=38, ignore_unavailable=false, allow_no_indices=true, expand_wildcards_open=true, expand_wildcards_closed=false, allow_aliases_to_multiple_indices=true, forbid_closed_indices=true, ignore_aliases=false], types=[], routing='null', preference='_local', requestCache=null, scroll=null, maxConcurrentShardRequests=30, batchedReduceSize=512, preFilterShardSize=128, allowPartialSearchResults=false
Steps to reproduce:
Provide logs (if relevant):
After a query fails (due to an NPE in a custom Java search script we use) with
org.elasticsearch.search.query.QueryPhaseExecutionException: Query Failed [Failed to execute main query]
and the query is retried on a different node and succeeds, the following appears in the log:
[2019-04-02T09:30:24,986][TRACE][o.e.a.s.TransportSearchAction] [esrec11d-10001-prod-nydc1.nydc1] got first-phase result from [t7GpBRj0TUCXKaYYwiMJVA][index][1]
[2019-04-02T09:30:24,990][TRACE][o.e.a.s.TransportSearchAction] [esrec11d-10001-prod-nydc1.nydc1] got first-phase result from [QBZwuD5MSLyDMD56SoZpWg][index][0]
[2019-04-02T09:30:24,990][DEBUG][o.e.a.s.TransportSearchAction] [esrec11d-10001-prod-nydc1.nydc1] 0 shards failed for phase: [query]