-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Description
Elasticsearch version (bin/elasticsearch --version): 7.3.0 and 8.0.0
Plugins installed: ['repository-gcs']
JVM version (java -version): openjdk 11.0.2
OS version (uname -a if on a Unix-like system): Darwin Kernel Version 18.6.0
Description of the problem including expected versus actual behavior:
From Slack discussions:
[Ben Trent]
Getting an NPE in Lucene.
The best we can tell it was added here: 63ee61d...cb527c2What we are doing in data-frame is running a DBQ terms query against the
.data-frame-internal-1. Anytime that DBQ is running, a new doc could be added to the index with that same term.Unsure if this same thing happens in 7.3 or 7.2. We are still testing
[Henning Andersen]
An educated guess is that it is the prune ID merge policy change that changed this. If you can reproduce this locally, it is easy to disable this atInternalEngine:2180, simply replacenew PrunePostingsMergePolicy(mergePolicy, IdFieldMapper.NAME)withmergePolicy. Curious to know the result...
The problems is likely the slice on_id, I think a null check is missing inTermsSliceQuerysince we risk a segment with no_idnow (but this is way beyond my knowledge)...
Based on investigations so far, it looks like this issue is caused by #40741.
Steps to reproduce:
- Create a data frame and start it
- Send a stop request while the data frame is finishing
- Try to delete the data frame
- Notice that the delete request fails with a 500, and the reason is an NPE.
Provide logs (if relevant):
Delete data frame API response
{
"phase": "query",
"failed_shards": [{
"node": "qSY9saZrRjm_7YULzSw-RQ",
"reason": {
"reason": null,
"type": "null_pointer_exception"
},
"index": ".data-frame-internal-1",
"shard": 0
}],
"reason": "all shards failed",
"grouped": true,
"caused_by": {
"reason": null,
"caused_by": {
"reason": null,
"type": "null_pointer_exception"
},
"type": "null_pointer_exception"
},
"suppressed": [{
"phase": "query",
"failed_shards": [{
"node": "qSY9saZrRjm_7YULzSw-RQ",
"reason": {
"reason": null,
"type": "null_pointer_exception"
},
"index": ".data-frame-internal-1",
"shard": 0
}],
"reason": "all shards failed",
"grouped": true,
"caused_by": {
"reason": null,
"caused_by": {
"reason": null,
"type": "null_pointer_exception"
},
"type": "null_pointer_exception"
},
"type": "search_phase_execution_exception"
}, {
"phase": "query",
"failed_shards": [{
"node": "qSY9saZrRjm_7YULzSw-RQ",
"reason": {
"reason": null,
"type": "null_pointer_exception"
},
"index": ".data-frame-internal-1",
"shard": 0
}],
"reason": "all shards failed",
"grouped": true,
"caused_by": {
"reason": null,
"caused_by": {
"reason": null,
"type": "null_pointer_exception"
},
"type": "null_pointer_exception"
},
"type": "search_phase_execution_exception"
}, {
"phase": "query",
"failed_shards": [{
"node": "qSY9saZrRjm_7YULzSw-RQ",
"reason": {
"reason": null,
"type": "null_pointer_exception"
},
"index": ".data-frame-internal-1",
"shard": 0
}],
"reason": "all shards failed",
"grouped": true,
"caused_by": {
"reason": null,
"caused_by": {
"reason": null,
"type": "null_pointer_exception"
},
"type": "null_pointer_exception"
},
"type": "search_phase_execution_exception"
}, {
"phase": "query",
"failed_shards": [{
"node": "qSY9saZrRjm_7YULzSw-RQ",
"reason": {
"reason": null,
"type": "null_pointer_exception"
},
"index": ".data-frame-internal-1",
"shard": 0
}],
"reason": "all shards failed",
"grouped": true,
"caused_by": {
"reason": null,
"caused_by": {
"reason": null,
"type": "null_pointer_exception"
},
"type": "null_pointer_exception"
},
"type": "search_phase_execution_exception"
}],
"type": "search_phase_execution_exception",
"root_cause": [{
"reason": null,
"type": "null_pointer_exception"
}]
}ES Logs
...
[2019-06-07T16:10:56,041][TRACE][o.e.x.d.t.p.SchemaUtil ] [dolaru-elastic-mbp.local] Extracted type for [responsetime] : [float] from index [farequote]
[2019-06-07T16:10:56,041][TRACE][o.e.x.d.t.p.SchemaUtil ] [dolaru-elastic-mbp.local] Extracted type for [airline] : [keyword] from index [farequote]
[2019-06-07T16:10:56,041][DEBUG][o.e.x.d.t.p.SchemaUtil ] [dolaru-elastic-mbp.local] Deduced mapping for: [avg_responsetime], agg type [avg] to [double]
[2019-06-07T16:10:56,041][DEBUG][o.e.x.d.t.p.SchemaUtil ] [dolaru-elastic-mbp.local] Deduced mapping for: [airline] to [keyword]
[2019-06-07T16:10:56,054][INFO ][o.e.c.m.MetaDataCreateIndexService] [dolaru-elastic-mbp.local] [farequote_dft_start] creating index, cause [api], templates [], shards [1]/[1], mappings [_doc]
[2019-06-07T16:10:56,055][INFO ][o.e.c.r.a.AllocationService] [dolaru-elastic-mbp.local] updating number_of_replicas to [0] for indices [farequote_dft_start]
[2019-06-07T16:10:56,276][TRACE][o.e.x.d.t.p.SchemaUtil ] [dolaru-elastic-mbp.local] Extracted type for [avg_responsetime] : [double] from index [farequote_dft_start]
[2019-06-07T16:10:56,276][TRACE][o.e.x.d.t.p.SchemaUtil ] [dolaru-elastic-mbp.local] Extracted type for [airline] : [keyword] from index [farequote_dft_start]
[2019-06-07T16:10:56,277][INFO ][o.e.x.d.t.DataFrameTransformTask] [dolaru-elastic-mbp.local] Updating state for data frame transform [test_dft_start] to [{"task_state":"started","indexer_state":"stopped","checkpoint":0}]
[2019-06-07T16:10:56,317][DEBUG][o.e.x.d.t.DataFrameTransformTask] [dolaru-elastic-mbp.local] Successfully updated state for data frame transform [test_dft_start] to [{"task_state":"started","indexer_state":"stopped","checkpoint":0}]
[2019-06-07T16:10:56,318][DEBUG][o.e.x.d.t.DataFrameTransformTask] [dolaru-elastic-mbp.local] Data frame indexer [data_frame/transforms/schedule_test_dft_start] schedule has triggered, state: [STARTED]
[2019-06-07T16:10:56,319][INFO ][o.e.x.d.t.DataFrameTransformPersistentTasksExecutor] [dolaru-elastic-mbp.local] Successfully completed and scheduled task in node operation
[2019-06-07T16:10:56,376][INFO ][o.e.x.d.t.DataFrameTransformTask] [dolaru-elastic-mbp.local] Finished indexing for data frame transform [test_dft_start] checkpoint [1]
[2019-06-07T16:10:56,379][DEBUG][o.e.x.d.t.DataFrameTransformTask] [dolaru-elastic-mbp.local] Updating persistent state of transform [test_dft_start] to [{"task_state":"started","indexer_state":"started","checkpoint":1,"progress":{"total_docs":86274,"docs_remaining":0,"percent_complete":100.0}}]
[2019-06-07T16:10:56,583][DEBUG][o.e.x.d.t.DataFrameTransformTask] [dolaru-elastic-mbp.local] Updating persistent state of transform [test_dft_start] to [{"task_state":"stopped","indexer_state":"stopped","checkpoint":1,"progress":{"total_docs":86274,"docs_remaining":0,"percent_complete":100.0}}]
[2019-06-07T16:10:56,639][INFO ][o.e.x.d.t.DataFrameTransformTask] [dolaru-elastic-mbp.local] Data frame transform [test_dft_start] indexer has stopped
[2019-06-07T16:10:56,740][DEBUG][o.e.a.s.TransportSearchAction] [dolaru-elastic-mbp.local] [.data-frame-internal-1][0], node[C0FEW7SlS7aMxaIxZygbuQ], [P], s[STARTED], a[id=YVgYHIXPSuC0IJDzsLuWcQ]: Failed to execute [SearchRequest{searchType=QUERY_THEN_FETCH, indices=[.data-frame-internal-1], indicesOptions=IndicesOptions[ignore_unavailable=false, allow_no_indices=true, expand_wildcards_open=true, expand_wildcards_closed=false, allow_aliases_to_multiple_indices=true, forbid_closed_indices=true, ignore_aliases=false, ignore_throttled=true], routing='null', preference='null', requestCache=null, scroll=Scroll{keepAlive=5m}, maxConcurrentShardRequests=0, batchedReduceSize=512, preFilterShardSize=128, allowPartialSearchResults=true, localClusterAlias=null, getOrCreateAbsoluteStartMillis=-1, ccsMinimizeRoundtrips=true, source={"size":1000,"query":{"term":{"id":{"value":"test_dft_start","boost":1.0}}},"version":false,"seq_no_primary_term":true,"_source":false,"sort":[{"_doc":{"order":"asc"}}],"slice":{"field":"_id","id":1,"max":5}}}]
org.elasticsearch.transport.RemoteTransportException: [dolaru-elastic-mbp.local][10.4.20.60:9300][indices:data/read/search[phase/query]]
Caused by: org.elasticsearch.search.query.QueryPhaseExecutionException: Query Failed [Failed to execute main query]
at org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:305) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
at org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:113) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
at org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:335) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:360) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
at org.elasticsearch.search.SearchService.lambda$executeQueryPhase$1(SearchService.java:340) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
at org.elasticsearch.action.ActionListener.lambda$map$2(ActionListener.java:145) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:62) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
at org.elasticsearch.search.SearchService$2.doRun(SearchService.java:1052) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:44) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:757) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:835) [?:?]
Caused by: java.lang.NullPointerException
at org.elasticsearch.search.slice.TermsSliceQuery.build(TermsSliceQuery.java:81) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
at org.elasticsearch.search.slice.TermsSliceQuery$1.scorer(TermsSliceQuery.java:63) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
at org.apache.lucene.search.Weight.scorerSupplier(Weight.java:147) ~[lucene-core-8.1.0.jar:8.1.0 dbe5ed0b2f17677ca6c904ebae919363f2d36a0a - ishan - 2019-05-09 19:34:03]
at org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorerSupplier(LRUQueryCache.java:727) ~[lucene-core-8.1.0.jar:8.1.0 dbe5ed0b2f17677ca6c904ebae919363f2d36a0a - ishan - 2019-05-09 19:34:03]
at org.elasticsearch.indices.IndicesQueryCache$CachingWeightWrapper.scorerSupplier(IndicesQueryCache.java:157) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
at org.apache.lucene.search.BooleanWeight.scorerSupplier(BooleanWeight.java:374) ~[lucene-core-8.1.0.jar:8.1.0 dbe5ed0b2f17677ca6c904ebae919363f2d36a0a - ishan - 2019-05-09 19:34:03]
at org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:340) ~[lucene-core-8.1.0.jar:8.1.0 dbe5ed0b2f17677ca6c904ebae919363f2d36a0a - ishan - 2019-05-09 19:34:03]
at org.apache.lucene.search.Weight.bulkScorer(Weight.java:181) ~[lucene-core-8.1.0.jar:8.1.0 dbe5ed0b2f17677ca6c904ebae919363f2d36a0a - ishan - 2019-05-09 19:34:03]
at org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:334) ~[lucene-core-8.1.0.jar:8.1.0 dbe5ed0b2f17677ca6c904ebae919363f2d36a0a - ishan - 2019-05-09 19:34:03]
at org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.bulkScorer(LRUQueryCache.java:808) ~[lucene-core-8.1.0.jar:8.1.0 dbe5ed0b2f17677ca6c904ebae919363f2d36a0a - ishan - 2019-05-09 19:34:03]
at org.elasticsearch.indices.IndicesQueryCache$CachingWeightWrapper.bulkScorer(IndicesQueryCache.java:163) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
at org.elasticsearch.search.internal.ContextIndexSearcher$1.bulkScorer(ContextIndexSearcher.java:166) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:649) ~[lucene-core-8.1.0.jar:8.1.0 dbe5ed0b2f17677ca6c904ebae919363f2d36a0a - ishan - 2019-05-09 19:34:03]
at org.apache.lucene.search.XIndexSearcher.search(XIndexSearcher.java:44) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.1.0 dbe5ed0b2f17677ca6c904ebae919363f2d36a0a - ishan - 2019-05-09 19:34:03]
at org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:177) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:443) ~[lucene-core-8.1.0.jar:8.1.0 dbe5ed0b2f17677ca6c904ebae919363f2d36a0a - ishan - 2019-05-09 19:34:03]
at org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:270) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
at org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:113) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
at org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:335) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:360) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
at org.elasticsearch.search.SearchService.lambda$executeQueryPhase$1(SearchService.java:340) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
at org.elasticsearch.action.ActionListener.lambda$map$2(ActionListener.java:145) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:62) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
at org.elasticsearch.search.SearchService$2.doRun(SearchService.java:1052) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:44) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:757) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
at java.lang.Thread.run(Thread.java:835) ~[?:?]
...