Description
Elasticsearch version: 5.2.2
Plugins installed: []
JVM version: OpenJDK 1.8.0_111
OS version: Ubuntu Linux 14.04
Description of the problem including expected versus actual behavior:
MapperService#parentTypes is rewrapped in an UnmodifiableSet in MapperService#internalMerge every time the cluster state is updated. After thousands of updates the collection is wrapped so deeply that calling a method on it generates a StackOverflowError.
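To illustrate the mechanism, here is a minimal, self-contained sketch. It is not the Elasticsearch code: `RewrapDemo`, `View`, `rewrap` and `depth` are hypothetical names, and a hand-written delegating wrapper stands in for the unmodifiable set so that the behaviour does not depend on the JDK version. Each extra layer adds one stack frame to every `contains` call, so a deep enough chain can exhaust the thread stack.

```java
import java.util.AbstractSet;
import java.util.Collections;
import java.util.Iterator;
import java.util.Set;

public class RewrapDemo {
    /** Hypothetical view that forwards every call to the set it wraps. */
    private static final class View<T> extends AbstractSet<T> {
        final Set<T> delegate;

        View(Set<T> delegate) {
            this.delegate = delegate;
        }

        @Override
        public boolean contains(Object o) {
            // One extra stack frame per wrapper layer.
            return delegate.contains(o);
        }

        @Override
        public Iterator<T> iterator() {
            return delegate.iterator();
        }

        @Override
        public int size() {
            return delegate.size();
        }
    }

    /** Wraps the set once per simulated update, without checking whether it is already wrapped. */
    public static <T> Set<T> rewrap(Set<T> base, int times) {
        Set<T> current = base;
        for (int i = 0; i < times; i++) {
            current = new View<>(current);
        }
        return current;
    }

    /** Counts wrapper layers iteratively, so counting itself cannot overflow the stack. */
    public static int depth(Set<?> set) {
        int layers = 0;
        while (set instanceof View) {
            set = ((View<?>) set).delegate;
            layers++;
        }
        return layers;
    }

    public static void main(String[] args) {
        // The layer count is an arbitrary assumption; the depth that fails depends on the stack size (-Xss).
        Set<String> deep = rewrap(Collections.singleton("parent"), 1_000_000);
        System.out.println("layers: " + depth(deep));
        try {
            deep.contains("parent");
            System.out.println("contains() returned normally");
        } catch (StackOverflowError e) {
            System.out.println("contains() overflowed the stack");
        }
    }
}
```

Whether the call overflows at a given depth depends on the thread stack size, which is why the sketch catches the error instead of assuming a threshold.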
I encountered this after upgrading from 5.1 to 5.2: before 5.2, parentTypes was only conditionally wrapped, whereas it is now wrapped on every cluster state update. In my use case I create an alias per user; each alias creation results in a cluster state update, which wraps parentTypes again. The depth at which the error occurs depends on JVM configuration, but in my production cluster the StackOverflowError occurs after about 30k cluster state updates/new aliases. (I had to use -XX:MaxJavaStackTraceDepth to obtain the full stack trace, since otherwise the trace was truncated.) In the worst case, if all the nodes in my cluster were started at about the same time, they hit this StackOverflowError at about the same time and the whole cluster goes down. Presumably, before the actual error occurs, there is also some performance penalty from calls digging through the deeply nested collection object graph.
I don't really need to be using aliases, so I am going to factor them out of my use case, but I do believe this is a bug, since the StackOverflowError manifests after adding a number of aliases that Elasticsearch should easily be able to handle (and has handled in past versions).
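One way to keep the nesting depth constant is to wrap only when the set's contents actually change, and to wrap a fresh copy rather than the previous view. The sketch below shows that pattern under stated assumptions: `ParentTypesHolder` and its `merge` method are hypothetical stand-ins, not the actual MapperService code or the fix that was applied.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical holder for a read-only parentTypes view; not the real MapperService. */
public class ParentTypesHolder {
    // Published read-only view; starts out empty.
    private volatile Set<String> parentTypes = Collections.emptySet();

    /** Simulates one merge step triggered by a cluster state update. */
    public void merge(Set<String> newParentTypes) {
        if (parentTypes.containsAll(newParentTypes)) {
            // Nothing new: keep the existing view instead of wrapping it again.
            return;
        }
        Set<String> copy = new HashSet<>(parentTypes);
        copy.addAll(newParentTypes);
        // Wrap a fresh mutable copy, so there is always exactly one wrapper layer.
        parentTypes = Collections.unmodifiableSet(copy);
    }

    public Set<String> parentTypes() {
        return parentTypes;
    }
}
```

With this shape, an update that adds no parent types (such as creating an alias) leaves the published set untouched, so lookups cost the same no matter how many updates have been processed.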
Steps to reproduce:
The following bash script reproduces this bug:
#!/bin/bash
set -e
# create an index we can alias
curl -s -X PUT localhost:9200/test_index >> /dev/null
# high enough N to generate a StackOverflowError
N=50000
for i in $(seq 1 $N); do
  echo $i
  # create an alias. we don't really care about the alias but creating
  # an alias triggers a cluster state update which rewraps MapperService#parentTypes
  curl -s -X PUT localhost:9200/test_index/_alias/test_alias_$i >> /dev/null
  # putting a document will eventually cause a StackOverflowError since it calls
  # contains on the wrapped parentTypes collection via DocumentMapper#isParent
  curl -s -H "Content-Type: application/json" -X PUT -d '{}' localhost:9200/test_index/test_type/1 >> /dev/null
done
Provide logs (if relevant):
[2017-03-13T15:44:21,290][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [search-01] fatal error in thread [elasticsearch[search-01][bulk][T#4]], exiting
java.lang.StackOverflowError: null
at java.util.Collections$UnmodifiableCollection.contains(Collections.java:1032) ~[?:1.8.0_111]
at java.util.Collections$UnmodifiableCollection.contains(Collections.java:1032) ~[?:1.8.0_111]
... thousands of identical calls here...
at java.util.Collections$UnmodifiableCollection.contains(Collections.java:1032) ~[?:1.8.0_111]
at java.util.Collections$UnmodifiableCollection.contains(Collections.java:1032) ~[?:1.8.0_111]
at org.elasticsearch.index.mapper.DocumentMapper.isParent(DocumentMapper.java:329) ~[elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.index.mapper.ParentFieldMapper.parseCreateField(ParentFieldMapper.java:233) ~[elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.index.mapper.FieldMapper.parse(FieldMapper.java:287) ~[elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.index.mapper.ParentFieldMapper.postParse(ParentFieldMapper.java:228) ~[elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.index.mapper.DocumentParser.internalParseDocument(DocumentParser.java:97) ~[elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.index.mapper.DocumentParser.parseDocument(DocumentParser.java:66) ~[elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:275) ~[elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.index.shard.IndexShard.prepareIndex(IndexShard.java:533) ~[elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.index.shard.IndexShard.prepareIndexOnPrimary(IndexShard.java:510) ~[elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.action.index.TransportIndexAction.prepareIndexOperationOnPrimary(TransportIndexAction.java:196) ~[elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.action.index.TransportIndexAction.executeIndexRequestOnPrimary(TransportIndexAction.java:201) ~[elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:348) ~[elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.action.bulk.TransportShardBulkAction.index(TransportShardBulkAction.java:155) ~[elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.action.bulk.TransportShardBulkAction.handleItem(TransportShardBulkAction.java:134) ~[elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.action.bulk.TransportShardBulkAction.onPrimaryShard(TransportShardBulkAction.java:120) ~[elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.action.bulk.TransportShardBulkAction.onPrimaryShard(TransportShardBulkAction.java:73) ~[elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.action.support.replication.TransportWriteAction.shardOperationOnPrimary(TransportWriteAction.java:76) ~[elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.action.support.replication.TransportWriteAction.shardOperationOnPrimary(TransportWriteAction.java:49) ~[elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryShardReference.perform(TransportReplicationAction.java:914) ~[elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryShardReference.perform(TransportReplicationAction.java:884) ~[elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.action.support.replication.ReplicationOperation.execute(ReplicationOperation.java:113) ~[elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.onResponse(TransportReplicationAction.java:327) ~[elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.onResponse(TransportReplicationAction.java:262) ~[elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction$1.onResponse(TransportReplicationAction.java:864) ~[elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction$1.onResponse(TransportReplicationAction.java:861) ~[elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.index.shard.IndexShardOperationsLock.acquire(IndexShardOperationsLock.java:147) ~[elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.index.shard.IndexShard.acquirePrimaryOperationLock(IndexShard.java:1652) ~[elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction.acquirePrimaryShardReference(TransportReplicationAction.java:873) ~[elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction.access$400(TransportReplicationAction.java:92) ~[elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.doRun(TransportReplicationAction.java:279) ~[elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:258) ~[elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:250) ~[elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:69) ~[elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.transport.TransportService$7.doRun(TransportService.java:610) ~[elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:596) ~[elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-5.2.2.jar:5.2.2]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_111]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_111]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_111]