
Conversation

@ywelsch (Contributor) commented Nov 18, 2016

PR #19416 added a safety mechanism to shard state fetching that only accesses the store when the shard lock can be acquired. However, this can lead to the following situation: a shard has not yet fully shut down while shard fetching is going on, so the fetch fails with a ShardLockObtainFailedException. The PrimaryShardAllocator, which decides where to allocate primary shards, sees this exception and treats the shard copy as unusable. If this is the only shard copy in the cluster, the cluster stays red, and a new shard fetching cycle is not triggered because shard state fetching treats exceptions while opening the store as permanent failures.

This PR makes PrimaryShardAllocator treat the locked shard as a possible allocation target, although with the lowest priority.
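The idea can be sketched as follows. This is a hypothetical, simplified illustration; the class, fields, and method names below are invented for this sketch and are not the actual PrimaryShardAllocator code. The point it shows: copies whose store access failed with a ShardLockObtainFailedException stay in the candidate list, but sort after copies whose store opened cleanly.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: how a primary-shard allocator might rank shard copies,
// keeping copies whose store is currently locked as last-resort candidates
// instead of discarding them as unusable.
public class ShardCopyRanking {

    // Simplified stand-in for a per-node shard state fetch result.
    static final class ShardCopy {
        final String nodeId;
        final boolean hasData;          // copy has valid index data on disk
        final boolean lockObtainFailed; // store access failed with ShardLockObtainFailedException

        ShardCopy(String nodeId, boolean hasData, boolean lockObtainFailed) {
            this.nodeId = nodeId;
            this.hasData = hasData;
            this.lockObtainFailed = lockObtainFailed;
        }
    }

    // Copies with data and no lock failure come first; locked copies are kept
    // as possible allocation targets, but with the least priority.
    static List<ShardCopy> rank(List<ShardCopy> copies) {
        List<ShardCopy> candidates = new ArrayList<>();
        for (ShardCopy copy : copies) {
            if (copy.hasData || copy.lockObtainFailed) {
                candidates.add(copy); // a locked copy is no longer treated as unusable
            }
        }
        // false sorts before true, so lock-failed copies end up last
        candidates.sort(Comparator.comparing((ShardCopy c) -> c.lockObtainFailed));
        return candidates;
    }

    public static void main(String[] args) {
        List<ShardCopy> ranked = rank(List.of(
                new ShardCopy("node_t0", false, true),  // only real copy, currently locked
                new ShardCopy("node_t1", false, false)  // no data at all, filtered out
        ));
        System.out.println(ranked.get(0).nodeId); // prints node_t0
    }
}
```

In the real allocator the ranking also involves allocation IDs and other criteria; the sketch only illustrates that a lock failure demotes a copy rather than disqualifying it.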

@dakrone (Member) left a comment


LGTM

@abeyad commented Nov 18, 2016

I wanted to make sure I understood correctly: if the shard with the lock exception is the only valid copy, the allocator decides to allocate the primary to this (currently) locked shard. When the node receives the cluster state update saying that it must allocate the primary on itself, it will try to obtain the shard lock for 5 seconds. If it fails to obtain the lock within 5 seconds, the failure is sent to the master, which will try to allocate to the same node again. It will do this for up to 5 tries (by default) due to the MaxRetryAllocationDecider. So the node must release the shard lock within those 5 tries, each attempting for 5 seconds. Is this understanding correct?

Contributor

question - why not let Store.tryOpenIndex throw ShardLockObtainFailedException and catch it here directly, logging it and not storing it as a "store" exception? The first part would make things simpler imo. I'm fine with not going with the second part and doing it as you propose, but I wanted to better understand your reasoning.

Contributor Author

I can directly throw ShardLockObtainFailedException. I wasn't sure why you wrapped that exception in the first place, so I left it as is.

To your second point: I left the decision on how to treat the ShardLockObtainFailedException to PrimaryShardAllocator, as it has more context available to make the final decision on where to allocate the shard. For example, it prioritizes another valid shard copy that has not thrown the exception. It also allows the shard store action /_shard_stores to properly expose the exception, as it reuses the same endpoint as the primary shard allocator.

Contributor

> I can directly throw ShardLockObtainFailedException. I wasn't sure why you wrapped that exception in the first place, so I left it as is.

I think I just didn't want to extend the scope of the change, and I didn't have a reason not to just throw an IOException. I think the unwrapping here merits this change.

> For example, it prioritizes another valid shard copy that has not thrown the exception. It also allows the shard store action /_shard_stores to properly expose the exception, as it reuses the same endpoint as the primary shard allocator.

Fair enough. Thanks.

@ywelsch (Contributor Author) commented Nov 18, 2016

@abeyad correct (I tried it to confirm). We will have 5 iterations where 5 seconds are spent obtaining the shard lock during shard fetching, followed by 5 seconds obtaining the shard lock while trying to allocate the shard on the node, so 5 * 5 seconds for shard fetching + 4 * 5 seconds for shard allocation attempts = 45 seconds :-)
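The arithmetic above, written out as a tiny sketch (worstCaseSeconds is an invented helper name; the 5 retries and the 5-second lock timeout are the defaults discussed in this thread):

```java
// Sketch of the worst-case wait described above: each of the 5 allocation
// retries is preceded by a shard-fetching round that waits up to 5 seconds
// for the shard lock, and all but the last retry additionally spends up to
// 5 seconds trying to obtain the lock during allocation on the node.
public class RetryTiming {

    static int worstCaseSeconds(int maxRetries, int lockTimeoutSec) {
        int fetchingSec = maxRetries * lockTimeoutSec;         // 5 * 5 = 25
        int allocationSec = (maxRetries - 1) * lockTimeoutSec; // 4 * 5 = 20
        return fetchingSec + allocationSec;
    }

    public static void main(String[] args) {
        System.out.println(worstCaseSeconds(5, 5)); // prints 45
    }
}
```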

Test code

        prepareCreate("test").setSettings(Settings.builder()
            .put(IndexMetaData.SETTING_NUMBER_OF_SHARDS, 1)
            .put(IndexMetaData.SETTING_NUMBER_OF_REPLICAS, 0)).get(); // single primary, no replicas
        ensureGreen("test");

        ClusterState state = client().admin().cluster().prepareState().get().getState();
        ShardRouting shardRouting = state.routingTable().shardRoutingTable("test", 0).primaryShard();
        String nodeWithPrimary = shardRouting.currentNodeId();
        String node = state.nodes().get(nodeWithPrimary).getName();
        ShardId shardId = shardRouting.shardId();

        NodeEnvironment environment = internalCluster().getInstance(Node.class, node).getNodeEnvironment();
        IndicesService indicesService = internalCluster().getInstance(IndicesService.class, node);
        // fail the only shard copy so the master must allocate the primary again
        indicesService.getShardOrNull(shardId).failShard("because I can", new RuntimeException("because I can"));

        // hold the shard lock ourselves so shard state fetching runs into ShardLockObtainFailedException
        ShardLock shardLock = environment.shardLock(shardId, TimeValue.timeValueSeconds(5).millis());

        // the failed shard turns the cluster red
        assertBusy(() -> {
            assertTrue(client().admin().cluster().prepareHealth("test").get().getStatus() == ClusterHealthStatus.RED);
        }, 1, TimeUnit.MINUTES);

        ensureGreen(TimeValue.timeValueMinutes(3), "test");
        shardLock.close();

Output:

/Library/Java/JavaVirtualMachines/jdk1.8.0_60.jdk/Contents/Home/bin/java -ea [IntelliJ IDEA JUnit runner classpath omitted] com.intellij.rt.execution.junit.JUnitStarter -ideVersion5 org.elasticsearch.action.admin.indices.create.CreateIndexIT,testWeirdScenario
[2016-11-18T20:30:44,217][WARN ][o.e.b.JNANatives         ] Unable to lock JVM Memory: error=78, reason=Function not implemented
[2016-11-18T20:30:44,223][WARN ][o.e.b.JNANatives         ] This can result in part of the JVM being swapped out.
[2016-11-18T21:30:45,827][INFO ][o.e.a.a.i.c.CreateIndexIT] [CreateIndexIT#testWeirdScenario]: setup test
[2016-11-18T21:30:45,847][INFO ][o.e.t.InternalTestCluster] Setup InternalTestCluster [TEST-CHILD_VM=[0]-CLUSTER_SEED=[167703776559476503]-HASH=[2E357FE8EAD77]-cluster] with seed [253CDB23D51AF17] using [0] dedicated masters, [2] (data) nodes and [1] coord only nodes
[2016-11-18T21:30:46,390][INFO ][o.e.n.Node               ] [node_t0] initializing ...
[2016-11-18T21:30:46,488][INFO ][o.e.e.NodeEnvironment    ] [node_t0] using [1] data paths, mounts [[/ (/dev/disk1)]], net usable_space [244.6gb], net total_space [464.7gb], spins? [unknown], types [hfs]
[2016-11-18T21:30:46,489][INFO ][o.e.e.NodeEnvironment    ] [node_t0] heap size [3.5gb], compressed ordinary object pointers [true]
[2016-11-18T21:30:46,491][INFO ][o.e.n.Node               ] [node_t0] version[6.0.0-alpha1-SNAPSHOT], pid[40959], build[Unknown/Unknown], OS[Mac OS X/10.12.1/x86_64], JVM[Oracle Corporation/Java HotSpot(TM) 64-Bit Server VM/1.8.0_60/25.60-b23]
[2016-11-18T21:30:46,492][WARN ][o.e.n.Node               ] [node_t0] version [6.0.0-alpha1-SNAPSHOT] is a pre-release version of Elasticsearch and is not suitable for production
[2016-11-18T21:30:46,503][INFO ][o.e.p.PluginsService     ] [node_t0] no modules loaded
[2016-11-18T21:30:46,504][INFO ][o.e.p.PluginsService     ] [node_t0] loaded plugin [org.elasticsearch.test.ESIntegTestCase$TestSeedPlugin]
[2016-11-18T21:30:46,504][INFO ][o.e.p.PluginsService     ] [node_t0] loaded plugin [org.elasticsearch.test.discovery.TestZenDiscovery$TestPlugin]
[2016-11-18T21:30:46,504][INFO ][o.e.p.PluginsService     ] [node_t0] loaded plugin [org.elasticsearch.transport.MockTcpTransportPlugin]
[2016-11-18T21:30:49,670][INFO ][o.e.n.Node               ] [node_t0] initialized
[2016-11-18T21:30:49,677][INFO ][o.e.n.Node               ] [node_t1] initializing ...
[2016-11-18T21:30:49,681][INFO ][o.e.e.NodeEnvironment    ] [node_t1] using [1] data paths, mounts [[/ (/dev/disk1)]], net usable_space [244.6gb], net total_space [464.7gb], spins? [unknown], types [hfs]
[2016-11-18T21:30:49,681][INFO ][o.e.e.NodeEnvironment    ] [node_t1] heap size [3.5gb], compressed ordinary object pointers [true]
[2016-11-18T21:30:49,681][INFO ][o.e.n.Node               ] [node_t1] version[6.0.0-alpha1-SNAPSHOT], pid[40959], build[Unknown/Unknown], OS[Mac OS X/10.12.1/x86_64], JVM[Oracle Corporation/Java HotSpot(TM) 64-Bit Server VM/1.8.0_60/25.60-b23]
[2016-11-18T21:30:49,681][WARN ][o.e.n.Node               ] [node_t1] version [6.0.0-alpha1-SNAPSHOT] is a pre-release version of Elasticsearch and is not suitable for production
[2016-11-18T21:30:49,682][INFO ][o.e.p.PluginsService     ] [node_t1] no modules loaded
[2016-11-18T21:30:49,682][INFO ][o.e.p.PluginsService     ] [node_t1] loaded plugin [org.elasticsearch.test.ESIntegTestCase$TestSeedPlugin]
[2016-11-18T21:30:49,682][INFO ][o.e.p.PluginsService     ] [node_t1] loaded plugin [org.elasticsearch.test.discovery.TestZenDiscovery$TestPlugin]
[2016-11-18T21:30:49,682][INFO ][o.e.p.PluginsService     ] [node_t1] loaded plugin [org.elasticsearch.transport.MockTcpTransportPlugin]
[2016-11-18T21:30:49,765][INFO ][o.e.n.Node               ] [node_t1] initialized
[2016-11-18T21:30:49,769][INFO ][o.e.n.Node               ] [node_tc2] initializing ...
[2016-11-18T21:30:49,777][INFO ][o.e.e.NodeEnvironment    ] [node_tc2] using [1] data paths, mounts [[/ (/dev/disk1)]], net usable_space [244.6gb], net total_space [464.7gb], spins? [unknown], types [hfs]
[2016-11-18T21:30:49,777][INFO ][o.e.e.NodeEnvironment    ] [node_tc2] heap size [3.5gb], compressed ordinary object pointers [true]
[2016-11-18T21:30:49,779][INFO ][o.e.n.Node               ] [node_tc2] version[6.0.0-alpha1-SNAPSHOT], pid[40959], build[Unknown/Unknown], OS[Mac OS X/10.12.1/x86_64], JVM[Oracle Corporation/Java HotSpot(TM) 64-Bit Server VM/1.8.0_60/25.60-b23]
[2016-11-18T21:30:49,779][WARN ][o.e.n.Node               ] [node_tc2] version [6.0.0-alpha1-SNAPSHOT] is a pre-release version of Elasticsearch and is not suitable for production
[2016-11-18T21:30:49,780][INFO ][o.e.p.PluginsService     ] [node_tc2] no modules loaded
[2016-11-18T21:30:49,780][INFO ][o.e.p.PluginsService     ] [node_tc2] loaded plugin [org.elasticsearch.test.ESIntegTestCase$TestSeedPlugin]
[2016-11-18T21:30:49,781][INFO ][o.e.p.PluginsService     ] [node_tc2] loaded plugin [org.elasticsearch.test.discovery.TestZenDiscovery$TestPlugin]
[2016-11-18T21:30:49,781][INFO ][o.e.p.PluginsService     ] [node_tc2] loaded plugin [org.elasticsearch.transport.MockTcpTransportPlugin]
[2016-11-18T21:30:49,880][INFO ][o.e.n.Node               ] [node_tc2] initialized
[2016-11-18T21:30:49,898][INFO ][o.e.n.Node               ] [node_t0] starting ...
[2016-11-18T21:30:49,966][INFO ][o.e.t.TransportService   ] [node_t0] publish_address {127.0.0.1:9400}, bound_addresses {[fe80::1]:9400}, {[::1]:9400}, {127.0.0.1:9400}
[2016-11-18T21:30:50,055][INFO ][o.e.n.Node               ] [node_t0] started
[2016-11-18T21:30:50,055][INFO ][o.e.n.Node               ] [node_t1] starting ...
[2016-11-18T21:30:50,066][INFO ][o.e.t.d.MockZenPing      ] [node_t0] pinging using mock zen ping
[2016-11-18T21:30:50,074][INFO ][o.e.t.d.MockZenPing      ] [node_t0] pinging using mock zen ping
[2016-11-18T21:30:50,088][INFO ][o.e.t.TransportService   ] [node_t1] publish_address {127.0.0.1:9401}, bound_addresses {[fe80::1]:9401}, {[::1]:9401}, {127.0.0.1:9401}
[2016-11-18T21:30:50,104][INFO ][o.e.n.Node               ] [node_t1] started
[2016-11-18T21:30:50,112][INFO ][o.e.t.d.MockZenPing      ] [node_t1] pinging using mock zen ping
[2016-11-18T21:30:50,114][INFO ][o.e.n.Node               ] [node_tc2] starting ...
[2016-11-18T21:30:50,186][INFO ][o.e.t.TransportService   ] [node_tc2] publish_address {127.0.0.1:9402}, bound_addresses {[fe80::1]:9402}, {[::1]:9402}, {127.0.0.1:9402}
[2016-11-18T21:30:50,191][INFO ][o.e.t.d.MockZenPing      ] [node_tc2] pinging using mock zen ping
[2016-11-18T21:30:50,243][INFO ][o.e.c.s.ClusterService   ] [node_t0] new_master {node_t0}{Smft58veRkeHiF6nXwjwTg}{RLXdAPCgQm2IV1bb4I7Ndw}{127.0.0.1}{127.0.0.1:9400}, added {{node_t1}{k-ggqJ2JRi-ql8PwqzzPBg}{xIGyXW7dSEWuXsHS1f8pbA}{127.0.0.1}{127.0.0.1:9401},}, reason: zen-disco-elected-as-master ([1] nodes joined)[{node_t1}{k-ggqJ2JRi-ql8PwqzzPBg}{xIGyXW7dSEWuXsHS1f8pbA}{127.0.0.1}{127.0.0.1:9401}]
[2016-11-18T21:30:50,268][INFO ][o.e.c.s.ClusterService   ] [node_t1] detected_master {node_t0}{Smft58veRkeHiF6nXwjwTg}{RLXdAPCgQm2IV1bb4I7Ndw}{127.0.0.1}{127.0.0.1:9400}, added {{node_t0}{Smft58veRkeHiF6nXwjwTg}{RLXdAPCgQm2IV1bb4I7Ndw}{127.0.0.1}{127.0.0.1:9400},}, reason: zen-disco-receive(from master [master {node_t0}{Smft58veRkeHiF6nXwjwTg}{RLXdAPCgQm2IV1bb4I7Ndw}{127.0.0.1}{127.0.0.1:9400} committed version [1]])
[2016-11-18T21:30:50,282][INFO ][o.e.c.s.ClusterService   ] [node_t0] added {{node_tc2}{gN9UaHrxRpS0T2AJ8hJuGQ}{wVYUDMB0TtCoR3gWIdWXcA}{127.0.0.1}{127.0.0.1:9402},}, reason: zen-disco-node-join[{node_tc2}{gN9UaHrxRpS0T2AJ8hJuGQ}{wVYUDMB0TtCoR3gWIdWXcA}{127.0.0.1}{127.0.0.1:9402}]
[2016-11-18T21:30:50,328][INFO ][o.e.c.s.ClusterService   ] [node_t1] added {{node_tc2}{gN9UaHrxRpS0T2AJ8hJuGQ}{wVYUDMB0TtCoR3gWIdWXcA}{127.0.0.1}{127.0.0.1:9402},}, reason: zen-disco-receive(from master [master {node_t0}{Smft58veRkeHiF6nXwjwTg}{RLXdAPCgQm2IV1bb4I7Ndw}{127.0.0.1}{127.0.0.1:9400} committed version [2]])
[2016-11-18T21:30:50,328][INFO ][o.e.c.s.ClusterService   ] [node_tc2] detected_master {node_t0}{Smft58veRkeHiF6nXwjwTg}{RLXdAPCgQm2IV1bb4I7Ndw}{127.0.0.1}{127.0.0.1:9400}, added {{node_t1}{k-ggqJ2JRi-ql8PwqzzPBg}{xIGyXW7dSEWuXsHS1f8pbA}{127.0.0.1}{127.0.0.1:9401},{node_t0}{Smft58veRkeHiF6nXwjwTg}{RLXdAPCgQm2IV1bb4I7Ndw}{127.0.0.1}{127.0.0.1:9400},}, reason: zen-disco-receive(from master [master {node_t0}{Smft58veRkeHiF6nXwjwTg}{RLXdAPCgQm2IV1bb4I7Ndw}{127.0.0.1}{127.0.0.1:9400} committed version [2]])
[2016-11-18T21:30:50,342][INFO ][o.e.n.Node               ] [node_tc2] started
[2016-11-18T21:30:50,362][INFO ][o.e.g.GatewayService     ] [node_t0] recovered [0] indices into cluster_state
[2016-11-18T21:30:50,361][INFO ][o.e.p.PluginsService     ] [transport_client_node_t1] no modules loaded
[2016-11-18T21:30:50,362][INFO ][o.e.p.PluginsService     ] [transport_client_node_t1] loaded plugin [org.elasticsearch.transport.MockTcpTransportPlugin]
[2016-11-18T21:30:50,502][INFO ][o.e.a.a.i.c.CreateIndexIT] test using _default_ mappings: [{"_default_":{}}]
[2016-11-18T21:30:50,823][INFO ][o.e.a.a.i.c.CreateIndexIT] [CreateIndexIT#testWeirdScenario]: starting test
[2016-11-18T21:30:50,824][INFO ][o.e.p.PluginsService     ] [transport_client_node_t0] no modules loaded
[2016-11-18T21:30:50,824][INFO ][o.e.p.PluginsService     ] [transport_client_node_t0] loaded plugin [org.elasticsearch.transport.MockTcpTransportPlugin]
[2016-11-18T21:30:50,923][INFO ][o.e.c.m.MetaDataCreateIndexService] [node_t0] [test] creating index, cause [api], templates [random_index_template], shards [1]/[0], mappings [_default_]
[2016-11-18T21:30:51,215][INFO ][o.e.c.r.a.AllocationService] [node_t0] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[test][0]] ...]).
[2016-11-18T21:30:51,239][INFO ][o.e.p.PluginsService     ] [transport_client_node_tc2] no modules loaded
[2016-11-18T21:30:51,239][INFO ][o.e.p.PluginsService     ] [transport_client_node_tc2] loaded plugin [org.elasticsearch.transport.MockTcpTransportPlugin]
[2016-11-18T21:30:51,279][WARN ][o.e.i.e.Engine           ] [node_t0] [test][0] failed engine [because I can]
java.lang.RuntimeException: because I can
    at org.elasticsearch.action.admin.indices.create.CreateIndexIT.testWeirdScenario(CreateIndexIT.java:90) ~[test/:?]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
    at java.lang.reflect.Method.invoke(Method.java:497) ~[?:1.8.0_60]
    at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713) ~[randomizedtesting-runner-2.4.0.jar:?]
    at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907) ~[randomizedtesting-runner-2.4.0.jar:?]
    at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943) ~[randomizedtesting-runner-2.4.0.jar:?]
    at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957) ~[randomizedtesting-runner-2.4.0.jar:?]
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) ~[randomizedtesting-runner-2.4.0.jar:?]
    at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49) ~[lucene-test-framework-6.3.0.jar:6.3.0 a66a44513ee8191e25b477372094bfa846450316 - shalin - 2016-11-02 19:47:12]
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) ~[lucene-test-framework-6.3.0.jar:6.3.0 a66a44513ee8191e25b477372094bfa846450316 - shalin - 2016-11-02 19:47:12]
    at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) ~[lucene-test-framework-6.3.0.jar:6.3.0 a66a44513ee8191e25b477372094bfa846450316 - shalin - 2016-11-02 19:47:12]
    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) ~[lucene-test-framework-6.3.0.jar:6.3.0 a66a44513ee8191e25b477372094bfa846450316 - shalin - 2016-11-02 19:47:12]
    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) ~[lucene-test-framework-6.3.0.jar:6.3.0 a66a44513ee8191e25b477372094bfa846450316 - shalin - 2016-11-02 19:47:12]
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) ~[randomizedtesting-runner-2.4.0.jar:?]
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367) ~[randomizedtesting-runner-2.4.0.jar:?]
    at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:811) ~[randomizedtesting-runner-2.4.0.jar:?]
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:462) ~[randomizedtesting-runner-2.4.0.jar:?]
    at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916) ~[randomizedtesting-runner-2.4.0.jar:?]
    at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802) ~[randomizedtesting-runner-2.4.0.jar:?]
    at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852) ~[randomizedtesting-runner-2.4.0.jar:?]
    at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) ~[randomizedtesting-runner-2.4.0.jar:?]
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) ~[lucene-test-framework-6.3.0.jar:6.3.0 a66a44513ee8191e25b477372094bfa846450316 - shalin - 2016-11-02 19:47:12]
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) ~[randomizedtesting-runner-2.4.0.jar:?]
    at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41) ~[lucene-test-framework-6.3.0.jar:6.3.0 a66a44513ee8191e25b477372094bfa846450316 - shalin - 2016-11-02 19:47:12]
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) ~[randomizedtesting-runner-2.4.0.jar:?]
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) ~[randomizedtesting-runner-2.4.0.jar:?]
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) ~[randomizedtesting-runner-2.4.0.jar:?]
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) ~[randomizedtesting-runner-2.4.0.jar:?]
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) ~[randomizedtesting-runner-2.4.0.jar:?]
    at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53) ~[lucene-test-framework-6.3.0.jar:6.3.0 a66a44513ee8191e25b477372094bfa846450316 - shalin - 2016-11-02 19:47:12]
    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) ~[lucene-test-framework-6.3.0.jar:6.3.0 a66a44513ee8191e25b477372094bfa846450316 - shalin - 2016-11-02 19:47:12]
    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) ~[lucene-test-framework-6.3.0.jar:6.3.0 a66a44513ee8191e25b477372094bfa846450316 - shalin - 2016-11-02 19:47:12]
    at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54) ~[lucene-test-framework-6.3.0.jar:6.3.0 a66a44513ee8191e25b477372094bfa846450316 - shalin - 2016-11-02 19:47:12]
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) ~[randomizedtesting-runner-2.4.0.jar:?]
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367) ~[randomizedtesting-runner-2.4.0.jar:?]
    at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_60]
[2016-11-18T21:30:51,295][WARN ][o.e.i.c.IndicesClusterStateService] [node_t0] [[test][0]] marking and sending shard failed due to [shard failure, reason [because I can]]
java.lang.RuntimeException: because I can
    (stack trace identical to the one above)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802) ~[randomizedtesting-runner-2.4.0.jar:?]
    at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852) ~[randomizedtesting-runner-2.4.0.jar:?]
    at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) ~[randomizedtesting-runner-2.4.0.jar:?]
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) ~[lucene-test-framework-6.3.0.jar:6.3.0 a66a44513ee8191e25b477372094bfa846450316 - shalin - 2016-11-02 19:47:12]
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) ~[randomizedtesting-runner-2.4.0.jar:?]
    at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41) ~[lucene-test-framework-6.3.0.jar:6.3.0 a66a44513ee8191e25b477372094bfa846450316 - shalin - 2016-11-02 19:47:12]
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) ~[randomizedtesting-runner-2.4.0.jar:?]
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) ~[randomizedtesting-runner-2.4.0.jar:?]
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) ~[randomizedtesting-runner-2.4.0.jar:?]
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) ~[randomizedtesting-runner-2.4.0.jar:?]
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) ~[randomizedtesting-runner-2.4.0.jar:?]
    at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53) ~[lucene-test-framework-6.3.0.jar:6.3.0 a66a44513ee8191e25b477372094bfa846450316 - shalin - 2016-11-02 19:47:12]
    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) ~[lucene-test-framework-6.3.0.jar:6.3.0 a66a44513ee8191e25b477372094bfa846450316 - shalin - 2016-11-02 19:47:12]
    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) ~[lucene-test-framework-6.3.0.jar:6.3.0 a66a44513ee8191e25b477372094bfa846450316 - shalin - 2016-11-02 19:47:12]
    at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54) ~[lucene-test-framework-6.3.0.jar:6.3.0 a66a44513ee8191e25b477372094bfa846450316 - shalin - 2016-11-02 19:47:12]
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) ~[randomizedtesting-runner-2.4.0.jar:?]
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367) ~[randomizedtesting-runner-2.4.0.jar:?]
    at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_60]
[2016-11-18T21:30:51,303][WARN ][o.e.c.a.s.ShardStateAction] [node_t0] [test][0] received shard failed for shard id [[test][0]], allocation id [gyqfvt5TQ2iaA5T66PDmpw], primary term [0], message [shard failure, reason [because I can]], failure [RuntimeException[because I can]]
[2016-11-18T21:30:51,342][INFO ][o.e.c.r.a.AllocationService] [node_t0] Cluster health status changed from [GREEN] to [RED] (reason: [shards failed [[test][0]] ...]).
[2016-11-18T21:30:56,359][ERROR][o.e.g.TransportNodesListGatewayStartedShards] [node_t0] [test][0] unable to acquire shard lock
org.elasticsearch.env.ShardLockObtainFailedException: [test][0]: obtaining shard lock timed out after 5000ms
    at org.elasticsearch.env.NodeEnvironment$InternalShardLock.acquire(NodeEnvironment.java:685) ~[main/:?]
    at org.elasticsearch.env.NodeEnvironment.shardLock(NodeEnvironment.java:604) ~[main/:?]
    at org.elasticsearch.index.store.Store.tryOpenIndex(Store.java:418) ~[main/:?]
    at org.elasticsearch.gateway.TransportNodesListGatewayStartedShards.nodeOperation(TransportNodesListGatewayStartedShards.java:144) ~[main/:?]
    at org.elasticsearch.gateway.TransportNodesListGatewayStartedShards.nodeOperation(TransportNodesListGatewayStartedShards.java:61) ~[main/:?]
    at org.elasticsearch.action.support.nodes.TransportNodesAction.nodeOperation(TransportNodesAction.java:145) ~[main/:?]
    at org.elasticsearch.action.support.nodes.TransportNodesAction$NodeTransportHandler.messageReceived(TransportNodesAction.java:270) ~[main/:?]
    at org.elasticsearch.action.support.nodes.TransportNodesAction$NodeTransportHandler.messageReceived(TransportNodesAction.java:266) ~[main/:?]
    at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:66) ~[main/:?]
    at org.elasticsearch.transport.TransportService$6.doRun(TransportService.java:569) ~[main/:?]
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:527) ~[main/:?]
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[main/:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_60]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_60]
    at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_60]
[2016-11-18T21:31:01,381][WARN ][o.e.i.c.IndicesClusterStateService] [node_t0] [[test][0]] marking and sending shard failed due to [failed to create shard]
java.io.IOException: failed to obtain in-memory shard lock
    at org.elasticsearch.index.IndexService.createShard(IndexService.java:369) ~[main/:?]
    at org.elasticsearch.indices.IndicesService.createShard(IndicesService.java:513) ~[main/:?]
    at org.elasticsearch.indices.IndicesService.createShard(IndicesService.java:147) ~[main/:?]
    at org.elasticsearch.indices.cluster.IndicesClusterStateService.createShard(IndicesClusterStateService.java:539) ~[main/:?]
    at org.elasticsearch.indices.cluster.IndicesClusterStateService.createOrUpdateShards(IndicesClusterStateService.java:516) ~[main/:?]
    at org.elasticsearch.indices.cluster.IndicesClusterStateService.clusterChanged(IndicesClusterStateService.java:205) ~[main/:?]
    at org.elasticsearch.cluster.service.ClusterService.runTasksForExecutor(ClusterService.java:780) ~[main/:?]
    at org.elasticsearch.cluster.service.ClusterService$UpdateTask.run(ClusterService.java:965) ~[main/:?]
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:458) ~[main/:?]
    at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:238) ~[main/:?]
    at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:201) ~[main/:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_60]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_60]
    at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_60]
Caused by: org.elasticsearch.env.ShardLockObtainFailedException: [test][0]: obtaining shard lock timed out after 5000ms
    at org.elasticsearch.env.NodeEnvironment$InternalShardLock.acquire(NodeEnvironment.java:685) ~[main/:?]
    at org.elasticsearch.env.NodeEnvironment.shardLock(NodeEnvironment.java:604) ~[main/:?]
    at org.elasticsearch.index.IndexService.createShard(IndexService.java:298) ~[main/:?]
    ... 13 more
[2016-11-18T21:31:01,384][WARN ][o.e.c.a.s.ShardStateAction] [node_t0] [test][0] received shard failed for shard id [[test][0]], allocation id [gyqfvt5TQ2iaA5T66PDmpw], primary term [0], message [failed to create shard], failure [IOException[failed to obtain in-memory shard lock]; nested: ShardLockObtainFailedException[[test][0]: obtaining shard lock timed out after 5000ms]; ]
[2016-11-18T21:31:06,397][ERROR][o.e.g.TransportNodesListGatewayStartedShards] [node_t0] [test][0] unable to acquire shard lock
[2016-11-18T21:31:11,415][WARN ][o.e.i.c.IndicesClusterStateService] [node_t0] [[test][0]] marking and sending shard failed due to [failed to create shard]
[2016-11-18T21:31:11,416][WARN ][o.e.c.a.s.ShardStateAction] [node_t0] [test][0] received shard failed for shard id [[test][0]], allocation id [gyqfvt5TQ2iaA5T66PDmpw], primary term [0], message [failed to create shard], failure [IOException[failed to obtain in-memory shard lock]; nested: ShardLockObtainFailedException[[test][0]: obtaining shard lock timed out after 5000ms]; ]
[2016-11-18T21:31:16,442][ERROR][o.e.g.TransportNodesListGatewayStartedShards] [node_t0] [test][0] unable to acquire shard lock
[2016-11-18T21:31:21,461][WARN ][o.e.i.c.IndicesClusterStateService] [node_t0] [[test][0]] marking and sending shard failed due to [failed to create shard]
[2016-11-18T21:31:21,462][WARN ][o.e.c.a.s.ShardStateAction] [node_t0] [test][0] received shard failed for shard id [[test][0]], allocation id [gyqfvt5TQ2iaA5T66PDmpw], primary term [0], message [failed to create shard], failure [IOException[failed to obtain in-memory shard lock]; nested: ShardLockObtainFailedException[[test][0]: obtaining shard lock timed out after 5000ms]; ]
[2016-11-18T21:31:26,474][ERROR][o.e.g.TransportNodesListGatewayStartedShards] [node_t0] [test][0] unable to acquire shard lock
[2016-11-18T21:31:31,495][WARN ][o.e.i.c.IndicesClusterStateService] [node_t0] [[test][0]] marking and sending shard failed due to [failed to create shard]
java.io.IOException: failed to obtain in-memory shard lock
    at org.elasticsearch.index.IndexService.createShard(IndexService.java:369) ~[main/:?]
    at org.elasticsearch.indices.IndicesService.createShard(IndicesService.java:513) ~[main/:?]
    at org.elasticsearch.indices.IndicesService.createShard(IndicesService.java:147) ~[main/:?]
    at org.elasticsearch.indices.cluster.IndicesClusterStateService.createShard(IndicesClusterStateService.java:539) ~[main/:?]
    at org.elasticsearch.indices.cluster.IndicesClusterStateService.createOrUpdateShards(IndicesClusterStateService.java:516) ~[main/:?]
    at org.elasticsearch.indices.cluster.IndicesClusterStateService.clusterChanged(IndicesClusterStateService.java:205) ~[main/:?]
    at org.elasticsearch.cluster.service.ClusterService.runTasksForExecutor(ClusterService.java:780) ~[main/:?]
    at org.elasticsearch.cluster.service.ClusterService$UpdateTask.run(ClusterService.java:965) ~[main/:?]
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:458) ~[main/:?]
    at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:238) ~[main/:?]
    at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:201) ~[main/:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_60]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_60]
    at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_60]
Caused by: org.elasticsearch.env.ShardLockObtainFailedException: [test][0]: obtaining shard lock timed out after 5000ms
    at org.elasticsearch.env.NodeEnvironment$InternalShardLock.acquire(NodeEnvironment.java:685) ~[main/:?]
    at org.elasticsearch.env.NodeEnvironment.shardLock(NodeEnvironment.java:604) ~[main/:?]
    at org.elasticsearch.index.IndexService.createShard(IndexService.java:298) ~[main/:?]
    ... 13 more
[2016-11-18T21:31:31,496][WARN ][o.e.c.a.s.ShardStateAction] [node_t0] [test][0] received shard failed for shard id [[test][0]], allocation id [gyqfvt5TQ2iaA5T66PDmpw], primary term [0], message [failed to create shard], failure [IOException[failed to obtain in-memory shard lock]; nested: ShardLockObtainFailedException[[test][0]: obtaining shard lock timed out after 5000ms]; ]
java.io.IOException: failed to obtain in-memory shard lock
    at org.elasticsearch.index.IndexService.createShard(IndexService.java:369) ~[main/:?]
    at org.elasticsearch.indices.IndicesService.createShard(IndicesService.java:513) ~[main/:?]
    at org.elasticsearch.indices.IndicesService.createShard(IndicesService.java:147) ~[main/:?]
    at org.elasticsearch.indices.cluster.IndicesClusterStateService.createShard(IndicesClusterStateService.java:539) ~[main/:?]
    at org.elasticsearch.indices.cluster.IndicesClusterStateService.createOrUpdateShards(IndicesClusterStateService.java:516) ~[main/:?]
    at org.elasticsearch.indices.cluster.IndicesClusterStateService.clusterChanged(IndicesClusterStateService.java:205) ~[main/:?]
    at org.elasticsearch.cluster.service.ClusterService.runTasksForExecutor(ClusterService.java:780) ~[main/:?]
    at org.elasticsearch.cluster.service.ClusterService$UpdateTask.run(ClusterService.java:965) ~[main/:?]
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:458) ~[main/:?]
    at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:238) ~[main/:?]
    at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:201) ~[main/:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_60]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_60]
    at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_60]
Caused by: org.elasticsearch.env.ShardLockObtainFailedException: [test][0]: obtaining shard lock timed out after 5000ms
    at org.elasticsearch.env.NodeEnvironment$InternalShardLock.acquire(NodeEnvironment.java:685) ~[main/:?]
    at org.elasticsearch.env.NodeEnvironment.shardLock(NodeEnvironment.java:604) ~[main/:?]
    at org.elasticsearch.index.IndexService.createShard(IndexService.java:298) ~[main/:?]
    ... 13 more
[2016-11-18T21:31:36,508][ERROR][o.e.g.TransportNodesListGatewayStartedShards] [node_t0] [test][0] unable to acquire shard lock
org.elasticsearch.env.ShardLockObtainFailedException: [test][0]: obtaining shard lock timed out after 5000ms
    at org.elasticsearch.env.NodeEnvironment$InternalShardLock.acquire(NodeEnvironment.java:685) ~[main/:?]
    at org.elasticsearch.env.NodeEnvironment.shardLock(NodeEnvironment.java:604) ~[main/:?]
    at org.elasticsearch.index.store.Store.tryOpenIndex(Store.java:418) ~[main/:?]
    at org.elasticsearch.gateway.TransportNodesListGatewayStartedShards.nodeOperation(TransportNodesListGatewayStartedShards.java:144) ~[main/:?]
    at org.elasticsearch.gateway.TransportNodesListGatewayStartedShards.nodeOperation(TransportNodesListGatewayStartedShards.java:61) ~[main/:?]
    at org.elasticsearch.action.support.nodes.TransportNodesAction.nodeOperation(TransportNodesAction.java:145) ~[main/:?]
    at org.elasticsearch.action.support.nodes.TransportNodesAction$NodeTransportHandler.messageReceived(TransportNodesAction.java:270) ~[main/:?]
    at org.elasticsearch.action.support.nodes.TransportNodesAction$NodeTransportHandler.messageReceived(TransportNodesAction.java:266) ~[main/:?]
    at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:66) ~[main/:?]
    at org.elasticsearch.transport.TransportService$6.doRun(TransportService.java:569) ~[main/:?]
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:527) ~[main/:?]
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[main/:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_60]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_60]
    at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_60]
[2016-11-18T21:33:51,399][INFO ][o.e.a.a.i.c.CreateIndexIT] ensureGreen timed out, cluster state:
cluster uuid: -69XXLcyTXyBzLCeq-LCkQ
version: 21
state uuid: H1q20JyeRv68cn1FlhZDGg
from_diff: false
meta data version: 9
   [test/EwgEjaZoRm6lRojAS1by4A]: v[8]
      0: p_term [6], isa_ids [gyqfvt5TQ2iaA5T66PDmpw]
nodes: 
   {node_t1}{k-ggqJ2JRi-ql8PwqzzPBg}{xIGyXW7dSEWuXsHS1f8pbA}{127.0.0.1}{127.0.0.1:9401}
   {node_t0}{Smft58veRkeHiF6nXwjwTg}{RLXdAPCgQm2IV1bb4I7Ndw}{127.0.0.1}{127.0.0.1:9400}, master
   {node_tc2}{gN9UaHrxRpS0T2AJ8hJuGQ}{wVYUDMB0TtCoR3gWIdWXcA}{127.0.0.1}{127.0.0.1:9402}
routing_table (version 18):
-- index [[test/EwgEjaZoRm6lRojAS1by4A]]
----shard_id [test][0]
--------[test][0], node[null], [P], recovery_source[existing recovery], s[UNASSIGNED], unassigned_info[[reason=ALLOCATION_FAILED], at[2016-11-18T19:31:31.499Z], failed_attempts[5], delayed=false, details[failed to create shard, failure IOException[failed to obtain in-memory shard lock]; nested: NotSerializableExceptionWrapper[shard_lock_obtain_failed_exception: [test][0]: obtaining shard lock timed out after 5000ms]; ], allocation_status[deciders_no]]

routing_nodes:
-----node_id[Smft58veRkeHiF6nXwjwTg][V]
-----node_id[k-ggqJ2JRi-ql8PwqzzPBg][V]
---- unassigned
--------[test][0], node[null], [P], recovery_source[existing recovery], s[UNASSIGNED], unassigned_info[[reason=ALLOCATION_FAILED], at[2016-11-18T19:31:31.499Z], failed_attempts[5], delayed=false, details[failed to create shard, failure IOException[failed to obtain in-memory shard lock]; nested: NotSerializableExceptionWrapper[shard_lock_obtain_failed_exception: [test][0]: obtaining shard lock timed out after 5000ms]; ], allocation_status[deciders_no]]

tasks: (0):

[2016-11-18T21:33:51,399][INFO ][o.e.a.a.i.c.CreateIndexIT] [CreateIndexIT#testWeirdScenario]: finished test
[2016-11-18T21:33:51,400][INFO ][o.e.a.a.i.c.CreateIndexIT] [CreateIndexIT#testWeirdScenario]: cleaning up after test
[2016-11-18T21:33:51,467][WARN ][o.e.i.IndicesService     ] [node_t0] org.elasticsearch.indices.IndicesService$$Lambda$1323/1008235371@45d780ea
org.elasticsearch.env.ShardLockObtainFailedException: [test][0]: obtaining shard lock timed out after 0ms
    at org.elasticsearch.env.NodeEnvironment$InternalShardLock.acquire(NodeEnvironment.java:685) ~[main/:?]
    at org.elasticsearch.env.NodeEnvironment.shardLock(NodeEnvironment.java:604) ~[main/:?]
    at org.elasticsearch.env.NodeEnvironment.lockAllForIndex(NodeEnvironment.java:550) ~[main/:?]
    at org.elasticsearch.env.NodeEnvironment.deleteIndexDirectorySafe(NodeEnvironment.java:501) ~[main/:?]
    at org.elasticsearch.indices.IndicesService.deleteIndexStoreIfDeletionAllowed(IndicesService.java:702) ~[main/:?]
    at org.elasticsearch.indices.IndicesService.deleteIndexStore(IndicesService.java:689) ~[main/:?]
    at org.elasticsearch.indices.IndicesService.deleteIndexStore(IndicesService.java:684) ~[main/:?]
    at org.elasticsearch.indices.IndicesService.deleteUnassignedIndex(IndicesService.java:652) ~[main/:?]
    at org.elasticsearch.indices.cluster.IndicesClusterStateService.deleteIndices(IndicesClusterStateService.java:264) ~[main/:?]
    at org.elasticsearch.indices.cluster.IndicesClusterStateService.clusterChanged(IndicesClusterStateService.java:193) ~[main/:?]
    at org.elasticsearch.cluster.service.ClusterService.runTasksForExecutor(ClusterService.java:780) ~[main/:?]
    at org.elasticsearch.cluster.service.ClusterService$UpdateTask.run(ClusterService.java:965) ~[main/:?]
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:458) ~[main/:?]
    at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:238) ~[main/:?]
    at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:201) ~[main/:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_60]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_60]
    at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_60]
[2016-11-18T21:33:51,506][INFO ][o.e.n.Node               ] [node_t0] stopping ...
[2016-11-18T21:33:51,509][INFO ][o.e.t.d.TestZenDiscovery ] [node_t1] master_left [{node_t0}{Smft58veRkeHiF6nXwjwTg}{RLXdAPCgQm2IV1bb4I7Ndw}{127.0.0.1}{127.0.0.1:9400}], reason [shut_down]
[2016-11-18T21:33:51,510][WARN ][o.e.t.d.TestZenDiscovery ] [node_t1] master left (reason = shut_down), current nodes: nodes: 
   {node_t1}{k-ggqJ2JRi-ql8PwqzzPBg}{xIGyXW7dSEWuXsHS1f8pbA}{127.0.0.1}{127.0.0.1:9401}, local
   {node_t0}{Smft58veRkeHiF6nXwjwTg}{RLXdAPCgQm2IV1bb4I7Ndw}{127.0.0.1}{127.0.0.1:9400}, master
   {node_tc2}{gN9UaHrxRpS0T2AJ8hJuGQ}{wVYUDMB0TtCoR3gWIdWXcA}{127.0.0.1}{127.0.0.1:9402}

[2016-11-18T21:33:51,511][INFO ][o.e.t.d.MockZenPing      ] [node_t1] pinging using mock zen ping
[2016-11-18T21:33:51,517][INFO ][o.e.t.d.TestZenDiscovery ] [node_tc2] master_left [{node_t0}{Smft58veRkeHiF6nXwjwTg}{RLXdAPCgQm2IV1bb4I7Ndw}{127.0.0.1}{127.0.0.1:9400}], reason [transport disconnected]
[2016-11-18T21:33:51,518][WARN ][o.e.t.d.TestZenDiscovery ] [node_tc2] master left (reason = transport disconnected), current nodes: nodes: 
   {node_t1}{k-ggqJ2JRi-ql8PwqzzPBg}{xIGyXW7dSEWuXsHS1f8pbA}{127.0.0.1}{127.0.0.1:9401}
   {node_t0}{Smft58veRkeHiF6nXwjwTg}{RLXdAPCgQm2IV1bb4I7Ndw}{127.0.0.1}{127.0.0.1:9400}, master
   {node_tc2}{gN9UaHrxRpS0T2AJ8hJuGQ}{wVYUDMB0TtCoR3gWIdWXcA}{127.0.0.1}{127.0.0.1:9402}, local

[2016-11-18T21:33:51,520][INFO ][o.e.t.d.MockZenPing      ] [node_tc2] pinging using mock zen ping
[2016-11-18T21:33:51,521][INFO ][o.e.n.Node               ] [node_t0] stopped
[2016-11-18T21:33:51,521][INFO ][o.e.n.Node               ] [node_t0] closing ...
[2016-11-18T21:33:54,516][INFO ][o.e.t.d.MockZenPing      ] [node_t1] pinging using mock zen ping
[2016-11-18T21:33:54,526][INFO ][o.e.t.d.MockZenPing      ] [node_tc2] pinging using mock zen ping
[2016-11-18T21:33:57,518][INFO ][o.e.t.d.MockZenPing      ] [node_t1] pinging using mock zen ping
[2016-11-18T21:33:57,530][INFO ][o.e.t.d.MockZenPing      ] [node_tc2] pinging using mock zen ping
[2016-11-18T21:34:00,107][WARN ][o.e.c.NodeConnectionsService] [node_t1] failed to connect to node {node_t0}{Smft58veRkeHiF6nXwjwTg}{RLXdAPCgQm2IV1bb4I7Ndw}{127.0.0.1}{127.0.0.1:9400} (tried [1] times)
org.elasticsearch.transport.ConnectTransportException: [node_t0][127.0.0.1:9400] general node connection failure
    at org.elasticsearch.transport.TcpTransport.connectToNode(TcpTransport.java:431) ~[main/:?]
    at org.elasticsearch.transport.TcpTransport.connectToNode(TcpTransport.java:387) ~[main/:?]
    at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:290) ~[main/:?]
    at org.elasticsearch.cluster.NodeConnectionsService.validateNodeConnected(NodeConnectionsService.java:113) ~[main/:?]
    at org.elasticsearch.cluster.NodeConnectionsService$ConnectionChecker.doRun(NodeConnectionsService.java:142) ~[main/:?]
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:527) ~[main/:?]
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[main/:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_60]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_60]
    at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_60]
Caused by: java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:1.8.0_60]
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) ~[?:1.8.0_60]
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) ~[?:1.8.0_60]
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) ~[?:1.8.0_60]
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:1.8.0_60]
    at java.net.Socket.connect(Socket.java:589) ~[?:1.8.0_60]
    at org.elasticsearch.transport.MockTcpTransport.connectToChannels(MockTcpTransport.java:189) ~[main/:?]
    at org.elasticsearch.transport.TcpTransport.connectToNode(TcpTransport.java:413) ~[main/:?]
    ... 9 more
[2016-11-18T21:34:00,191][WARN ][o.e.c.NodeConnectionsService] [node_tc2] failed to connect to node {node_t0}{Smft58veRkeHiF6nXwjwTg}{RLXdAPCgQm2IV1bb4I7Ndw}{127.0.0.1}{127.0.0.1:9400} (tried [1] times)
org.elasticsearch.transport.ConnectTransportException: [node_t0][127.0.0.1:9400] general node connection failure
    at org.elasticsearch.transport.TcpTransport.connectToNode(TcpTransport.java:431) ~[main/:?]
    at org.elasticsearch.transport.TcpTransport.connectToNode(TcpTransport.java:387) ~[main/:?]
    at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:290) ~[main/:?]
    at org.elasticsearch.cluster.NodeConnectionsService.validateNodeConnected(NodeConnectionsService.java:113) ~[main/:?]
    at org.elasticsearch.cluster.NodeConnectionsService$ConnectionChecker.doRun(NodeConnectionsService.java:142) ~[main/:?]
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:527) ~[main/:?]
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[main/:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_60]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_60]
    at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_60]
Caused by: java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:1.8.0_60]
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) ~[?:1.8.0_60]
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) ~[?:1.8.0_60]
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) ~[?:1.8.0_60]
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:1.8.0_60]
    at java.net.Socket.connect(Socket.java:589) ~[?:1.8.0_60]
    at org.elasticsearch.transport.MockTcpTransport.connectToChannels(MockTcpTransport.java:189) ~[main/:?]
    at org.elasticsearch.transport.TcpTransport.connectToNode(TcpTransport.java:413) ~[main/:?]
    ... 9 more
[2016-11-18T21:34:00,521][INFO ][o.e.t.d.MockZenPing      ] [node_t1] pinging using mock zen ping
[2016-11-18T21:34:00,537][INFO ][o.e.t.d.MockZenPing      ] [node_tc2] pinging using mock zen ping
[2016-11-18T21:34:01,534][INFO ][o.e.n.Node               ] [node_t0] closed
[2016-11-18T21:34:01,534][WARN ][o.e.i.c.IndicesClusterStateService] [node_t0] [[test/EwgEjaZoRm6lRojAS1by4A]] failed to complete pending deletion for index
org.elasticsearch.env.ShardLockObtainFailedException: [test][0]: thread interrupted while trying to obtain shard lock
    at org.elasticsearch.env.NodeEnvironment$InternalShardLock.acquire(NodeEnvironment.java:690) ~[main/:?]
    at org.elasticsearch.env.NodeEnvironment.shardLock(NodeEnvironment.java:604) ~[main/:?]
    at org.elasticsearch.env.NodeEnvironment.lockAllForIndex(NodeEnvironment.java:550) ~[main/:?]
    at org.elasticsearch.indices.IndicesService.processPendingDeletes(IndicesService.java:977) ~[main/:?]
    at org.elasticsearch.indices.cluster.IndicesClusterStateService$2.doRun(IndicesClusterStateService.java:295) ~[main/:?]
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:527) ~[main/:?]
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[main/:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_60]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_60]
    at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_60]
Caused by: java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1039) ~[?:1.8.0_60]
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328) ~[?:1.8.0_60]
    at java.util.concurrent.Semaphore.tryAcquire(Semaphore.java:409) ~[?:1.8.0_60]
    at org.elasticsearch.env.NodeEnvironment$InternalShardLock.acquire(NodeEnvironment.java:684) ~[main/:?]
    ... 9 more
[2016-11-18T21:34:01,541][INFO ][o.e.n.Node               ] [node_t1] stopping ...
[2016-11-18T21:34:01,541][INFO ][o.e.t.d.MockZenPing      ] [node_tc2] pinging using mock zen ping
[2016-11-18T21:34:01,555][INFO ][o.e.n.Node               ] [node_t1] stopped
[2016-11-18T21:34:01,555][INFO ][o.e.n.Node               ] [node_t1] closing ...
[2016-11-18T21:34:01,559][INFO ][o.e.n.Node               ] [node_t1] closed
[2016-11-18T21:34:01,562][INFO ][o.e.n.Node               ] [node_tc2] stopping ...
[2016-11-18T21:34:01,567][INFO ][o.e.n.Node               ] [node_tc2] stopped
[2016-11-18T21:34:01,567][INFO ][o.e.n.Node               ] [node_tc2] closing ...
[2016-11-18T21:34:01,569][INFO ][o.e.n.Node               ] [node_tc2] closed
[2016-11-18T21:34:01,569][INFO ][o.e.a.a.i.c.CreateIndexIT] [CreateIndexIT#testWeirdScenario]: cleaned up after test

java.lang.AssertionError: timed out waiting for green state

    at __randomizedtesting.SeedInfo.seed([9FA981A46E11DF3D:271547EC56C03403]:0)
    at org.junit.Assert.fail(Assert.java:88)
    at org.elasticsearch.test.ESIntegTestCase.ensureColor(ESIntegTestCase.java:925)
    at org.elasticsearch.test.ESIntegTestCase.ensureGreen(ESIntegTestCase.java:891)
    at org.elasticsearch.action.admin.indices.create.CreateIndexIT.testWeirdScenario(CreateIndexIT.java:98)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
    at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:811)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:462)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
    at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
    at java.lang.Thread.run(Thread.java:745)

REPRODUCE WITH: gradle null -Dtests.seed=9FA981A46E11DF3D -Dtests.class=org.elasticsearch.action.admin.indices.create.CreateIndexIT -Dtests.method="testWeirdScenario" -Dtests.locale=en-AU -Dtests.timezone=Europe/Istanbul
NOTE: leaving temporary files on disk at: /private/var/folders/68/3gzf12zs4qb0q_gfjw5lx1fm0000gn/T/org.elasticsearch.action.admin.indices.create.CreateIndexIT_9FA981A46E11DF3D-001
NOTE: test params are: codec=Asserting(Lucene62): {}, docValues:{}, maxPointsInLeafNode=1098, maxMBSortInHeap=5.330974165069652, sim=ClassicSimilarity, locale=en-AU, timezone=Europe/Istanbul
NOTE: Mac OS X 10.12.1 x86_64/Oracle Corporation 1.8.0_60 (64-bit)/cpus=4,threads=1,free=211766840,total=311427072
NOTE: All tests run in this JVM: [CreateIndexIT]

Process finished with exit code 255

@abeyad

abeyad commented Nov 18, 2016

@ywelsch thanks for confirming

@ywelsch
Contributor Author

ywelsch commented Nov 18, 2016

Playing around with the above test case I noticed that ShardLockObtainFailedException was not serializable, which is fixed by d8a6b91.

@dakrone
Member

dakrone commented Nov 18, 2016

I noticed that ShardLockObtainFailedException was not serializable, which is fixed by d8a6b91.

Wouldn't that make this non-backport-able, since 5.0 won't have the serialization logic for this exception?

@ywelsch
Contributor Author

ywelsch commented Nov 18, 2016

Wouldn't that make this non-backport-able, since 5.0 won't have the serialization logic for this exception?

correct 😞 We need to rethink this. Any suggestions?

@bleskes
Contributor

bleskes commented Nov 19, 2016

Any suggestions?

I think we have 3 options

  1. Don't return the shard lock exception as a store exception, with the downsides you described.
  2. Build a BWC layer into NodeGatewayStartedShards#writeTo translating that exception into an IOException (and the same for the read side).
  3. Build a BWC layer into the exception handling logic, which we will need at some point anyway.

Given the time frame of this and the aim to have it in 5.1, I tend towards option 2.
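Option 2 can be sketched in isolation. The following is a hypothetical, simplified shim: the version ids and method names are made up for illustration and are not the actual Elasticsearch StreamOutput/Version API.

```java
import java.io.IOException;

// Illustrative stand-in for the real exception class.
class ShardLockObtainFailedException extends Exception {
    ShardLockObtainFailedException(String message) { super(message); }
}

public class BwcShim {
    static final int V_5_1_0 = 5010099; // illustrative internal version id

    // Write side: a 5.0 node cannot deserialize the new exception type,
    // so translate it into a plain IOException for older stream versions.
    static Exception toWireException(Exception e, int targetVersion) {
        if (e instanceof ShardLockObtainFailedException && targetVersion < V_5_1_0) {
            return new IOException(e.getMessage(), e);
        }
        return e;
    }

    public static void main(String[] args) {
        Exception old = toWireException(new ShardLockObtainFailedException("locked"), 5000099);
        Exception cur = toWireException(new ShardLockObtainFailedException("locked"), 5010099);
        if (!(old instanceof IOException)) throw new AssertionError("expected IOException for a 5.0 target");
        if (!(cur instanceof ShardLockObtainFailedException)) throw new AssertionError("expected original exception for a 5.1 target");
        System.out.println("ok");
    }
}
```

The read side would apply the mirror-image translation when deserializing from an older node.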

s1monw added a commit to s1monw/elasticsearch that referenced this pull request Nov 21, 2016
Today it's not possible to add exceptions to the serialization layer
without breaking BWC. This commit adds the ability to specify the Version
an exception was added in, which allows falling back to NotSerializableExceptionWrapper
if the exception is not present in the stream's version.

Relates to elastic#21656
s1monw added a commit that referenced this pull request Nov 21, 2016
Today it's not possible to add exceptions to the serialization layer
without breaking BWC. This commit adds the ability to specify the Version
an exception was added in, which allows falling back to NotSerializableExceptionWrapper
if the exception is not present in the stream's version.

Relates to #21656
s1monw added a commit that referenced this pull request Nov 21, 2016
Today it's not possible to add exceptions to the serialization layer
without breaking BWC. This commit adds the ability to specify the Version
an exception was added in, which allows falling back to NotSerializableExceptionWrapper
if the exception is not present in the stream's version.

Relates to #21656
@ywelsch ywelsch force-pushed the fix/gracefully-handle-shardlockobtainfailedexception branch from d8a6b91 to c4a123b Compare November 21, 2016 12:11
@ywelsch
Contributor Author

ywelsch commented Nov 21, 2016

@s1monw has added a BWC layer for exceptions in #21694. I've rebased this PR to include his changes, so it's ready for review again.

@ywelsch
Contributor Author

ywelsch commented Nov 22, 2016

Can I get another review on this?

ywelsch pushed a commit that referenced this pull request Nov 22, 2016
Today it's not possible to add exceptions to the serialization layer
without breaking BWC. This commit adds the ability to specify the Version
an exception was added in, which allows falling back to NotSerializableExceptionWrapper
if the exception is not present in the stream's version.

Relates to #21656
Contributor

@bleskes bleskes left a comment


Code LGTM (left some nits). I do think we need to beef up the test a bit.

if (inSyncAllocationIds.contains(allocationId)) {
if (nodeShardState.primary()) {
// put shards that were primary before and that didn't throw a ShardLockObtainFailedException first
if (nodeShardState.primary() && nodeShardState.storeException() == null) {
Contributor

can we assert that the store exception is what we expect it to be?

} else if (matchAnyShard) {
if (nodeShardState.primary()) {
// put shards that were primary before and that didn't throw a ShardLockObtainFailedException first
if (nodeShardState.primary() && nodeShardState.storeException() == null) {
Contributor

same request for assertion.
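The ordering these two branches implement (former primaries whose store opened cleanly first, then former primaries whose store is locked, then everything else) can be sketched as a rank comparator. This is an illustrative reconstruction, not the PR's actual code; `ShardState` stands in for `NodeGatewayStartedShards`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class RankSketch {
    // Illustrative stand-in for NodeGatewayStartedShards.
    static class ShardState {
        final boolean primary;
        final boolean hasStoreException;
        ShardState(boolean primary, boolean hasStoreException) {
            this.primary = primary;
            this.hasStoreException = hasStoreException;
        }
        // Lower rank = preferred allocation target.
        int rank() {
            if (primary && !hasStoreException) return 0; // clean former primary
            if (primary) return 1;                       // former primary, store locked
            return 2;                                    // everything else
        }
    }

    public static void main(String[] args) {
        List<ShardState> states = new ArrayList<>(Arrays.asList(
            new ShardState(false, false),
            new ShardState(true, true),
            new ShardState(true, false)));
        states.sort(Comparator.comparingInt(ShardState::rank));
        if (!(states.get(0).primary && !states.get(0).hasStoreException)) throw new AssertionError();
        if (!(states.get(1).primary && states.get(1).hasStoreException)) throw new AssertionError();
        System.out.println("clean primary preferred, locked primary second");
    }
}
```

The locked copy thus remains a possible allocation target, just with the least priority among former primaries.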

if (nodeShardState.storeException() instanceof ShardLockObtainFailedException) {
logger.trace((Supplier<?>) () -> new ParameterizedMessage("[{}] on node [{}] has version [{}] but the store can not be opened as it's locked, treating as valid shard", shard, nodeShardState.getNode(), finalVersion), nodeShardState.storeException());
if (nodeShardState.allocationId() != null) {
version = Long.MAX_VALUE; // shard was already selected in a 5.x cluster as primary, prefer this shard copy again.
Contributor

did you run into this being a problem? how can we open an index that was created before 5.0.0 and never had in-sync replicas but does have an allocationId? the only thing I can think of is a node network issue during shard initialization. I'm wondering if we need to optimize for this or just keep this code simple (i.e., demote shards with a lock exception)

Contributor Author

We special-case this a few lines up as well (but it's not easy to do code reuse across those lines). For symmetry reasons I have kept it as is. The code is documented as well.

Contributor

fair enough
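The ordering discussed in this thread can be sketched in simplified form: copies that hit a lock exception remain allocation candidates, but are demoted behind clean copies. This is not the actual `PrimaryShardAllocator` code, just a minimal illustration of the priority scheme under assumed stand-in types.

```java
import java.util.Comparator;
import java.util.List;

// Simplified sketch of the demotion logic: clean former primaries first,
// other clean copies next, lock-failed copies last.
public class ShardCopyOrderingSketch {
    record ShardCopy(String node, boolean wasPrimary, boolean lockFailed) {}

    static List<ShardCopy> order(List<ShardCopy> copies) {
        return copies.stream()
            .sorted(Comparator
                // clean former primaries sort to the front ...
                .comparing((ShardCopy c) -> c.wasPrimary() && !c.lockFailed()).reversed()
                // ... and lock-failed copies sort to the back
                .thenComparing(ShardCopy::lockFailed))
            .toList();
    }

    public static void main(String[] args) {
        List<ShardCopy> ordered = order(List.of(
            new ShardCopy("n1", false, true),
            new ShardCopy("n2", true, false),
            new ShardCopy("n3", false, false)));
        ordered.forEach(c -> System.out.println(c.node())); // n2, n3, n1
    }
}
```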


/**
* Tests that when the node returns a ShardLockObtainFailedException, it will be considered as a valid shard copy
*/
Contributor

can we add tests for cases with more shards? for example the case where we "prefer" other shard copies over ones with this exception?

Contributor Author

Sure, I've added more tests

@ywelsch
Contributor Author

ywelsch commented Nov 22, 2016

@bleskes I've pushed 3e64d74 addressing comments.

}
}

List<NodeGatewayStartedShards> nodeShardStates = new ArrayList<>();
Contributor

yay!

if (matchAnyShard) {
// prefer shards with matching allocation ids
Comparator<NodeGatewayStartedShards> matchingAllocationsFirst = Comparator.comparing(
(NodeGatewayStartedShards state) -> inSyncAllocationIds.contains(state.allocationId())).reversed();
Contributor

fancy pants.
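The pattern in the snippet above relies on `Boolean`'s natural ordering (false before true), which is why `.reversed()` is needed to move matching entries to the front. A minimal self-contained demo of the same idiom, using made-up allocation ids:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Set;

// Boolean's natural order puts false before true, so a plain
// Comparator.comparing would sort matches last; .reversed() fixes that.
// The sort is stable, so non-matching ids keep their relative order.
public class MatchFirstDemo {
    public static void main(String[] args) {
        Set<String> inSyncIds = Set.of("a2");
        List<String> allocationIds = new ArrayList<>(List.of("a1", "a2", "a3"));

        Comparator<String> matchingFirst = Comparator
            .comparing((String id) -> inSyncIds.contains(id)).reversed();
        allocationIds.sort(matchingFirst);

        System.out.println(allocationIds); // [a2, a1, a3]
    }
}
```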


Contributor

@bleskes bleskes left a comment

LGTM. Thanks @ywelsch

@ywelsch ywelsch merged commit a446557 into elastic:master Nov 22, 2016
@ywelsch
Contributor Author

ywelsch commented Nov 22, 2016

Thanks @bleskes @abeyad @dakrone

ywelsch added a commit that referenced this pull request Nov 22, 2016
…ked during shard state fetching (#21656)

PR #19416 added a safety mechanism to shard state fetching to only access the store when the shard lock can be acquired. This can lead to the following situation however where a shard has not fully shut down yet while the shard fetching is going on, resulting in a ShardLockObtainFailedException. PrimaryShardAllocator that decides where to allocate primary shards sees this exception and treats the shard as unusable. If this is the only shard copy in the cluster, the cluster stays red and a new shard fetching cycle will not be triggered as shard state fetching treats exceptions while opening the store as permanent failures.

This commit makes it so that PrimaryShardAllocator treats the locked shard as a possible allocation target (although with the least priority).
ywelsch added a commit that referenced this pull request Nov 22, 2016
…ked during shard state fetching (#21656)

PR #19416 added a safety mechanism to shard state fetching to only access the store when the shard lock can be acquired. This can lead to the following situation however where a shard has not fully shut down yet while the shard fetching is going on, resulting in a ShardLockObtainFailedException. PrimaryShardAllocator that decides where to allocate primary shards sees this exception and treats the shard as unusable. If this is the only shard copy in the cluster, the cluster stays red and a new shard fetching cycle will not be triggered as shard state fetching treats exceptions while opening the store as permanent failures.

This commit makes it so that PrimaryShardAllocator treats the locked shard as a possible allocation target (although with the least priority).
@lcawl lcawl added :Distributed Indexing/Distributed A catch all label for anything in the Distributed Indexing Area. Please avoid if you can. and removed :Allocation labels Feb 13, 2018

Labels

blocker >bug critical :Distributed Indexing/Distributed A catch all label for anything in the Distributed Indexing Area. Please avoid if you can. v5.0.2 v5.1.1 v6.0.0-alpha1
