5.6.10 to 6.3.0 rolling upgrade broken with 'commit doesn't contain history uuid' when a synced flush is performed #31482

@praseodym


A rolling upgrade of an Elasticsearch 5.6.10 cluster to version 6.3.0 fails with java.lang.IllegalStateException: commit doesn't contain history uuid when a synced flush (_flush/synced) is performed beforehand, as the rolling upgrade documentation instructs.

Steps to reproduce:

  1. Start multi-node 5.6.10 cluster
  2. Index some data
  3. Disable shard allocation
  4. Perform a synced flush
  5. Shut down and upgrade one of the nodes
  6. Reenable shard allocation
  7. Node joins the cluster, but shard recoveries onto it fail and it never becomes fully operational

I cannot reproduce the problem without performing the synced flush. I think this problem could have been introduced in #28245.
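
To check whether a shard copy's commit carries the key the exception complains about, something like the following could be used. It is only a sketch: it assumes that the shard-level indices stats response (_stats?level=shards) exposes each commit's user data, and count_commit_key is a hypothetical helper name, not part of Elasticsearch.

```shell
#!/bin/bash
# Hypothetical helper: count how often a commit user-data key (for example
# history_uuid or sync_id) appears in a JSON document read from stdin.
count_commit_key() {
  # grep -o prints one line per match; "|| true" keeps a no-match from
  # failing the pipeline when pipefail is enabled.
  { grep -o "\"$1\"[[:space:]]*:" || true; } | wc -l | tr -d ' '
}

# Assumed usage against the reproduction cluster (needs a running node):
#   curl -s '127.0.0.1:9200/shakespeare/_stats?level=shards' | count_commit_key history_uuid
#   curl -s '127.0.0.1:9200/shakespeare/_stats?level=shards' | count_commit_key sync_id
```

If the assumption about the stats output holds, a history_uuid count of 0 next to a non-zero sync_id count after the synced flush would be consistent with the exception above.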

Reproduction script (takes about a minute to reproduce the issue):
#!/bin/bash
set -ex

# Setup
docker rm -f es1 || true
docker rm -f es2 || true
docker network inspect es || docker network create es
rm -rf /tmp/esdata
mkdir -p /tmp/esdata/data1 /tmp/esdata/data2 /tmp/esdata/snapshot
sudo chown -R 1000:1000 /tmp/esdata
sudo sysctl -w vm.max_map_count=262144

# Start two-node Elasticsearch 5.6.10 cluster
docker run -d --name es1 --net es -v /tmp/esdata/data1:/usr/share/elasticsearch/data -v /tmp/esdata/snapshot:/snapshot -e path.repo=/snapshot -e xpack.security.enabled=false -e discovery.zen.ping.unicast.hosts=es2 -p 127.0.0.1:9200:9200 docker.elastic.co/elasticsearch/elasticsearch:5.6.10
docker run -d --name es2 --net es -v /tmp/esdata/data2:/usr/share/elasticsearch/data -v /tmp/esdata/snapshot:/snapshot -e path.repo=/snapshot -e xpack.security.enabled=false -e discovery.zen.ping.unicast.hosts=es1 -p 127.0.0.1:9201:9200 docker.elastic.co/elasticsearch/elasticsearch:5.6.10
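# Quote the URL so the shell does not treat "?" as a glob, and use
# --check-status so a health-check timeout (HTTP 408) keeps the loop waiting
while ! http --check-status '127.0.0.1:9200/_cluster/health?wait_for_status=green'; do sleep 1; done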

# Index some sample data
curl https://download.elastic.co/demos/kibana/gettingstarted/shakespeare_6.0.json | curl -H 'Content-Type: application/x-ndjson' -XPOST '127.0.0.1:9200/shakespeare/doc/_bulk?pretty' --data-binary @-

# Perform rolling upgrade to 6.3.0 according to docs at
# https://www.elastic.co/guide/en/elasticsearch/reference/current/rolling-upgrades.html

# Step 1: disable shard allocation
http PUT 127.0.0.1:9200/_cluster/settings persistent:='{"cluster.routing.allocation.enable": "none"}'

# Step 2: stop non-essential indexing and perform a synced flush
# Without this step, the upgrade goes well!
http POST 127.0.0.1:9200/_flush/synced

# Step 4: shut down a single node
docker stop es2
docker rm es2

# Step 5, 7: upgrade and start that node
docker run -d --name es2 --net es -v /tmp/esdata/data2:/usr/share/elasticsearch/data -v /tmp/esdata/snapshot:/snapshot -e path.repo=/snapshot -e discovery.zen.ping.unicast.hosts=es1 -p 127.0.0.1:9201:9200 docker.elastic.co/elasticsearch/elasticsearch:6.3.0
while ! http 127.0.0.1:9201; do sleep 1; done

# Step 8: reenable shard allocation
http --check-status PUT 127.0.0.1:9200/_cluster/settings persistent:='{"cluster.routing.allocation.enable": null}'

# Watch mayhem ensue
docker logs -f es2
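
Besides tailing the log, the failed recoveries should also show up in the shard table. A minimal sketch, assuming the cat shards API is queried with h=index,shard,prirep,state,node so that the state is the fourth column; summarize_shard_states is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: read _cat/shards rows (index shard prirep state node)
# from stdin and print "<state> <count>" for each state, sorted by state.
summarize_shard_states() {
  awk 'NF >= 4 { count[$4]++ } END { for (s in count) print s, count[s] }' | sort
}

# Assumed usage against the reproduction cluster (needs a running node):
#   curl -s '127.0.0.1:9200/_cat/shards?h=index,shard,prirep,state,node' | summarize_shard_states
```

Replica rows that stay in a state other than STARTED after allocation is reenabled would match the repeated "failed recovery" warnings in the log.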

Log including stack traces from the upgraded node:
OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
[2018-06-20T21:38:02,917][INFO ][o.e.n.Node               ] [] initializing ...
[2018-06-20T21:38:02,958][INFO ][o.e.e.NodeEnvironment    ] [uLAJsY1] using [1] data paths, mounts [[/usr/share/elasticsearch/data (tmpfs)]], net usable_space [15.6gb], net total_space [15.7gb], types [tmpfs]
[2018-06-20T21:38:02,959][INFO ][o.e.e.NodeEnvironment    ] [uLAJsY1] heap size [989.8mb], compressed ordinary object pointers [true]
[2018-06-20T21:38:02,972][INFO ][o.e.n.Node               ] [uLAJsY1] node name derived from node ID [uLAJsY1xT5yhCUzAvNa8ag]; set [node.name] to override
[2018-06-20T21:38:02,972][INFO ][o.e.n.Node               ] [uLAJsY1] version[6.3.0], pid[1], build[default/tar/424e937/2018-06-11T23:38:03.357887Z], OS[Linux/4.17.2-1-ARCH/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/10.0.1/10.0.1+10]
[2018-06-20T21:38:02,972][INFO ][o.e.n.Node               ] [uLAJsY1] JVM arguments [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Djava.io.tmpdir=/tmp/elasticsearch.jX5EEUqv, -XX:+HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath=data, -XX:ErrorFile=logs/hs_err_pid%p.log, -Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m, -Djava.locale.providers=COMPAT, -Des.cgroups.hierarchy.override=/, -Des.path.home=/usr/share/elasticsearch, -Des.path.conf=/usr/share/elasticsearch/config, -Des.distribution.flavor=default, -Des.distribution.type=tar]
[2018-06-20T21:38:04,206][INFO ][o.e.p.PluginsService     ] [uLAJsY1] loaded module [aggs-matrix-stats]
[2018-06-20T21:38:04,206][INFO ][o.e.p.PluginsService     ] [uLAJsY1] loaded module [analysis-common]
[2018-06-20T21:38:04,207][INFO ][o.e.p.PluginsService     ] [uLAJsY1] loaded module [ingest-common]
[2018-06-20T21:38:04,207][INFO ][o.e.p.PluginsService     ] [uLAJsY1] loaded module [lang-expression]
[2018-06-20T21:38:04,207][INFO ][o.e.p.PluginsService     ] [uLAJsY1] loaded module [lang-mustache]
[2018-06-20T21:38:04,207][INFO ][o.e.p.PluginsService     ] [uLAJsY1] loaded module [lang-painless]
[2018-06-20T21:38:04,207][INFO ][o.e.p.PluginsService     ] [uLAJsY1] loaded module [mapper-extras]
[2018-06-20T21:38:04,207][INFO ][o.e.p.PluginsService     ] [uLAJsY1] loaded module [parent-join]
[2018-06-20T21:38:04,207][INFO ][o.e.p.PluginsService     ] [uLAJsY1] loaded module [percolator]
[2018-06-20T21:38:04,207][INFO ][o.e.p.PluginsService     ] [uLAJsY1] loaded module [rank-eval]
[2018-06-20T21:38:04,207][INFO ][o.e.p.PluginsService     ] [uLAJsY1] loaded module [reindex]
[2018-06-20T21:38:04,207][INFO ][o.e.p.PluginsService     ] [uLAJsY1] loaded module [repository-url]
[2018-06-20T21:38:04,207][INFO ][o.e.p.PluginsService     ] [uLAJsY1] loaded module [transport-netty4]
[2018-06-20T21:38:04,207][INFO ][o.e.p.PluginsService     ] [uLAJsY1] loaded module [tribe]
[2018-06-20T21:38:04,207][INFO ][o.e.p.PluginsService     ] [uLAJsY1] loaded module [x-pack-core]
[2018-06-20T21:38:04,207][INFO ][o.e.p.PluginsService     ] [uLAJsY1] loaded module [x-pack-deprecation]
[2018-06-20T21:38:04,207][INFO ][o.e.p.PluginsService     ] [uLAJsY1] loaded module [x-pack-graph]
[2018-06-20T21:38:04,207][INFO ][o.e.p.PluginsService     ] [uLAJsY1] loaded module [x-pack-logstash]
[2018-06-20T21:38:04,208][INFO ][o.e.p.PluginsService     ] [uLAJsY1] loaded module [x-pack-ml]
[2018-06-20T21:38:04,208][INFO ][o.e.p.PluginsService     ] [uLAJsY1] loaded module [x-pack-monitoring]
[2018-06-20T21:38:04,208][INFO ][o.e.p.PluginsService     ] [uLAJsY1] loaded module [x-pack-rollup]
[2018-06-20T21:38:04,208][INFO ][o.e.p.PluginsService     ] [uLAJsY1] loaded module [x-pack-security]
[2018-06-20T21:38:04,208][INFO ][o.e.p.PluginsService     ] [uLAJsY1] loaded module [x-pack-sql]
[2018-06-20T21:38:04,208][INFO ][o.e.p.PluginsService     ] [uLAJsY1] loaded module [x-pack-upgrade]
[2018-06-20T21:38:04,208][INFO ][o.e.p.PluginsService     ] [uLAJsY1] loaded module [x-pack-watcher]
[2018-06-20T21:38:04,208][INFO ][o.e.p.PluginsService     ] [uLAJsY1] loaded plugin [ingest-geoip]
[2018-06-20T21:38:04,208][INFO ][o.e.p.PluginsService     ] [uLAJsY1] loaded plugin [ingest-user-agent]
[2018-06-20T21:38:06,118][INFO ][o.e.x.s.a.s.FileRolesStore] [uLAJsY1] parsed [0] roles from file [/usr/share/elasticsearch/config/roles.yml]
[2018-06-20T21:38:06,428][INFO ][o.e.x.m.j.p.l.CppLogMessageHandler] [controller/172] [Main.cc@109] controller (64 bit): Version 6.3.0 (Build 0f0a34c67965d7) Copyright (c) 2018 Elasticsearch BV
[2018-06-20T21:38:06,632][WARN ][o.e.d.c.m.IndexTemplateMetaData] Deprecated field [template] used, replaced by [index_patterns]
[2018-06-20T21:38:06,634][WARN ][o.e.d.c.m.IndexTemplateMetaData] Deprecated field [template] used, replaced by [index_patterns]
[2018-06-20T21:38:06,640][WARN ][o.e.d.c.m.IndexTemplateMetaData] Deprecated field [template] used, replaced by [index_patterns]
[2018-06-20T21:38:06,641][WARN ][o.e.d.c.m.IndexTemplateMetaData] Deprecated field [template] used, replaced by [index_patterns]
[2018-06-20T21:38:06,643][WARN ][o.e.d.c.m.IndexTemplateMetaData] Deprecated field [template] used, replaced by [index_patterns]
[2018-06-20T21:38:06,644][WARN ][o.e.d.c.m.IndexTemplateMetaData] Deprecated field [template] used, replaced by [index_patterns]
[2018-06-20T21:38:06,644][WARN ][o.e.d.c.m.IndexTemplateMetaData] Deprecated field [template] used, replaced by [index_patterns]
[2018-06-20T21:38:06,644][WARN ][o.e.d.c.m.IndexTemplateMetaData] Deprecated field [template] used, replaced by [index_patterns]
[2018-06-20T21:38:06,645][WARN ][o.e.d.c.m.IndexTemplateMetaData] Deprecated field [template] used, replaced by [index_patterns]
[2018-06-20T21:38:06,646][WARN ][o.e.d.c.m.IndexTemplateMetaData] Deprecated field [template] used, replaced by [index_patterns]
[2018-06-20T21:38:06,647][WARN ][o.e.d.c.m.IndexTemplateMetaData] Deprecated field [template] used, replaced by [index_patterns]
[2018-06-20T21:38:06,648][WARN ][o.e.d.c.m.IndexTemplateMetaData] Deprecated field [template] used, replaced by [index_patterns]
[2018-06-20T21:38:06,650][WARN ][o.e.d.c.m.IndexTemplateMetaData] Deprecated field [template] used, replaced by [index_patterns]
[2018-06-20T21:38:06,865][INFO ][o.e.d.DiscoveryModule    ] [uLAJsY1] using discovery type [zen]
[2018-06-20T21:38:07,373][INFO ][o.e.n.Node               ] [uLAJsY1] initialized
[2018-06-20T21:38:07,373][INFO ][o.e.n.Node               ] [uLAJsY1] starting ...
[2018-06-20T21:38:07,481][INFO ][o.e.t.TransportService   ] [uLAJsY1] publish_address {172.19.0.3:9300}, bound_addresses {0.0.0.0:9300}
[2018-06-20T21:38:07,497][INFO ][o.e.b.BootstrapChecks    ] [uLAJsY1] bound or publishing to a non-loopback address, enforcing bootstrap checks
[2018-06-20T21:38:10,646][INFO ][o.e.c.s.ClusterApplierService] [uLAJsY1] detected_master {4E_A_7z}{4E_A_7zATUu6ebxzJFhMrg}{JxDu4xcyTWKdshEZqUgKQw}{172.19.0.2}{172.19.0.2:9300}{ml.max_open_jobs=10, ml.enabled=true}, added {{4E_A_7z}{4E_A_7zATUu6ebxzJFhMrg}{JxDu4xcyTWKdshEZqUgKQw}{172.19.0.2}{172.19.0.2:9300}{ml.max_open_jobs=10, ml.enabled=true},}, reason: apply cluster state (from master [master {4E_A_7z}{4E_A_7zATUu6ebxzJFhMrg}{JxDu4xcyTWKdshEZqUgKQw}{172.19.0.2}{172.19.0.2:9300}{ml.max_open_jobs=10, ml.enabled=true} committed version [36]])
[2018-06-20T21:38:10,651][INFO ][o.e.c.s.ClusterSettings  ] [uLAJsY1] updating [cluster.routing.allocation.enable] from [all] to [none]
[2018-06-20T21:38:10,827][WARN ][o.e.x.s.a.s.m.NativeRoleMappingStore] [uLAJsY1] Failed to clear cache for realms [[]]
[2018-06-20T21:38:10,837][INFO ][o.e.l.LicenseService     ] [uLAJsY1] license [3d2953c0-7b27-4738-861b-091c92a4fd31] mode [trial] - valid
[2018-06-20T21:38:10,865][INFO ][o.e.x.s.t.n.SecurityNetty4HttpServerTransport] [uLAJsY1] publish_address {172.19.0.3:9200}, bound_addresses {0.0.0.0:9200}
[2018-06-20T21:38:10,865][INFO ][o.e.n.Node               ] [uLAJsY1] started
[2018-06-20T21:38:10,894][INFO ][o.e.x.m.e.l.LocalExporter] waiting for elected master node [{4E_A_7z}{4E_A_7zATUu6ebxzJFhMrg}{JxDu4xcyTWKdshEZqUgKQw}{172.19.0.2}{172.19.0.2:9300}{ml.max_open_jobs=10, ml.enabled=true}] to setup local exporter [default_local] (does it have x-pack installed?)
[2018-06-20T21:38:10,925][INFO ][o.e.x.m.e.l.LocalExporter] waiting for elected master node [{4E_A_7z}{4E_A_7zATUu6ebxzJFhMrg}{JxDu4xcyTWKdshEZqUgKQw}{172.19.0.2}{172.19.0.2:9300}{ml.max_open_jobs=10, ml.enabled=true}] to setup local exporter [default_local] (does it have x-pack installed?)
[2018-06-20T21:38:10,954][INFO ][o.e.x.m.e.l.LocalExporter] waiting for elected master node [{4E_A_7z}{4E_A_7zATUu6ebxzJFhMrg}{JxDu4xcyTWKdshEZqUgKQw}{172.19.0.2}{172.19.0.2:9300}{ml.max_open_jobs=10, ml.enabled=true}] to setup local exporter [default_local] (does it have x-pack installed?)
[2018-06-20T21:38:11,381][INFO ][o.e.c.s.ClusterSettings  ] [uLAJsY1] updating [cluster.routing.allocation.enable] from [none] to [all]
[2018-06-20T21:38:11,392][INFO ][o.e.x.m.e.l.LocalExporter] waiting for elected master node [{4E_A_7z}{4E_A_7zATUu6ebxzJFhMrg}{JxDu4xcyTWKdshEZqUgKQw}{172.19.0.2}{172.19.0.2:9300}{ml.max_open_jobs=10, ml.enabled=true}] to setup local exporter [default_local] (does it have x-pack installed?)
[2018-06-20T21:38:11,529][INFO ][o.e.x.m.e.l.LocalExporter] waiting for elected master node [{4E_A_7z}{4E_A_7zATUu6ebxzJFhMrg}{JxDu4xcyTWKdshEZqUgKQw}{172.19.0.2}{172.19.0.2:9300}{ml.max_open_jobs=10, ml.enabled=true}] to setup local exporter [default_local] (does it have x-pack installed?)
[2018-06-20T21:38:11,592][WARN ][o.e.i.c.IndicesClusterStateService] [uLAJsY1] [[shakespeare][0]] marking and sending shard failed due to [failed recovery]
org.elasticsearch.indices.recovery.RecoveryFailedException: [shakespeare][0]: Recovery failed from {4E_A_7z}{4E_A_7zATUu6ebxzJFhMrg}{JxDu4xcyTWKdshEZqUgKQw}{172.19.0.2}{172.19.0.2:9300}{ml.max_open_jobs=10, ml.enabled=true} into {uLAJsY1}{uLAJsY1xT5yhCUzAvNa8ag}{J4vNZ9OETdeO8pxepzmRHw}{172.19.0.3}{172.19.0.3:9300}{ml.machine_memory=33728278528, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}
	at org.elasticsearch.indices.recovery.PeerRecoveryTargetService.doRecovery(PeerRecoveryTargetService.java:282) [elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoveryTargetService.access$900(PeerRecoveryTargetService.java:80) [elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoveryTargetService$RecoveryRunner.doRun(PeerRecoveryTargetService.java:623) [elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:724) [elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.3.0.jar:6.3.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
	at java.lang.Thread.run(Thread.java:844) [?:?]
Caused by: org.elasticsearch.transport.RemoteTransportException: [4E_A_7z][172.19.0.2:9300][internal:index/shard/recovery/start_recovery]
Caused by: org.elasticsearch.index.engine.RecoveryEngineException: Phase[1] phase1 failed
	at org.elasticsearch.indices.recovery.RecoverySourceHandler.recoverToTarget(RecoverySourceHandler.java:140) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoverySourceService.recover(PeerRecoverySourceService.java:132) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoverySourceService.access$100(PeerRecoverySourceService.java:54) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoverySourceService$StartRecoveryTransportRequestHandler.messageReceived(PeerRecoverySourceService.java:141) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoverySourceService$StartRecoveryTransportRequestHandler.messageReceived(PeerRecoverySourceService.java:138) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:33) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:69) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1556) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:674) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.3.0.jar:6.3.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:?]
	at java.lang.Thread.run(Thread.java:748) ~[?:?]
Caused by: org.elasticsearch.indices.recovery.RecoverFilesRecoveryException: Failed to transfer [0] files with total size of [0b]
	at org.elasticsearch.indices.recovery.RecoverySourceHandler.phase1(RecoverySourceHandler.java:337) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.RecoverySourceHandler.recoverToTarget(RecoverySourceHandler.java:138) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoverySourceService.recover(PeerRecoverySourceService.java:132) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoverySourceService.access$100(PeerRecoverySourceService.java:54) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoverySourceService$StartRecoveryTransportRequestHandler.messageReceived(PeerRecoverySourceService.java:141) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoverySourceService$StartRecoveryTransportRequestHandler.messageReceived(PeerRecoverySourceService.java:138) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:33) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:69) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1556) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:674) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.3.0.jar:6.3.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:?]
	at java.lang.Thread.run(Thread.java:748) ~[?:?]
Caused by: org.elasticsearch.transport.RemoteTransportException: [uLAJsY1][172.19.0.3:9300][internal:index/shard/recovery/prepare_translog]
Caused by: java.lang.IllegalStateException: commit doesn't contain history uuid
	at org.elasticsearch.index.engine.InternalEngine.loadHistoryUUID(InternalEngine.java:493) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.index.engine.InternalEngine.<init>(InternalEngine.java:193) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.index.engine.InternalEngine.<init>(InternalEngine.java:157) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.index.engine.InternalEngineFactory.newReadWriteEngine(InternalEngineFactory.java:25) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.index.shard.IndexShard.newEngine(IndexShard.java:2152) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.index.shard.IndexShard.createNewEngine(IndexShard.java:2134) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.index.shard.IndexShard.innerOpenEngineAndTranslog(IndexShard.java:1341) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.index.shard.IndexShard.openEngineAndSkipTranslogRecovery(IndexShard.java:1305) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.RecoveryTarget.prepareForTranslogOperations(RecoveryTarget.java:366) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoveryTargetService$PrepareForTranslogOperationsRequestHandler.messageReceived(PeerRecoveryTargetService.java:403) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoveryTargetService$PrepareForTranslogOperationsRequestHandler.messageReceived(PeerRecoveryTargetService.java:397) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:30) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler$1.doRun(SecurityServerTransportInterceptor.java:246) ~[?:?]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler.messageReceived(SecurityServerTransportInterceptor.java:304) ~[?:?]
	at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:66) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1592) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:724) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.3.0.jar:6.3.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
	at java.lang.Thread.run(Thread.java:844) ~[?:?]
[2018-06-20T21:38:11,602][WARN ][o.e.i.c.IndicesClusterStateService] [uLAJsY1] [[.monitoring-es-6-2018.06.20][0]] marking and sending shard failed due to [failed recovery]
org.elasticsearch.indices.recovery.RecoveryFailedException: [.monitoring-es-6-2018.06.20][0]: Recovery failed from {4E_A_7z}{4E_A_7zATUu6ebxzJFhMrg}{JxDu4xcyTWKdshEZqUgKQw}{172.19.0.2}{172.19.0.2:9300}{ml.max_open_jobs=10, ml.enabled=true} into {uLAJsY1}{uLAJsY1xT5yhCUzAvNa8ag}{J4vNZ9OETdeO8pxepzmRHw}{172.19.0.3}{172.19.0.3:9300}{ml.machine_memory=33728278528, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}
	at org.elasticsearch.indices.recovery.PeerRecoveryTargetService.doRecovery(PeerRecoveryTargetService.java:282) [elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoveryTargetService.access$900(PeerRecoveryTargetService.java:80) [elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoveryTargetService$RecoveryRunner.doRun(PeerRecoveryTargetService.java:623) [elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:724) [elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.3.0.jar:6.3.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
	at java.lang.Thread.run(Thread.java:844) [?:?]
Caused by: org.elasticsearch.transport.RemoteTransportException: [4E_A_7z][172.19.0.2:9300][internal:index/shard/recovery/start_recovery]
Caused by: org.elasticsearch.index.engine.RecoveryEngineException: Phase[1] phase1 failed
	at org.elasticsearch.indices.recovery.RecoverySourceHandler.recoverToTarget(RecoverySourceHandler.java:140) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoverySourceService.recover(PeerRecoverySourceService.java:132) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoverySourceService.access$100(PeerRecoverySourceService.java:54) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoverySourceService$StartRecoveryTransportRequestHandler.messageReceived(PeerRecoverySourceService.java:141) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoverySourceService$StartRecoveryTransportRequestHandler.messageReceived(PeerRecoverySourceService.java:138) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:33) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:69) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1556) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:674) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.3.0.jar:6.3.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:?]
	at java.lang.Thread.run(Thread.java:748) ~[?:?]
Caused by: org.elasticsearch.indices.recovery.RecoverFilesRecoveryException: Failed to transfer [0] files with total size of [0b]
	at org.elasticsearch.indices.recovery.RecoverySourceHandler.phase1(RecoverySourceHandler.java:337) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.RecoverySourceHandler.recoverToTarget(RecoverySourceHandler.java:138) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoverySourceService.recover(PeerRecoverySourceService.java:132) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoverySourceService.access$100(PeerRecoverySourceService.java:54) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoverySourceService$StartRecoveryTransportRequestHandler.messageReceived(PeerRecoverySourceService.java:141) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoverySourceService$StartRecoveryTransportRequestHandler.messageReceived(PeerRecoverySourceService.java:138) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:33) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:69) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1556) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:674) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.3.0.jar:6.3.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:?]
	at java.lang.Thread.run(Thread.java:748) ~[?:?]
Caused by: org.elasticsearch.transport.RemoteTransportException: [uLAJsY1][172.19.0.3:9300][internal:index/shard/recovery/prepare_translog]
Caused by: java.lang.IllegalStateException: commit doesn't contain history uuid
	at org.elasticsearch.index.engine.InternalEngine.loadHistoryUUID(InternalEngine.java:493) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.index.engine.InternalEngine.<init>(InternalEngine.java:193) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.index.engine.InternalEngine.<init>(InternalEngine.java:157) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.index.engine.InternalEngineFactory.newReadWriteEngine(InternalEngineFactory.java:25) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.index.shard.IndexShard.newEngine(IndexShard.java:2152) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.index.shard.IndexShard.createNewEngine(IndexShard.java:2134) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.index.shard.IndexShard.innerOpenEngineAndTranslog(IndexShard.java:1341) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.index.shard.IndexShard.openEngineAndSkipTranslogRecovery(IndexShard.java:1305) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.RecoveryTarget.prepareForTranslogOperations(RecoveryTarget.java:366) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoveryTargetService$PrepareForTranslogOperationsRequestHandler.messageReceived(PeerRecoveryTargetService.java:403) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoveryTargetService$PrepareForTranslogOperationsRequestHandler.messageReceived(PeerRecoveryTargetService.java:397) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:30) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler$1.doRun(SecurityServerTransportInterceptor.java:246) ~[?:?]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler.messageReceived(SecurityServerTransportInterceptor.java:304) ~[?:?]
	at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:66) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1592) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:724) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.3.0.jar:6.3.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
	at java.lang.Thread.run(Thread.java:844) ~[?:?]
[2018-06-20T21:38:11,634][INFO ][o.e.x.m.e.l.LocalExporter] waiting for elected master node [{4E_A_7z}{4E_A_7zATUu6ebxzJFhMrg}{JxDu4xcyTWKdshEZqUgKQw}{172.19.0.2}{172.19.0.2:9300}{ml.max_open_jobs=10, ml.enabled=true}] to setup local exporter [default_local] (does it have x-pack installed?)
[2018-06-20T21:38:11,657][WARN ][o.e.i.c.IndicesClusterStateService] [uLAJsY1] [[shakespeare][3]] marking and sending shard failed due to [failed recovery]
org.elasticsearch.indices.recovery.RecoveryFailedException: [shakespeare][3]: Recovery failed from {4E_A_7z}{4E_A_7zATUu6ebxzJFhMrg}{JxDu4xcyTWKdshEZqUgKQw}{172.19.0.2}{172.19.0.2:9300}{ml.max_open_jobs=10, ml.enabled=true} into {uLAJsY1}{uLAJsY1xT5yhCUzAvNa8ag}{J4vNZ9OETdeO8pxepzmRHw}{172.19.0.3}{172.19.0.3:9300}{ml.machine_memory=33728278528, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}
	at org.elasticsearch.indices.recovery.PeerRecoveryTargetService.doRecovery(PeerRecoveryTargetService.java:282) [elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoveryTargetService.access$900(PeerRecoveryTargetService.java:80) [elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoveryTargetService$RecoveryRunner.doRun(PeerRecoveryTargetService.java:623) [elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:724) [elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.3.0.jar:6.3.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
	at java.lang.Thread.run(Thread.java:844) [?:?]
Caused by: org.elasticsearch.transport.RemoteTransportException: [4E_A_7z][172.19.0.2:9300][internal:index/shard/recovery/start_recovery]
Caused by: org.elasticsearch.index.engine.RecoveryEngineException: Phase[1] phase1 failed
	at org.elasticsearch.indices.recovery.RecoverySourceHandler.recoverToTarget(RecoverySourceHandler.java:140) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoverySourceService.recover(PeerRecoverySourceService.java:132) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoverySourceService.access$100(PeerRecoverySourceService.java:54) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoverySourceService$StartRecoveryTransportRequestHandler.messageReceived(PeerRecoverySourceService.java:141) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoverySourceService$StartRecoveryTransportRequestHandler.messageReceived(PeerRecoverySourceService.java:138) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:33) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:69) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1556) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:674) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.3.0.jar:6.3.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:?]
	at java.lang.Thread.run(Thread.java:748) ~[?:?]
Caused by: org.elasticsearch.indices.recovery.RecoverFilesRecoveryException: Failed to transfer [0] files with total size of [0b]
	at org.elasticsearch.indices.recovery.RecoverySourceHandler.phase1(RecoverySourceHandler.java:337) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.RecoverySourceHandler.recoverToTarget(RecoverySourceHandler.java:138) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoverySourceService.recover(PeerRecoverySourceService.java:132) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoverySourceService.access$100(PeerRecoverySourceService.java:54) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoverySourceService$StartRecoveryTransportRequestHandler.messageReceived(PeerRecoverySourceService.java:141) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoverySourceService$StartRecoveryTransportRequestHandler.messageReceived(PeerRecoverySourceService.java:138) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:33) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:69) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1556) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:674) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.3.0.jar:6.3.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:?]
	at java.lang.Thread.run(Thread.java:748) ~[?:?]
Caused by: org.elasticsearch.transport.RemoteTransportException: [uLAJsY1][172.19.0.3:9300][internal:index/shard/recovery/prepare_translog]
Caused by: java.lang.IllegalStateException: commit doesn't contain history uuid
	at org.elasticsearch.index.engine.InternalEngine.loadHistoryUUID(InternalEngine.java:493) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.index.engine.InternalEngine.<init>(InternalEngine.java:193) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.index.engine.InternalEngine.<init>(InternalEngine.java:157) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.index.engine.InternalEngineFactory.newReadWriteEngine(InternalEngineFactory.java:25) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.index.shard.IndexShard.newEngine(IndexShard.java:2152) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.index.shard.IndexShard.createNewEngine(IndexShard.java:2134) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.index.shard.IndexShard.innerOpenEngineAndTranslog(IndexShard.java:1341) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.index.shard.IndexShard.openEngineAndSkipTranslogRecovery(IndexShard.java:1305) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.RecoveryTarget.prepareForTranslogOperations(RecoveryTarget.java:366) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoveryTargetService$PrepareForTranslogOperationsRequestHandler.messageReceived(PeerRecoveryTargetService.java:403) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoveryTargetService$PrepareForTranslogOperationsRequestHandler.messageReceived(PeerRecoveryTargetService.java:397) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:30) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler$1.doRun(SecurityServerTransportInterceptor.java:246) ~[?:?]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler.messageReceived(SecurityServerTransportInterceptor.java:304) ~[?:?]
	at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:66) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1592) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:724) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.3.0.jar:6.3.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
	at java.lang.Thread.run(Thread.java:844) ~[?:?]
[2018-06-20T21:38:11,669][WARN ][o.e.i.c.IndicesClusterStateService] [uLAJsY1] [[.watches][0]] marking and sending shard failed due to [failed recovery]
org.elasticsearch.indices.recovery.RecoveryFailedException: [.watches][0]: Recovery failed from {4E_A_7z}{4E_A_7zATUu6ebxzJFhMrg}{JxDu4xcyTWKdshEZqUgKQw}{172.19.0.2}{172.19.0.2:9300}{ml.max_open_jobs=10, ml.enabled=true} into {uLAJsY1}{uLAJsY1xT5yhCUzAvNa8ag}{J4vNZ9OETdeO8pxepzmRHw}{172.19.0.3}{172.19.0.3:9300}{ml.machine_memory=33728278528, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}
	at org.elasticsearch.indices.recovery.PeerRecoveryTargetService.doRecovery(PeerRecoveryTargetService.java:282) [elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoveryTargetService.access$900(PeerRecoveryTargetService.java:80) [elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoveryTargetService$RecoveryRunner.doRun(PeerRecoveryTargetService.java:623) [elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:724) [elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.3.0.jar:6.3.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
	at java.lang.Thread.run(Thread.java:844) [?:?]
Caused by: org.elasticsearch.transport.RemoteTransportException: [4E_A_7z][172.19.0.2:9300][internal:index/shard/recovery/start_recovery]
Caused by: org.elasticsearch.index.engine.RecoveryEngineException: Phase[1] phase1 failed
	at org.elasticsearch.indices.recovery.RecoverySourceHandler.recoverToTarget(RecoverySourceHandler.java:140) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoverySourceService.recover(PeerRecoverySourceService.java:132) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoverySourceService.access$100(PeerRecoverySourceService.java:54) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoverySourceService$StartRecoveryTransportRequestHandler.messageReceived(PeerRecoverySourceService.java:141) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoverySourceService$StartRecoveryTransportRequestHandler.messageReceived(PeerRecoverySourceService.java:138) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:33) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:69) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1556) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:674) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.3.0.jar:6.3.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:?]
	at java.lang.Thread.run(Thread.java:748) ~[?:?]
Caused by: org.elasticsearch.indices.recovery.RecoverFilesRecoveryException: Failed to transfer [0] files with total size of [0b]
	at org.elasticsearch.indices.recovery.RecoverySourceHandler.phase1(RecoverySourceHandler.java:337) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.RecoverySourceHandler.recoverToTarget(RecoverySourceHandler.java:138) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoverySourceService.recover(PeerRecoverySourceService.java:132) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoverySourceService.access$100(PeerRecoverySourceService.java:54) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoverySourceService$StartRecoveryTransportRequestHandler.messageReceived(PeerRecoverySourceService.java:141) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoverySourceService$StartRecoveryTransportRequestHandler.messageReceived(PeerRecoverySourceService.java:138) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:33) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:69) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1556) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:674) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.3.0.jar:6.3.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:?]
	at java.lang.Thread.run(Thread.java:748) ~[?:?]
Caused by: org.elasticsearch.transport.RemoteTransportException: [uLAJsY1][172.19.0.3:9300][internal:index/shard/recovery/prepare_translog]
Caused by: java.lang.IllegalStateException: commit doesn't contain history uuid
	at org.elasticsearch.index.engine.InternalEngine.loadHistoryUUID(InternalEngine.java:493) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.index.engine.InternalEngine.<init>(InternalEngine.java:193) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.index.engine.InternalEngine.<init>(InternalEngine.java:157) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.index.engine.InternalEngineFactory.newReadWriteEngine(InternalEngineFactory.java:25) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.index.shard.IndexShard.newEngine(IndexShard.java:2152) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.index.shard.IndexShard.createNewEngine(IndexShard.java:2134) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.index.shard.IndexShard.innerOpenEngineAndTranslog(IndexShard.java:1341) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.index.shard.IndexShard.openEngineAndSkipTranslogRecovery(IndexShard.java:1305) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.RecoveryTarget.prepareForTranslogOperations(RecoveryTarget.java:366) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoveryTargetService$PrepareForTranslogOperationsRequestHandler.messageReceived(PeerRecoveryTargetService.java:403) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.indices.recovery.PeerRecoveryTargetService$PrepareForTranslogOperationsRequestHandler.messageReceived(PeerRecoveryTargetService.java:397) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:30) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler$1.doRun(SecurityServerTransportInterceptor.java:246) ~[?:?]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler.messageReceived(SecurityServerTransportInterceptor.java:304) ~[?:?]
	at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:66) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1592) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:724) ~[elasticsearch-6.3.0.jar:6.3.0]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.3.0.jar:6.3.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
	at java.lang.Thread.run(Thread.java:844) ~[?:?]
[2018-06-20T21:38:11,681][INFO ][o.e.x.m.e.l.LocalExporter] waiting for elected master node [{4E_A_7z}{4E_A_7zATUu6ebxzJFhMrg}{JxDu4xcyTWKdshEZqUgKQw}{172.19.0.2}{172.19.0.2:9300}{ml.max_open_jobs=10, ml.enabled=true}] to setup local exporter [default_local] (does it have x-pack installed?)

--- cut, Elasticsearch never seems to recover from this ---
