Node fails to join cluster after upgrade 6.5.4 -> 6.7.0 #40784

@ItamarBenjamin

Description

Note: this does not seem to be related to #40565.

Elasticsearch version (bin/elasticsearch --version): the cluster runs 6.5.4; a single node was upgraded to 6.7.0. All nodes use the Docker builds.

Plugins installed: none

JVM version (java -version): the JVM provided by the Docker image

OS version (uname -a if on a Unix-like system): 4.15.0-33-generic #36~16.04.1-Ubuntu

Description of the problem including expected versus actual behavior:

I upgraded a single node on each of 5 different clusters, and all 5 clusters hit the same issue: the upgraded node was not able to rejoin the cluster, and the cluster remained yellow. To fix it I had to upgrade the other nodes to 6.7.0 as well, which caused downtime.

Steps to reproduce:
Run a 3-node 6.5.4 cluster on Docker, then upgrade a single node to 6.7.0.
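
To confirm that the cluster really is in the mixed-version state described in the steps above, one option is to group nodes by version. The following is only a minimal sketch: it assumes you have already fetched the plain-text output of `GET _cat/nodes?h=name,version` (one `name version` pair per line), and the helper name `nodes_by_version` and the node names in the usage line are hypothetical, not taken from the affected clusters.

```python
from collections import defaultdict


def nodes_by_version(cat_nodes_text):
    """Group node names by version, given lines of the form 'name version'."""
    groups = defaultdict(list)
    for line in cat_nodes_text.splitlines():
        # Split on the last run of whitespace so the version is always the final column.
        parts = line.rsplit(None, 1)
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        name, version = parts
        groups[version].append(name.strip())
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical sample output, for illustration only.
    sample = "node-a 6.5.4\nnode-b 6.5.4\nnode-c 6.7.0\n"
    for version, names in sorted(nodes_by_version(sample).items()):
        print(version, names)
```

More than one key in the result means the cluster is running mixed versions, which is the situation in which the problem was observed.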

Provide logs (if relevant):
[2019-04-03T06:30:56,549][WARN ][o.e.d.z.PublishClusterStateAction] [fmyinf7012] publishing cluster state with version [29318] failed for the following nodes: [[{fmyinf7011}{ejFPzwZHRxqk4CNBtF66iA}{x1NPhmVuQcucbPX3VeEgPg}{10.96.87.25}{10.96.87.25:9300}{xpack.installed=true}]]
[2019-04-03T06:30:56,563][INFO ][o.e.c.s.MasterService ] [fmyinf7012] zen-disco-node-failed({fmyinf7011}{ejFPzwZHRxqk4CNBtF66iA}{x1NPhmVuQcucbPX3VeEgPg}{10.96.87.25}{10.96.87.25:9300}{xpack.installed=true}), reason(transport disconnected)[{fmyinf7011}{ejFPzwZHRxqk4CNBtF66iA}{x1NPhmVuQcucbPX3VeEgPg}{10.96.87.25}{10.96.87.25:9300}{xpack.installed=true} transport disconnected, {fmyinf7011}{ejFPzwZHRxqk4CNBtF66iA}{x1NPhmVuQcucbPX3VeEgPg}{10.96.87.25}{10.96.87.25:9300}{xpack.installed=true} transport disconnected], reason: removed {{fmyinf7011}{ejFPzwZHRxqk4CNBtF66iA}{x1NPhmVuQcucbPX3VeEgPg}{10.96.87.25}{10.96.87.25:9300}{xpack.installed=true},}
[2019-04-03T06:30:56,569][INFO ][o.e.c.s.ClusterApplierService] [fmyinf7012] removed {{fmyinf7011}{ejFPzwZHRxqk4CNBtF66iA}{x1NPhmVuQcucbPX3VeEgPg}{10.96.87.25}{10.96.87.25:9300}{xpack.installed=true},}, reason: apply cluster state (from master [master {fmyinf7012}{K0Od0eOBTp6c8PRggjBpuw}{_1KFcV3zRryhiuGz7cCQwQ}{10.96.87.26}{10.96.87.26:9300}{xpack.installed=true} committed version [29319] source [zen-disco-node-failed({fmyinf7011}{ejFPzwZHRxqk4CNBtF66iA}{x1NPhmVuQcucbPX3VeEgPg}{10.96.87.25}{10.96.87.25:9300}{xpack.installed=true}), reason(transport disconnected)[{fmyinf7011}{ejFPzwZHRxqk4CNBtF66iA}{x1NPhmVuQcucbPX3VeEgPg}{10.96.87.25}{10.96.87.25:9300}{xpack.installed=true} transport disconnected, {fmyinf7011}{ejFPzwZHRxqk4CNBtF66iA}{x1NPhmVuQcucbPX3VeEgPg}{10.96.87.25}{10.96.87.25:9300}{xpack.installed=true} transport disconnected]]])
[2019-04-03T06:31:00,580][INFO ][o.e.c.s.MasterService ] [fmyinf7012] zen-disco-node-join[{fmyinf7011}{ejFPzwZHRxqk4CNBtF66iA}{x1NPhmVuQcucbPX3VeEgPg}{10.96.87.25}{10.96.87.25:9300}{xpack.installed=true}], reason: added {{fmyinf7011}{ejFPzwZHRxqk4CNBtF66iA}{x1NPhmVuQcucbPX3VeEgPg}{10.96.87.25}{10.96.87.25:9300}{xpack.installed=true},}
[2019-04-03T06:31:00,635][INFO ][o.e.c.s.ClusterApplierService] [fmyinf7012] added {{fmyinf7011}{ejFPzwZHRxqk4CNBtF66iA}{x1NPhmVuQcucbPX3VeEgPg}{10.96.87.25}{10.96.87.25:9300}{xpack.installed=true},}, reason: apply cluster state (from master [master {fmyinf7012}{K0Od0eOBTp6c8PRggjBpuw}{_1KFcV3zRryhiuGz7cCQwQ}{10.96.87.26}{10.96.87.26:9300}{xpack.installed=true} committed version [29320] source [zen-disco-node-join[{fmyinf7011}{ejFPzwZHRxqk4CNBtF66iA}{x1NPhmVuQcucbPX3VeEgPg}{10.96.87.25}{10.96.87.25:9300}{xpack.installed=true}]]])
[2019-04-03T06:31:06,665][WARN ][o.e.t.n.Netty4Transport ] [fmyinf7012] exception caught on transport layer [NettyTcpChannel{localAddress=/10.96.87.26:58004, remoteAddress=10.96.87.25/10.96.87.25:9300}], closing connection
java.lang.IllegalStateException: Message not fully read (response) for requestId [55772288], handler [org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler/org.elasticsearch.action.support.nodes.TransportNodesAction$AsyncAction$1@58fac40a], error [false]; resetting
    at org.elasticsearch.transport.TcpTransport.messageReceived(TcpTransport.java:1197) ~[elasticsearch-6.5.4.jar:6.5.4]
    at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:65) ~[transport-netty4-client-6.5.4.jar:6.5.4]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.30.Final.jar:4.1.30.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.30.Final.jar:4.1.30.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.30.Final.jar:4.1.30.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:323) [netty-codec-4.1.30.Final.jar:4.1.30.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310) [netty-codec-4.1.30.Final.jar:4.1.30.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:426) [netty-codec-4.1.30.Final.jar:4.1.30.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:278) [netty-codec-4.1.30.Final.jar:4.1.30.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.30.Final.jar:4.1.30.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.30.Final.jar:4.1.30.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.30.Final.jar:4.1.30.Final]
    at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:241) [netty-handler-4.1.30.Final.jar:4.1.30.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.30.Final.jar:4.1.30.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.30.Final.jar:4.1.30.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.30.Final.jar:4.1.30.Final]
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1434) [netty-transport-4.1.30.Final.jar:4.1.30.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.30.Final.jar:4.1.30.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.30.Final.jar:4.1.30.Final]
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:965) [netty-transport-4.1.30.Final.jar:4.1.30.Final]
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) [netty-transport-4.1.30.Final.jar:4.1.30.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:644) [netty-transport-4.1.30.Final.jar:4.1.30.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:544) [netty-transport-4.1.30.Final.jar:4.1.30.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:498) [netty-transport-4.1.30.Final.jar:4.1.30.Final]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:458) [netty-transport-4.1.30.Final.jar:4.1.30.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:897) [netty-common-4.1.30.Final.jar:4.1.30.Final]
    at java.lang.Thread.run(Thread.java:834) [?:?]
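
The log above is dense, so a short timeline of which logger reported what, and when, can make the remove/rejoin/disconnect sequence easier to see. Below is a minimal sketch, not an official tool: it assumes the `[timestamp][LEVEL][logger] [node]` header layout seen in this excerpt, and the function names `split_entries` and `timeline` are hypothetical. It matches entry headers with a regular expression rather than splitting on newlines, so it also works when entries have been pasted as one long line.

```python
import re

# Header layout assumed from the excerpt: [timestamp][LEVEL][logger] [node]
ENTRY_HEADER = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2},\d{3})\]"
    r"\[(?P<level>[A-Z]+)\s*\]"
    r"\[(?P<logger>[^\]]+?)\s*\]\s*"
    r"\[(?P<node>[^\]]+)\]"
)


def split_entries(text):
    """Split raw log text into entries, one dict per header found."""
    headers = list(ENTRY_HEADER.finditer(text))
    entries = []
    for i, match in enumerate(headers):
        # An entry's message runs until the next header (or the end of the text).
        end = headers[i + 1].start() if i + 1 < len(headers) else len(text)
        entry = match.groupdict()
        # Collapse whitespace so wrapped messages and stack traces become one line.
        entry["message"] = " ".join(text[match.end():end].split())
        entries.append(entry)
    return entries


def timeline(text, width=40):
    """Return (timestamp, logger, start of message) for each entry."""
    return [(e["ts"], e["logger"], e["message"][:width]) for e in split_entries(text)]


if __name__ == "__main__":
    import sys

    # Usage: python timeline.py < elasticsearch.log
    for ts, logger, head in timeline(sys.stdin.read()):
        print(ts, logger, head)
```

Run against the excerpt, this should list the six entries in order, from the failed cluster state publication at 06:30:56 to the `Netty4Transport` exception at 06:31:06.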
