-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Closed
Closed
Copy link
Labels
Description
Elasticsearch version (bin/elasticsearch --version): 5.6.5
Description of the problem including expected versus actual behavior:
From this discussion: https://discuss.elastic.co/t/cant-upgrade-elasticsearch-5-2-1-to-5-6-5/115403
When a user set on an index a weird setting like: "index.routing.allocation.exclude.tag": null (was done in 5.2), when trying to restart the cluster, the master node is sending NPE:
[2018-01-14T17:36:24,392][WARN ][o.e.d.z.ZenDiscovery ] [xg-ops-elk-javaes-mgt-2] failed to validate incoming join request from node [{xg-ops-elk-javaes-mgt-3}{SQaSuQ1aS-izcNs4P9yItQ}{_Wc_mrnfS5Ghb3RjuJLiJg}{10.0.23.55}{10.0.23.55:9300}]
org.elasticsearch.transport.RemoteTransportException: [xg-ops-elk-javaes-mgt-3][10.0.23.55:9300][internal:discovery/zen/join/validate]
Caused by: java.lang.NullPointerException
at org.elasticsearch.cluster.node.DiscoveryNodeFilters.buildFromKeyValue(DiscoveryNodeFilters.java:73) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.cluster.metadata.IndexMetaData$Builder.build(IndexMetaData.java:1044) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.cluster.metadata.IndexMetaData.readFrom(IndexMetaData.java:724) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.cluster.metadata.MetaData.readFrom(MetaData.java:676) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.cluster.ClusterState.readFrom(ClusterState.java:659) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.discovery.zen.MembershipAction$ValidateJoinRequest.readFrom(MembershipAction.java:171) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.transport.TcpTransport.handleRequest(TcpTransport.java:1510) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.transport.TcpTransport.messageReceived(TcpTransport.java:1396) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:74) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) ~[?:?]
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310) ~[?:?]
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:297) ~[?:?]
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:413) ~[?:?]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:265) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) ~[?:?]
at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) ~[?:?]
at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:241) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) ~[?:?]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1334) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[?:?]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:926) ~[?:?]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:134) ~[?:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:644) ~[?:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:544) ~[?:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:498) ~[?:?]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:458) ~[?:?]
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) ~[?:?]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_73]
As this can happen, I think we should try to be a bit safer when exclude or include values are null.
The code which is failing: https://github.com/elastic/elasticsearch/blob/5.6/core/src/main/java/org/elasticsearch/cluster/node/DiscoveryNodeFilters.java#L69-L81
String[] values = Strings.tokenizeToStringArray(entry.getValue(), ",");
if (values.length > 0) {
// ...values is null in that case.