Recover from DNS outage on startup

If you start an Elasticsearch node while DNS is failing, the node never recovers: it keeps logging exceptions even after the DNS problem is fixed. The reason is the following code in the `UnicastZenPing` constructor:

```
        for (String host : hosts) {
            try {
                TransportAddress[] addresses = transportService.addressesFromString(host);
                // we only limit to 1 addresses, makes no sense to ping 100 ports
                for (int i = 0; (i < addresses.length && i < LIMIT_PORTS_COUNT); i++) {
                    configuredTargetNodes.add(new DiscoveryNode(UNICAST_NODE_PREFIX + unicastNodeIdGenerator.incrementAndGet() + "#", addresses[i], version.minimumCompatibilityVersion()));
                }
            } catch (Exception e) {
                throw new ElasticsearchIllegalArgumentException("Failed to resolve address for [" + host + "]", e);
            }
        }
        this.configuredTargetNodes = configuredTargetNodes.toArray(new DiscoveryNode[configuredTargetNodes.size()]);
```

`transportService.addressesFromString(host)` creates an `InetSocketAddress`, which tries to resolve the given hostname at construction time. If that lookup fails, the address stays unresolved (`InetSocketAddress.isUnresolved()` returns `true`) for the lifetime of the object. Netty uses this check to decide whether connecting to the endpoint makes sense at all.
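
The caching behavior can be illustrated outside Elasticsearch with a minimal sketch. The class name `UnresolvedAddressDemo` and the helper `reResolve` are made up for this example; `createUnresolved` stands in for a lookup that failed during the outage, and an IP literal is used so the second lookup needs no DNS.

```java
import java.net.InetSocketAddress;

class UnresolvedAddressDemo {
    // Builds a fresh address for the same host and port.
    // The constructor attempts resolution again, unlike the existing object.
    static InetSocketAddress reResolve(InetSocketAddress address) {
        return new InetSocketAddress(address.getHostString(), address.getPort());
    }

    public static void main(String[] args) {
        // createUnresolved skips the lookup, simulating an address created while DNS was down.
        InetSocketAddress stale = InetSocketAddress.createUnresolved("127.0.0.1", 9300);
        System.out.println("stale unresolved: " + stale.isUnresolved());

        // The stale object never changes state; only a new object performs a new lookup.
        InetSocketAddress fresh = reResolve(stale);
        System.out.println("fresh unresolved: " + fresh.isUnresolved());
    }
}
```

The stale object reports itself as unresolved no matter how long you wait, while the freshly constructed one is resolved. This is why holding on to the addresses created in the constructor prevents recovery.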
### How to reproduce locally

To reproduce, use the config below and disable networking on your system. With networking enabled the node starts normally, because `localhost.spinscale.de` resolves to 127.0.0.1.

```
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["localhost.spinscale.de:9300" ]
```
### Fix proposal
1. First, remove the exception output: catch `UnresolvedAddressException` in `UnicastZenPing.sendPings()` and log a single line that describes the problem and includes the hostname.
2. Make sure the `InetSocketAddress` and its unresolved state are not cached. I am not sure what the best approach is: either create the `InetSocketAddress` object before each connection attempt, or there may be configurable properties for this.
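
A rough sketch of how the two steps could fit together, not the actual Elasticsearch code. `ReResolvingPinger`, `tryPing`, `connect`, and the `warnings` list are hypothetical names; the address factory is injected so that a DNS outage followed by recovery can be simulated without touching the network.

```java
import java.net.InetSocketAddress;
import java.nio.channels.UnresolvedAddressException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

class ReResolvingPinger {
    private final String target;
    // Called before every attempt, so no unresolved address is kept around.
    private final Supplier<InetSocketAddress> addressFactory;
    // Stand-in for a logger: one short line per failed attempt.
    final List<String> warnings = new ArrayList<>();

    ReResolvingPinger(String target, Supplier<InetSocketAddress> addressFactory) {
        this.target = target;
        this.addressFactory = addressFactory;
    }

    boolean tryPing() {
        try {
            connect(addressFactory.get());
            return true;
        } catch (UnresolvedAddressException e) {
            // Step 1: a single line with the hostname instead of a stack trace.
            warnings.add("Failed to resolve address for [" + target + "]");
            return false;
        }
    }

    // Placeholder for the real connect call; it only mirrors the unresolved check.
    private static void connect(InetSocketAddress address) {
        if (address.isUnresolved()) {
            throw new UnresolvedAddressException();
        }
    }

    public static void main(String[] args) {
        // Simulated outage: the first lookup fails, the second succeeds.
        Iterator<InetSocketAddress> lookups = Arrays.asList(
                InetSocketAddress.createUnresolved("localhost.spinscale.de", 9300),
                new InetSocketAddress("127.0.0.1", 9300)).iterator();
        ReResolvingPinger pinger = new ReResolvingPinger("localhost.spinscale.de:9300", lookups::next);
        System.out.println(pinger.tryPing());
        System.out.println(pinger.tryPing());
        System.out.println(pinger.warnings);
    }
}
```

In a real node the factory would be something like `() -> new InetSocketAddress(host, port)`, which performs a new lookup on every call. Whether JVM-level DNS caching also needs tuning is left open here, as in the proposal.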

