-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Lazy resolve unicast hosts #21630
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lazy resolve unicast hosts #21630
Conversation
88368c0 to
4ab68db
Compare
Today we eagerly resolve unicast hosts. This means that if DNS changes, we will never find the host at the new address. Moreover, a single host failng to resolve causes startup to abort. This commit introduces lazy resolution of unicast hosts. If a DNS entry changes, there is an opportunity for the host to be discovered. Note that under the Java security manager, there is a default positive cache of infinity for resolved hosts; this means that if a user does want to operate in an environment where DNS can change, they must adjust networkaddress.cache.ttl in their security policy. And if a host fails to resolve, we warn log the hostname but continue pinging other configured hosts.
e3dc063 to
acdc4bc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might decide this is okay, but one question is whether these lookups should be throttled at all as right now it is just spamming the generic thread pool.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder whether we should use unicastConnectExecutor for this and keep it contained (and throttled).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason I went with this approach is since this method is also used by the file-based unicast hosts provider so it would need its own executor and we do not have a close method for plugins to allow this executor be shutdown.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually plugins can implement Closeable and they will be closed when the node shuts down.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I was looking for it on the Plugins base class, but I see we do close a plugin if it implements java.io.Closeable. I think this should be made more clear and opened #21669.
acdc4bc to
f2d6f13
Compare
When doing DNS resolutions for unicast hostnames, we wait until the DNS lookups timeout. This appears to be forty-five seconds on modern JVMs, and it is not configurable. If we do these serially, the cluster can be blocked during ping for a lengthy period of time. This commit introduces doing the DNS lookups in parallel on the generic thread pool, and adds a user-configurable timeout for these lookups.
f2d6f13 to
cc7d53e
Compare
This commit properly sets up test resources for UnicastZenPingTests.
bleskes
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked at the code and left some minor comments. I have some concerns with the approach because it now delays ping to the extent of the host resolve timeouts, regardless of the incoming ping timeout. I wonder if we should use something like SingleObjectCache to do this refreshes in the background.
| private final Map<Integer, SendPingsHandler> receivedResponses = newConcurrentMap(); | ||
|
|
||
| // a list of temporal responses a node will return for a request (holds requests from other configuredTargetNodes) | ||
| // a list of temporal responses a node will return for a request (holds requests from other configuredHosts) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: it's just other nodes - we are pinged by any node in the previous cluster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed 4868321.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder whether we should use unicastConnectExecutor for this and keep it contained (and throttled).
| logger.warn(message, e); | ||
| } | ||
| } else { | ||
| logger.warn("timed out resolving host [{}]", hostname); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can we add the timeout value here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed f6203d7.
This was just a silly oops, this commit fixes the swallowing of an interrupted exception in the file-based unicast hosts provider.
This commit adds the resolve timeout to the warn log message displayed when we timeout resolving unicast hosts.
This commit fixes an incorrect comment in UnicastZenPing; the comment incorrectly specified which responses is held by the temporal responses.
The timeout can be set low (and maybe the default timeout should be set lower). The DNS entries are cached indefinitely (by the JVM) unless someone changes the security settings, so this is only an issue for users that have changing DNS entries and have changed the DNS cache TTL.
I'm not sure, if someone loses their cluster because DNS changes now they have to wait for the cache TTL to expire and for a background refresh to complete before their cluster can reform. |
* master: (42 commits) Add support for merging custom meta data in tribe node (elastic#21552) [DOCS] Show EC2's auto attribute (elastic#21474) Add information about the removal of store throttling to the migration guide. Add a recommendation against large documents to the docs. (elastic#21652) Add indices options tests to search api REST tests (elastic#21701) Fixing indentation in geospatial querying example. (elastic#21682) Fix typo in filters aggregation docs (elastic#21690) Add BWC layer for Exceptions (elastic#21694) Add checkstyle rule to forbid empty javadoc comments (elastic#20881) Docs: Added offline install link for discovery-file plugin remove pointless catch exception in TransportSearchAction (elastic#21689) Rename ClusterState#lookupPrototypeSafe to `lookupPrototype` and remove previous "unsafe" unused variant (elastic#21686) Use a buffer to do character to byte conversion in StreamOutput#writeString (elastic#21680) Fix integer overflows when dealing with templates. (elastic#21628) Fix highlighting on a stored keyword field (elastic#21645) Set execute permissions for native plugin programs (elastic#21657) adjust visibility of DiscoveryNodes.Delta constructor Remove unused DiscoveryNodes.Delta constructor Remove unused DiscoveryNode#removeDeadMembers public method Remove minNodeVersion and corresponding public `getSmallestVersion` getter method from DiscoveryNodes ...
This commit refactors the resolution of unicast hosts from hostnames to IP addresses to use an executor service instead of the generic thread pool.
d0b2750 to
a28dc6e
Compare
bleskes
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I left q regarding a potential simplification. I also really like the test change.
For posterity - the reason we opted to go with resolve-on-ping rather and than resolve-in-thebackground is that in FileDiscoveryPlugin we also do dns resolve on ping. Changing that will need a big change but without it we felt there is no need to go on a heroic effort to not delays pings on a slow dns. We may revisit in the future.
| for (TransportAddress address : addresses) { | ||
| discoveryNodes.add(new DiscoveryNode(idGenerator.get(), address, emptyMap(), emptySet(), | ||
| Version.CURRENT.minimumCompatibilityVersion())); | ||
| public static List<DiscoveryNode> resolveDiscoveryNodes( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can get rid of the ResolvedHostname abstraction - what am I missing?
diff --git a/core/src/main/java/org/elasticsearch/discovery/zen/UnicastZenPing.java b/core/src/main/java/org/elasticsearch/discovery/zen/UnicastZenPing.java
index 61bf1cc..3d3495e 100644
--- a/core/src/main/java/org/elasticsearch/discovery/zen/UnicastZenPing.java
+++ b/core/src/main/java/org/elasticsearch/discovery/zen/UnicastZenPing.java
@@ -242,40 +242,36 @@ public class UnicastZenPing extends AbstractComponent implements ZenPing {
throw new IllegalArgumentException("resolve timeout must be non-negative but was [" + resolveTimeout + "]");
}
// create tasks to submit to the executor service; we will wait up to resolveTimeout for these tasks to complete
- final List<Callable<ResolvedHostname>> callables =
+ final List<Callable<TransportAddress[]>> callables =
hosts.stream().map(hn -> lookup(hn, transportService, limitPortCounts)).collect(Collectors.toList());
- final List<Future<ResolvedHostname>> futures =
+ final List<Future<TransportAddress[]>> futures =
executorService.invokeAll(callables, resolveTimeout.nanos(), TimeUnit.NANOSECONDS);
final List<DiscoveryNode> discoveryNodes = new ArrayList<>();
// ExecutorService#invokeAll guarantees that the futures are returned in the iteration order of the tasks so we can associate the
// hostname with the corresponding task by iterating together
final Iterator<String> it = hosts.iterator();
- for (final Future<ResolvedHostname> future : futures) {
+ for (final Future<TransportAddress[]> future : futures) {
final String hostname = it.next();
- if (!future.isCancelled()) {
+ if (future.isCancelled()) {
+ logger.warn("timed out after [{}] resolving host [{}]", resolveTimeout, hostname);
+ } else {
+ assert future.isDone(); // guaranteed by the invokeAll
try {
- final ResolvedHostname resolvedHostname = future.get();
- if (resolvedHostname.isSuccess()) {
- logger.trace("resolved host [{}] to {}", hostname, resolvedHostname.addresses());
- for (final TransportAddress address : resolvedHostname.addresses()) {
- discoveryNodes.add(
- new DiscoveryNode(
- idGenerator.get(),
- address,
- emptyMap(),
- emptySet(),
- Version.CURRENT.minimumCompatibilityVersion()));
- }
- } else {
- final String message = "failed to resolve host [" + hostname + "]";
- logger.warn(message, resolvedHostname.failure());
+ final TransportAddress[] addresses = future.get();
+ logger.trace("resolved host [{}] to {}", hostname, addresses);
+ for (final TransportAddress address : addresses) {
+ discoveryNodes.add(
+ new DiscoveryNode(
+ idGenerator.get(),
+ address,
+ emptyMap(),
+ emptySet(),
+ Version.CURRENT.minimumCompatibilityVersion()));
}
} catch (final ExecutionException e) {
final String message = "failed to resolve host [" + hostname + "]";
logger.warn(message, e);
}
- } else {
- logger.warn("timed out after [{}] resolving host [{}]", resolveTimeout, hostname);
}
}
return discoveryNodes;
@@ -289,17 +285,11 @@ public class UnicastZenPing extends AbstractComponent implements ZenPing {
* @param limitPortCounts the port count limit
* @return a callable that can be used to submit to an executor service
*/
- private static Callable<ResolvedHostname> lookup(
+ private static Callable<TransportAddress[]> lookup(
final String host,
final TransportService transportService,
final int limitPortCounts) {
- return () -> {
- try {
- return ResolvedHostname.success(transportService.addressesFromString(host, limitPortCounts));
- } catch (final UnknownHostException e) {
- return ResolvedHostname.failure(e);
- }
- };
+ return () -> transportService.addressesFromString(host, limitPortCounts);
}
@Override
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I pushed e2fc5a2.
|
|
||
| @Override | ||
| public TransportAddress[] addressesFromString(String address, int perAddressLimit) throws Exception { | ||
| public TransportAddress[] addressesFromString(String address, int perAddressLimit) throws UnknownHostException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
| .put("cluster.name", "mismatch") | ||
| .put(TransportSettings.PORT.getKey(), startPort + "-" + endPort).build(); | ||
| // use ephemeral ports | ||
| final Settings settings = Settings.builder().put("cluster.name", "test").put(TransportSettings.PORT.getKey(), 0).build(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
| } | ||
|
|
||
| @Override | ||
| public void ping(final PingListener listener, final TimeValue duration) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we add some docs about what falls under duration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed 156a49b.
This commit simplifies the handling of hostname lookups in UnicastZenPing, removing an unnecessary abstraction and, subsequently, an unnecessary method.
This commit sets the deafult resolve timeout on DNS lookups in UnicastZenPing to five seconds.
This commit adds Javadocs for UnicastZenPing#ping, especially clarifying the purpose of the duration parameter.
Today we eagerly resolve unicast hosts. This means that if DNS changes, we will never find the host at the new address. Moreover, a single host failng to resolve causes startup to abort. This commit introduces lazy resolution of unicast hosts. If a DNS entry changes, there is an opportunity for the host to be discovered. Note that under the Java security manager, there is a default positive cache of infinity for resolved hosts; this means that if a user does want to operate in an environment where DNS can change, they must adjust networkaddress.cache.ttl in their security policy. And if a host fails to resolve, we warn log the hostname but continue pinging other configured hosts. When doing DNS resolutions for unicast hostnames, we wait until the DNS lookups timeout. This appears to be forty-five seconds on modern JVMs, and it is not configurable. If we do these serially, the cluster can be blocked during ping for a lengthy period of time. This commit introduces doing the DNS lookups in parallel, and adds a user-configurable timeout for these lookups. Relates #21630
|
Thanks @bleskes. |
Today we eagerly resolve unicast hosts. This means that if DNS changes,
we will never find the host at the new address. Moreover, a single host
failng to resolve causes startup to abort. This commit introduces lazy
resolution of unicast hosts. If a DNS entry changes, there is an
opportunity for the host to be discovered. Note that under the Java
security manager, there is a default positive cache of infinity for
resolved hosts; this means that if a user does want to operate in an
environment where DNS can change, they must adjust
networkaddress.cache.ttl in their security policy. And if a host fails
to resolve, we warn log the hostname but continue pinging other
configured hosts.
When doing DNS resolutions for unicast hostnames, we wait until the DNS
lookups timeout. This appears to be forty-five seconds on modern JVMs,
and it is not configurable. If we do these serially, the cluster can be
blocked during ping for a lengthy period of time. This commit introduces
doing the DNS lookups in parallel on the generic thread pool, and adds a
user-configurable timeout for these lookups.
Closes #14441, closes #16412