@DaveCTurner
Contributor

The PeerFinder, introduced in #32246, needs to be able to identify, and
connect to, a remote master node using only its TransportAddress. This can be
done by opening a single-channel connection to the address, performing a
handshake, and only then forming a full-blown connection to the node. This
change implements this logic.
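
The three-step flow described above can be sketched in plain Java. This is only an illustration under stated assumptions: `HandshakeSketch`, `Transport`, `Channel`, `RemoteNode` and `RecordingTransport` are hypothetical stand-ins rather than the Elasticsearch classes, and the master-eligibility check is an assumed reading of "identify a remote master node".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the flow; not the real Elasticsearch types.
final class HandshakeSketch {

    /** What the handshake is assumed to reveal about the remote node. */
    static final class RemoteNode {
        final String id;
        final boolean masterEligible;

        RemoteNode(String id, boolean masterEligible) {
            this.id = id;
            this.masterEligible = masterEligible;
        }
    }

    /** A lightweight single-channel connection whose close() throws nothing checked. */
    interface Channel extends AutoCloseable {
        @Override
        void close();
    }

    /** The three operations the flow needs (hypothetical signatures). */
    interface Transport {
        Channel openSingleChannel(String address);
        RemoteNode handshake(Channel channel);
        void connectToNode(RemoteNode node, String address);
    }

    static RemoteNode connectToRemoteMaster(Transport transport, String address) {
        // Step 1: open a cheap single-channel connection using only the address.
        try (Channel probe = transport.openSingleChannel(address)) {
            // Step 2: handshake over that channel to learn who is at the address.
            RemoteNode node = transport.handshake(probe);
            if (!node.masterEligible) {
                throw new IllegalStateException("node at " + address + " is not master-eligible");
            }
            // Step 3: only now form the full-blown connection to the identified node.
            transport.connectToNode(node, address);
            return node;
        } // the probe channel is closed here, on success and on failure
    }

    /** In-memory fake that records the order of operations, for demonstration only. */
    static final class RecordingTransport implements Transport {
        final List<String> events = new ArrayList<>();
        private final boolean remoteIsMasterEligible;

        RecordingTransport(boolean remoteIsMasterEligible) {
            this.remoteIsMasterEligible = remoteIsMasterEligible;
        }

        @Override
        public Channel openSingleChannel(String address) {
            events.add("open");
            return () -> events.add("close");
        }

        @Override
        public RemoteNode handshake(Channel channel) {
            events.add("handshake");
            return new RemoteNode("node-1", remoteIsMasterEligible);
        }

        @Override
        public void connectToNode(RemoteNode node, String address) {
            events.add("connect");
        }
    }
}
```

In this sketch the probe channel is released in every case, and a node that is not master-eligible never gets a full connection.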

@DaveCTurner added the >enhancement, v7.0.0, and :Distributed Coordination/Cluster Coordination labels on Aug 6, 2018
@DaveCTurner requested a review from @ywelsch on August 6, 2018 15:05
@elasticmachine
Collaborator

Pinging @elastic/es-distributed

@Override
protected void doRun() throws Exception {

    // TODO if transportService is already connected to this address then skip the handshaking
Contributor Author

@tbrooks8 @ywelsch I think you discussed this area recently. Arguably it might make sense for some (or more) of this functionality to move elsewhere, perhaps into TransportService itself as it currently stands, although I know that you have plans for this area, so that might change in future.

@ywelsch mentioned this pull request on Aug 6, 2018
61 tasks
Contributor

@ywelsch left a comment

A few nits, looks good otherwise.

IOUtils.closeWhileHandlingException(connection);
}

// NOMERGE better exceptions for failure cases?
Contributor

Maybe use ConnectTransportException? We use that one, for example, when the application-level handshake fails (see connectToNode).

}

void assertFailure() throws InterruptedException {
    assertTrue(completionLatch.await(1, TimeUnit.SECONDS));
Contributor

Maybe this is a little optimistic for our slow CI machines. 30 seconds?
The same applies to the other places in this class.
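
A minimal sketch of why a generous timeout costs nothing when the test passes: `CountDownLatch.await(timeout, unit)` returns as soon as the count reaches zero, so the full 30 seconds are only spent when something has actually gone wrong. `LatchTimeoutSketch` and `completesWithin` are hypothetical names used for illustration and are not part of this PR.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical helper illustrating the suggestion; not code from the PR.
final class LatchTimeoutSketch {

    /** Runs the task on another thread and waits up to timeoutSeconds for it to finish. */
    static boolean completesWithin(Runnable task, long timeoutSeconds) {
        CountDownLatch completionLatch = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            try {
                task.run();
            } finally {
                // Count down even if the task throws, so the waiter is released.
                completionLatch.countDown();
            }
        });
        worker.start();
        try {
            // Returns true immediately once the latch is released; only a task
            // that never completes makes this wait for the whole timeout.
            return completionLatch.await(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With a fast task, `completesWithin(task, 30)` returns well before the timeout, so raising the limit from 1 to 30 seconds only lengthens the failing case.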

.put(NODE_NAME_SETTING.getKey(), "node")
.put(CLUSTER_NAME_SETTING.getKey(), "local-cluster")
.build();
threadPool = new ThreadPool(settings);
Contributor

Use TestThreadPool instead?

@After
public void stopServices() {
    transportService.stop();
    threadPool.shutdown();
Contributor

Use terminate(threadPool); (this method is in ESTestCase).
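
For readers unfamiliar with the distinction, the difference between only requesting shutdown and waiting for termination can be sketched with a plain JDK `ExecutorService`. This is an analogy under stated assumptions, not the implementation of the `ESTestCase` helper: `TerminateSketch` and its `terminate` method are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical JDK-only analogue; the real helper operates on an Elasticsearch ThreadPool.
final class TerminateSketch {

    /** Shuts the executor down and waits for its threads to finish; true if it terminated. */
    static boolean terminate(ExecutorService executor, long timeout, TimeUnit unit) {
        // shutdown() alone only stops new submissions; worker threads may still be running.
        executor.shutdown();
        try {
            if (executor.awaitTermination(timeout, unit)) {
                return true;
            }
            // Escalate: interrupt running tasks, then wait once more.
            executor.shutdownNow();
            return executor.awaitTermination(timeout, unit);
        } catch (InterruptedException e) {
            executor.shutdownNow();
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Waiting for termination in teardown means threads from one test are gone before the next test starts.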

@DaveCTurner
Contributor Author

The failure is #32215, not this PR. @elasticmachine retest this please.

@DaveCTurner merged commit 289e34a into elastic:zen2 on Aug 7, 2018
@DaveCTurner deleted the 2018-08-06-HandshakingTransportAddressConnector branch on August 7, 2018 12:34
Labels

:Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. >enhancement v7.0.0-beta1

4 participants