Fix failures in SessionFactoryLoadBalancingTests #39154

jaymode · 2019-02-19T21:56:28Z

This change aims to fix failures in the session factory load balancing
tests that mock failure scenarios. For these tests, we randomly shut
down LDAP servers and bind a client socket to the port they were
listening on. Unfortunately, these tests would occasionally fail
because a socket was already in use and/or because the port we
connected to was not the one we expected and in fact belonged to one
of the LDAP instances that should have been shut down.

The failures are caused by the behavior of certain operating systems
when it comes to binding ports and wildcard addresses. It is possible
for a separate application to be bound to a wildcard address and still
allow our code to bind to that port on a specific address. So when we
close the server socket and open the client socket, we are still able
to establish a connection since the other application is already
listening on that port on a wildcard address. Another variant is that
the OS will allow a wildcard bind of a server socket when there is
already an application listening on that port for a specific address.
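
To make the wildcard behavior above concrete, here is a minimal probe, not
code from this change. The class and method names are hypothetical. It
holds a port on the wildcard address and then tries to bind the same port
on the loopback address. The result depends on the operating system and
its socket options, so no particular output should be expected.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical helper for illustration only; not part of the test suite.
public class WildcardBindProbe {

    /**
     * Binds a listener to the wildcard address on an ephemeral port, then
     * tries to bind a second listener to the same port on the loopback
     * address. Returns true if the OS allowed the second bind.
     */
    static boolean canBindSpecificWhileWildcardHeld() throws IOException {
        try (ServerSocket wildcard = new ServerSocket()) {
            wildcard.setReuseAddress(false);
            wildcard.bind(new InetSocketAddress(0)); // wildcard address, OS-chosen port
            int port = wildcard.getLocalPort();
            try (ServerSocket specific = new ServerSocket()) {
                specific.setReuseAddress(false);
                specific.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), port));
                return true; // the OS let both binds coexist
            } catch (IOException e) {
                return false; // the OS rejected the overlapping bind
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("specific bind while wildcard is held: "
            + canBindSpecificWhileWildcardHeld());
    }
}
```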

In order to do our best to prevent failures in these scenarios, this
change does the following:

1. Binds a client socket to all addresses in an awaitBusy
2. Adds assumption that we could bind all valid addresses
3. In the case that we still establish a connection to an address that
   we should not be able to, try to bind and expect a failure of not
   being connected
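
The first two steps can be sketched roughly as follows. This is an
illustration under assumptions, not the code in this PR: the class and
method names are hypothetical, and a plain retry loop stands in for
awaitBusy. A client socket that is bound but neither listening nor
connected occupies the port, so connection attempts to it should be
refused. An empty result is where a test would skip via an assumption
rather than fail.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch; names and retry parameters are assumptions.
public class BindAllAddressesSketch {

    /**
     * Binds one unconnected client socket per address on the given port.
     * Returns Optional.empty() if any address could not be bound after
     * retrying; sockets bound so far are closed in that case.
     */
    static Optional<List<Socket>> bindAll(List<InetAddress> addresses, int port,
                                          int maxAttempts, long sleepMillis)
            throws InterruptedException {
        List<Socket> bound = new ArrayList<>();
        for (InetAddress address : addresses) {
            Socket socket = bindWithRetry(address, port, maxAttempts, sleepMillis);
            if (socket == null) {
                closeQuietly(bound);
                return Optional.empty();
            }
            bound.add(socket);
        }
        return Optional.of(bound);
    }

    /** Retries the bind, since the port may still be held briefly after the server closes. */
    static Socket bindWithRetry(InetAddress address, int port,
                                int maxAttempts, long sleepMillis)
            throws InterruptedException {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            Socket socket = new Socket();
            try {
                socket.bind(new InetSocketAddress(address, port));
                return socket;
            } catch (IOException e) {
                closeQuietly(socket);
                Thread.sleep(sleepMillis);
            }
        }
        return null;
    }

    static void closeQuietly(Socket socket) {
        try {
            socket.close();
        } catch (IOException ignored) {
            // nothing useful to do when close fails in a sketch
        }
    }

    static void closeQuietly(List<Socket> sockets) {
        for (Socket socket : sockets) {
            closeQuietly(socket);
        }
    }
}
```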

Closes #32190


elasticmachine · 2019-02-19T21:56:30Z

Pinging @elastic/es-security

bizybot

Overall LGTM, should do the trick. I have a few comments and suggestions. Thank you for tackling this.

bizybot · 2019-02-19T23:54:25Z

...security/src/test/java/org/elasticsearch/xpack/security/authc/ldap/support/LdapTestCase.java

                SSLServerSocketFactory serverSocketFactory = context.getServerSocketFactory();
                SSLSocketFactory clientSocketFactory = context.getSocketFactory();
-                listeners.add(InMemoryListenerConfig.createLDAPSConfig("ldaps", null, 0, serverSocketFactory, clientSocketFactory));
+                listeners.add(InMemoryListenerConfig.createLDAPSConfig("ldaps", InetAddress.getLoopbackAddress(), 0,


Is there any particular reason not to listen on all addresses on all interfaces? We are explicitly passing the loopback address here.

bizybot · 2019-02-20T01:00:23Z

...va/org/elasticsearch/xpack/security/authc/ldap/support/SessionFactoryLoadBalancingTests.java

        final CountDownLatch closeLatch = new CountDownLatch(1);
        try {
+            final AtomicBoolean success = new AtomicBoolean(true);
+            final List<Socket> openMockSockets = Collections.synchronizedList(new ArrayList<>());


I think you wanted to use this later on to close the sockets at the end of the test. Currently it is only used to collect them.

We don't need this, so I removed it

bizybot · 2019-02-20T01:01:58Z

...va/org/elasticsearch/xpack/security/authc/ldap/support/SessionFactoryLoadBalancingTests.java

+        final List<Thread> listenThreads = new ArrayList<>();
+        final CountDownLatch latch = new CountDownLatch(ldapServersToKill.size());
+        final CountDownLatch closeLatch = new CountDownLatch(1);
+        final List<Socket> openMockSockets = Collections.synchronizedList(new ArrayList<>());


Same here; this can be used at the end of the test to close the open sockets.

bizybot · 2019-02-20T01:22:57Z

...va/org/elasticsearch/xpack/security/authc/ldap/support/SessionFactoryLoadBalancingTests.java

                // of the ldap server and the opening of the socket
-                logger.debug("opening mock server socket listening on [{}]", port);
+                logger.debug("opening mock client sockets bound to [{}]", port);
                Runnable runnable = () -> {


Maybe extract this runnable code into a class, PortBlocker or some other name?
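
One possible shape for such a class, as a rough sketch only. The
constructor parameters and field names are assumptions modeled on the
latches visible in the diff above, not the code that was merged. The
runnable binds an unconnected client socket, signals that the bind
attempt has finished, and holds the port until the test releases it.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical extraction of the runnable; all names are illustrative.
public class PortBlocker implements Runnable {

    private final InetAddress address;
    private final int port;
    private final CountDownLatch boundLatch;  // counted down once the bind attempt finishes
    private final CountDownLatch closeLatch;  // released by the test when the port may be freed
    private final AtomicBoolean success;      // set to false if the bind fails

    public PortBlocker(InetAddress address, int port, CountDownLatch boundLatch,
                       CountDownLatch closeLatch, AtomicBoolean success) {
        this.address = address;
        this.port = port;
        this.boundLatch = boundLatch;
        this.closeLatch = closeLatch;
        this.success = success;
    }

    @Override
    public void run() {
        try (Socket socket = new Socket()) {
            try {
                socket.bind(new InetSocketAddress(address, port));
            } catch (IOException e) {
                success.set(false);
            } finally {
                // Count down exactly once so the waiting test never hangs.
                boundLatch.countDown();
            }
            // Hold the port (if bound) until the test says it can be released.
            closeLatch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } catch (IOException e) {
            // Failure while closing the socket; nothing more to do in a sketch.
        }
    }
}
```

The test thread would wait on the bound latch, check the success flag,
run its assertions, and then count down the close latch to free the port.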


jaymode added the >test, v7.0.0, :Security/Authentication, v6.7.0, v8.0.0, and v7.2.0 labels Feb 19, 2019

jaymode requested a review from bizybot February 19, 2019 21:56

bizybot reviewed Feb 20, 2019


address review feedback

03cd8f9

jaymode merged commit bab7a4b into elastic:master Feb 20, 2019

jaymode deleted the load_balance_tests_fix branch February 20, 2019 18:37

michaelbaamonde added v7.0.0-rc1 and removed v7.0.0 labels Mar 25, 2019

jakelandis added v8.0.0-alpha1 and removed v8.0.0 labels Jul 26, 2021
