Skip to content

Conversation

@cbuescher
Copy link
Member

@cbuescher cbuescher commented Mar 23, 2022

Currently this qa module runs integration tests that cover cross-cluster search
with both the local and the remote cluster on the same version. In order to also
cover cross-cluster communication across multiple versions, this change adds
additional test tasks that also start a remote cluster in the previous minor
version and run the same tests against this configuration.

Relates #84481

Currently this qa module runs integration tests that cover cross-cluster search
with both the local and the remote cluster on the same version. In order to also
cover cross-cluster communication across multiple versions, this change adds
additional test tasks that also start a remote cluster in the precious minor
version and run the same tests against this configuration.

Relates elastic#84481
@cbuescher cbuescher added >test Issues or PRs that are addressing/adding tests :Search/Search Search-related issues that do not fall into other categories v8.2.0 labels Mar 23, 2022
@cbuescher cbuescher requested a review from javanna March 23, 2022 14:46
@cbuescher cbuescher mentioned this pull request Mar 23, 2022
5 tasks
@cbuescher
Copy link
Member Author

@mark-vieira or @breskeby, can you also take a look if this looks more or less okay for now from the gradle script side? Its a bit verbose but then again you mentioned on the team channel that you can probably improve later if need be. Just want to make sure I'm not adding anything really wrong here.

@mark-vieira
Copy link
Contributor

@cbuescher could you clarify the range of versions we should be testing here, and which branches this is going to get ported to? As of right now it just goes back one minor, so 8.2 would test against 8.1 and 8.0 would test against 7.17. Do we plan on backporting this back to 8.0? Also, what is the BWC guarantee here? Are we only supporting one minor back or should we expand this testing to all wire compatible versions? Our testing should mirror what we support and it's not clear to me which combination of versions we are supporting here.

@cbuescher
Copy link
Member Author

@mark-vieira currently I think we only want to merge this to 8.2. The additional version to be tested is indeed the "first previous minor", which is our earliest supported version for cross cluster search functionality (see https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-cross-cluster-search.html#ccs-supported-configurations). This support window is intentionally smaller than the general wire compatibility. I think it is enough if we test against one patch version of the previous minor, maybe @javanna disagrees though.

@mark-vieira
Copy link
Contributor

I think we have two directions of compatibility here:

One is essentially the same as wire compatibility. That is, you can have a local cluster be any wire compatible version of the remote. So in the 8.2 branch we would test local clusters of all version back to 7.17 with the remote cluster being 8.2.

The second is where the remote cluster is older than the local cluster, and for this we can only go back one minor version.

So what it sounds like is that we are just adding coverage here for the second case. Do we have existing coverage for the first case elsewhere?

@javanna
Copy link
Member

javanna commented Mar 28, 2022

@mark-vieira the goal is currently to test against the previous minor only. If we are on 8.2, we want to test searching on a 8.1 cluster through a 8.2 coordinator cluster. This is to cover for situations that have caused problems in the past, where we add a feature to 8.2 yet CCS does not work against the previous minor as the feature is not supported there. This will also implicitly cover for any wire change besides adding features, some of which we already have coverage for, while some others are only exercised by CCS (see #78472). That is indeed that is what you mentioned as the second case.

First case that tests forward compatibility could be interesting as well, we have no coverage for that, but we are currently focusing on backwards compatibility as it looks higher priority having no CCS tests across versions. We may look into forward compability as a follow-up.

@javanna javanna requested review from dnhatn and removed request for javanna March 28, 2022 14:43
@dnhatn dnhatn added the test-full-bwc Trigger full BWC version matrix tests label Mar 28, 2022
@dnhatn
Copy link
Member

dnhatn commented Mar 28, 2022

I understand the goal is to test against the previous minor only. But, I think it's better to expand this QA module to test against the remote cluster:

  • on the current version (the current behavior)
  • the the previous minor versions of the current major (8.0, 8.1)
  • on the last minor of the previous major (i.e., 7.17)

I tweaked the Gradle file to have this capacity. @cbuescher Let me know what you think. Thank you for working on this.

Gradle file
import org.elasticsearch.gradle.Version
import org.elasticsearch.gradle.internal.info.BuildParams
import org.elasticsearch.gradle.internal.test.RestIntegTestTask
import org.elasticsearch.gradle.testclusters.DefaultTestClustersTask

apply plugin: 'elasticsearch.internal-testclusters'
apply plugin: 'elasticsearch.standalone-rest-test'
apply plugin: 'elasticsearch.bwc-test'
apply plugin: 'elasticsearch.rest-resources'

dependencies {
  testImplementation project(":client:rest-high-level")
}

def ccsSupportedVersion = bwcVersion -> {
  def currentVersion = Version.fromString(project.version)
  return currentVersion.minor == 0 || (currentVersion.major == bwcVersion.major && currentVersion.minor - bwcVersion.minor <= 1)
}

BuildParams.bwcVersions.withWireCompatible(ccsSupportedVersion) { bwcVersion, baseName ->

  def remoteCluster = testClusters.register("${baseName}-remote") {
    numberOfNodes = 2
    versions = [bwcVersion.toString()]
  }

  def localCluster = testClusters.register("${baseName}-local") {
    numberOfNodes = 2
    versions = [project.version]
    setting 'cluster.remote.connections_per_cluster', '1'
    setting 'cluster.remote.leader_cluster.seeds',
      { "\"${remoteCluster.get().getAllTransportPortURI().get(0)}\"" }
  }

  tasks.register("${baseName}-start-remote", DefaultTestClustersTask.class) {
    useCluster remoteCluster
    dependsOn "processTestResources"
    doLast {
      "Starting remote cluster ${remoteCluster} before the local cluster"
    }
  }

  tasks.register(bwcTaskName(bwcVersion), RestIntegTestTask) {
    useCluster remoteCluster
    useCluster localCluster
    doFirst {
      nonInputProperties.systemProperty('tests.rest.cluster', localCluster.map(c -> c.allHttpSocketURI.join(",")))
      systemProperty 'tests.remote_cluster_version', bwcVersion.toString().replace('-SNAPSHOT', '')
      systemProperty 'tests.rest.suite', 'remote_cluster'
    }
    dependsOn "${baseName}-start-remote"
    mustRunAfter("precommit")
  }
}

testClusters.configureEach {
  setting 'node.roles', '[data,ingest,master]'
  setting 'xpack.security.enabled', 'false'
  requiresFeature 'es.index_mode_feature_flag_registered', Version.fromString("8.0.0")
}

@mark-vieira
Copy link
Contributor

@dnhatn Testing a remote cluster with all wire-compatible versions doesn't seem to be something we support according to these docs. In the main branch (8.2) we technically only support a remote cluster as old as 8.1 so what you've described is testing scenarios that are not technically supported.

@dnhatn
Copy link
Member

dnhatn commented Mar 28, 2022

@mark-vieira Thank you for your feedback. Yes, we should skip testing the unsupported minor versions of the current major. But we should test against 7.17.

@javanna
Copy link
Member

javanna commented Mar 28, 2022

Thank you for your feedback. Yes, we should skip testing the unsupported minor versions of the current major. But we should test against 7.17.

@nhat that is also not supported, although it will technically work today. We updated the docs recently around this to clarify expectations.

@dnhatn
Copy link
Member

dnhatn commented Mar 28, 2022

@javanna I am not sure if that change is too restricted. How do users perform a rolling upgrade with CCS from 7.17 or 8.0 to 8.2?

@dnhatn
Copy link
Member

dnhatn commented Mar 28, 2022

@javanna I talked to Julie and understood the motivation behind the change.

@dnhatn
Copy link
Member

dnhatn commented Mar 28, 2022

@mark-vieira @javanna Thanks for the feedback. If I understand correctly, we should test against 8.1.0, 8.1.1, 8.1.2, and 8.2.0 for the current version (8.2.0)?

@cbuescher
Copy link
Member Author

If I understand correctly, we should test against 8.1.0, 8.1.1, 8.1.2, and 8.2.0 for the current version (8.2.0)

Ideally probably yes, but I intentionally limited this to the first patch release of the previous minor because I didn't want to increase the running time for these tests without a good reason. I think if we change this to the "last" patch release of the previous minor branch (i.e. first test 8.2.0 (local) -> 8.1.0 (remote) once that is release, then 8.1.1. once that is out etc...) we would cover them all over time. The expectation here is that patch realeases don't differ that much (at least in CCS) and we buy us some reduced execution time with that. I'm also happy using all patch versions of the previous minor though if that is what most people feel we should do. I will update my PR shortly accordingly.

@cbuescher
Copy link
Member Author

I tweaked the Gradle file to have this capacity. @cbuescher Let me know what you think.

I tried this to learn more about how the tests are run. Unfortunately I cannot see how the yaml rest test are executed. I introduced some intentional failures to them and nothing failed. e.g. I don't undestand how e.g. the data on the remote cluster (the stuff in test/resources/rest-api-spec/test/remote_cluster) gets indexed. In the gradle file I have I undestand this is done via the RestIntegTestTask?

@dnhatn
Copy link
Member

dnhatn commented Mar 29, 2022

I intentionally limited this to the first patch release of the previous minor because I didn't want to increase the running time for these tests without a good reason.

Our CI already takes care of this. Regular pull requests (i.e., without test-full-bwc label) will test against snapshotBwcVersions.

@dnhatn
Copy link
Member

dnhatn commented Mar 29, 2022

I tried this to learn more about how the tests are run. Unfortunately I cannot see how the yaml rest test are executed.

@cbuescher Sorry, I misunderstood the setup. I've updated the gradle file. Would you mind giving it another try? I am happy to walk you through the setup.

@cbuescher cbuescher force-pushed the ccs-multi-cluster-tests branch from a6420b0 to b9e34ea Compare March 30, 2022 16:30
@cbuescher
Copy link
Member Author

cbuescher commented Mar 30, 2022

@dnhatn thanks for the example file and explaining the details of it. I wasn't aware of the .ci directory and the snapshotBwcVersions, if that is the subset we regularly test on PRs that should be fine. I think its okay if we run the full set of wire-compatible and ccs-compatible versions for the remote cluster only in full builds. Let me know what you think.

Copy link
Member

@dnhatn dnhatn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @cbuescher.

@cbuescher cbuescher merged commit fc38909 into elastic:master Mar 31, 2022
cbuescher pushed a commit to cbuescher/elasticsearch that referenced this pull request Apr 21, 2022
Currently this qa module runs integration tests that cover cross-cluster search
with both the local and the remote cluster on the same version. In order to also
cover cross-cluster communication across multiple versions, this change adds
additional test tasks that also start a remote cluster in all wire-compatible 
previous minor version and run the same tests against this configuration.

Relates elastic#84481
cbuescher pushed a commit that referenced this pull request Apr 21, 2022
Currently this qa module runs integration tests that cover cross-cluster search
with both the local and the remote cluster on the same version. In order to also
cover cross-cluster communication across multiple versions, this change adds
additional test tasks that also start a remote cluster in all wire-compatible 
previous minor version and run the same tests against this configuration.

Relates #84481
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Search/Search Search-related issues that do not fall into other categories >test Issues or PRs that are addressing/adding tests test-full-bwc Trigger full BWC version matrix tests v8.2.1 v8.3.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants