-
Notifications
You must be signed in to change notification settings - Fork 25.6k
ESQL: Honor skip_unavailable setting for nonmatching indices errors at planning time #116348
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ESQL: Honor skip_unavailable setting for nonmatching indices errors at planning time #116348
Conversation
|
Hi @quux00, I've created a changelog YAML for you. |
4b574b4 to
d8bba72
Compare
.../src/internalClusterTest/java/org/elasticsearch/xpack/esql/action/CrossClustersEnrichIT.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/session/EsqlSessionCCSUtils.java
Outdated
Show resolved
Hide resolved
8770e2c to
b642e6d
Compare
…w new rules; tests updated
…des in EsIndex to always have the list of indices resolved by field-caps because when there are no mappings found in that index we pass in an empty map (which seems wrong). I tried to stop doing that but soon hit layers of ESQL I don't understand, so tried the simplier approach of adding a new (somewhat redundant) new field to IndexResolver: Set<String> resolvedIndices. This solves the problem of allowing the CCS handler code in EsqlSession.preAnalyze (which calls EsqlSessionCCSUtils.updateExecutionInfoWithClustersWithNoMatchingIndices) to determine whether or not a cluster had no matching indices from the field-caps call, allowing either an error to be thrown or for the CCS metadata to be updated.
b642e6d to
33cd137
Compare
…lable-missing-indices-t2
…nmatching indices tests
…lable-missing-indices-t2
…ssage is returned Adjusted RemoteClusterSecurityEsqlIT so it passes for skip_unavailable=true, although the error message returned for skip_unavailable=false is still wrong.
… policies for skip_un=true clusters; requires modifying the Analyzer
…lable-missing-indices-t2
|
|
||
| public static IndexResolution valid(EsIndex index) { | ||
| return valid(index, Collections.emptyMap()); | ||
| public static IndexResolution valid(EsIndex index, Set<String> resolvedIndices) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: to avoid the noise in the PR, keep the old method in place and extract the name from EsIndex:
valid(EsIndex index) { return valid(index, singletonSet(index.name())}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. Except we should use index.concreteIndices(), not index name, since the latter is the original unresolved user requested index expression. Fixed in next push.
| /** | ||
| * @return all indices found by field-caps (regardless of whether they had any mappings) | ||
| */ | ||
| public Set<String> getResolvedIndices() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: since there's no setter, you can drop the get -> resolvedIndices()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. Fixed in next push. I also changed getUnavailableClusters() to unavailableClusters().
| return Objects.hash(index, invalid, resolvedIndices, unavailableClusters); | ||
| } | ||
|
|
||
| @Override |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You probably want to have the resolved indices in toString()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. Updated in next push.
costin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor drive-by comments.
…lable-missing-indices-t2
…resolvedIndices) to avoid PR noise, since it can be inferred from EsIndex contents
…lable-missing-indices-t2
|
@elasticsearchmachine run elasticsearch-ci/part-1 |
…lable-missing-indices-t2
|
Hi @quux00, I've updated the changelog YAML for you. |
…reak, so reverting original model for invalid case
…lable-missing-indices-t2
💚 Backport successful
|
…t planning time (elastic#116348) Adds support to ES|QL planning time (EsqlSession code) for dealing with non-matching indices and how that relates to the remote cluster skip_unavailable setting and also how to deal with missing indices on the local cluster (if included in the query). For clusters included in an ES|QL query: • `skip_unavailable=true` means: if no data is returned from that cluster (due to cluster-not-connected, no matching indices, a missing concrete index or shard failures during searches), it is not a "fatal" error that causes the entire query to fail. Instead it is just a failure on that particular cluster and partial data should be returned from other clusters. • `skip_unavailable=false` means: if no data is returned from that cluster (for the same reasons enumerated above), then the whole query should fail. This allows users to ensure that data is returned from a "required" cluster. • For the local cluster, ES|QL assumes `allow_no_indices=true` and the skip_unavailable setting does not apply (in part because there is no way for a user to set skip_unavailable for the local cluster) Based on discussions with ES|QL team members, we defined the following rules to be enforced with respect to non-matching index expressions: **Rules enforced at planning time** P1. fail the query if there are no matching indices on any cluster (VerificationException) P2. fail the query if a skip_unavailable:false cluster has no matching indices (the local cluster already has this rule enforced at planning time) P3. fail query if the local cluster has no matching indices and a concrete index was specified **Rules enforced at execution time** For missing concrete (no wildcards present) index expressions: E1. fail the query when it was specified for the local cluster or a skip_unavailable=false remote cluster E2: on skip_unavailable=true clusters: an error fails the query on that cluster, but not the entire query (data from other clusters still returned) **Notes on the rules** P1: this already happens, no new code needed in this PR P2: The reason we need to enforce rule 2 at planning time is that when there are no matching indices from field caps the EsIndex that is created (and passed into IndexResolution.valid) leaves that cluster out of the list, so at execution time it will not attempt to query that cluster at all, so execution time will not catch missing concrete indices. And even if it did get queried at execution time it wouldn't fail on wildcard only indices where none of them matched. P3: Right now `FROM remote:existent,nomatch` does NOT throw a failure (for same reason described in rule 2 above) so that needs to be enforced in this PR. This PR deals with enforcing and testing the planning time rules: P1, P2 and P3. A follow-on PR will address changes needed for handling the execution time rules. **Notes on PR scope** This PR covers nonsecured clusters (`xpack.security.enabled: false`) and security using certs ("RCS1). In my testing I've founding that api-key based security ("RCS2") is not behaving the same, so that work has been deferred to a follow-on PR. Partially addresses elastic#114531
…t planning time (#116348) (#116824) Adds support to ES|QL planning time (EsqlSession code) for dealing with non-matching indices and how that relates to the remote cluster skip_unavailable setting and also how to deal with missing indices on the local cluster (if included in the query). For clusters included in an ES|QL query: • `skip_unavailable=true` means: if no data is returned from that cluster (due to cluster-not-connected, no matching indices, a missing concrete index or shard failures during searches), it is not a "fatal" error that causes the entire query to fail. Instead it is just a failure on that particular cluster and partial data should be returned from other clusters. • `skip_unavailable=false` means: if no data is returned from that cluster (for the same reasons enumerated above), then the whole query should fail. This allows users to ensure that data is returned from a "required" cluster. • For the local cluster, ES|QL assumes `allow_no_indices=true` and the skip_unavailable setting does not apply (in part because there is no way for a user to set skip_unavailable for the local cluster) Based on discussions with ES|QL team members, we defined the following rules to be enforced with respect to non-matching index expressions: **Rules enforced at planning time** P1. fail the query if there are no matching indices on any cluster (VerificationException) P2. fail the query if a skip_unavailable:false cluster has no matching indices (the local cluster already has this rule enforced at planning time) P3. fail query if the local cluster has no matching indices and a concrete index was specified **Rules enforced at execution time** For missing concrete (no wildcards present) index expressions: E1. fail the query when it was specified for the local cluster or a skip_unavailable=false remote cluster E2: on skip_unavailable=true clusters: an error fails the query on that cluster, but not the entire query (data from other clusters still returned) **Notes on the rules** P1: this already happens, no new code needed in this PR P2: The reason we need to enforce rule 2 at planning time is that when there are no matching indices from field caps the EsIndex that is created (and passed into IndexResolution.valid) leaves that cluster out of the list, so at execution time it will not attempt to query that cluster at all, so execution time will not catch missing concrete indices. And even if it did get queried at execution time it wouldn't fail on wildcard only indices where none of them matched. P3: Right now `FROM remote:existent,nomatch` does NOT throw a failure (for same reason described in rule 2 above) so that needs to be enforced in this PR. This PR deals with enforcing and testing the planning time rules: P1, P2 and P3. A follow-on PR will address changes needed for handling the execution time rules. **Notes on PR scope** This PR covers nonsecured clusters (`xpack.security.enabled: false`) and security using certs ("RCS1). In my testing I've founding that api-key based security ("RCS2") is not behaving the same, so that work has been deferred to a follow-on PR. Partially addresses #114531
…t planning time (elastic#116348) Adds support to ES|QL planning time (EsqlSession code) for dealing with non-matching indices and how that relates to the remote cluster skip_unavailable setting and also how to deal with missing indices on the local cluster (if included in the query). For clusters included in an ES|QL query: • `skip_unavailable=true` means: if no data is returned from that cluster (due to cluster-not-connected, no matching indices, a missing concrete index or shard failures during searches), it is not a "fatal" error that causes the entire query to fail. Instead it is just a failure on that particular cluster and partial data should be returned from other clusters. • `skip_unavailable=false` means: if no data is returned from that cluster (for the same reasons enumerated above), then the whole query should fail. This allows users to ensure that data is returned from a "required" cluster. • For the local cluster, ES|QL assumes `allow_no_indices=true` and the skip_unavailable setting does not apply (in part because there is no way for a user to set skip_unavailable for the local cluster) Based on discussions with ES|QL team members, we defined the following rules to be enforced with respect to non-matching index expressions: **Rules enforced at planning time** P1. fail the query if there are no matching indices on any cluster (VerificationException) P2. fail the query if a skip_unavailable:false cluster has no matching indices (the local cluster already has this rule enforced at planning time) P3. fail query if the local cluster has no matching indices and a concrete index was specified **Rules enforced at execution time** For missing concrete (no wildcards present) index expressions: E1. fail the query when it was specified for the local cluster or a skip_unavailable=false remote cluster E2: on skip_unavailable=true clusters: an error fails the query on that cluster, but not the entire query (data from other clusters still returned) **Notes on the rules** P1: this already happens, no new code needed in this PR P2: The reason we need to enforce rule 2 at planning time is that when there are no matching indices from field caps the EsIndex that is created (and passed into IndexResolution.valid) leaves that cluster out of the list, so at execution time it will not attempt to query that cluster at all, so execution time will not catch missing concrete indices. And even if it did get queried at execution time it wouldn't fail on wildcard only indices where none of them matched. P3: Right now `FROM remote:existent,nomatch` does NOT throw a failure (for same reason described in rule 2 above) so that needs to be enforced in this PR. This PR deals with enforcing and testing the planning time rules: P1, P2 and P3. A follow-on PR will address changes needed for handling the execution time rules. **Notes on PR scope** This PR covers nonsecured clusters (`xpack.security.enabled: false`) and security using certs ("RCS1). In my testing I've founding that api-key based security ("RCS2") is not behaving the same, so that work has been deferred to a follow-on PR. Partially addresses elastic#114531
…t planning time (elastic#116348) Adds support to ES|QL planning time (EsqlSession code) for dealing with non-matching indices and how that relates to the remote cluster skip_unavailable setting and also how to deal with missing indices on the local cluster (if included in the query). For clusters included in an ES|QL query: • `skip_unavailable=true` means: if no data is returned from that cluster (due to cluster-not-connected, no matching indices, a missing concrete index or shard failures during searches), it is not a "fatal" error that causes the entire query to fail. Instead it is just a failure on that particular cluster and partial data should be returned from other clusters. • `skip_unavailable=false` means: if no data is returned from that cluster (for the same reasons enumerated above), then the whole query should fail. This allows users to ensure that data is returned from a "required" cluster. • For the local cluster, ES|QL assumes `allow_no_indices=true` and the skip_unavailable setting does not apply (in part because there is no way for a user to set skip_unavailable for the local cluster) Based on discussions with ES|QL team members, we defined the following rules to be enforced with respect to non-matching index expressions: **Rules enforced at planning time** P1. fail the query if there are no matching indices on any cluster (VerificationException) P2. fail the query if a skip_unavailable:false cluster has no matching indices (the local cluster already has this rule enforced at planning time) P3. fail query if the local cluster has no matching indices and a concrete index was specified **Rules enforced at execution time** For missing concrete (no wildcards present) index expressions: E1. fail the query when it was specified for the local cluster or a skip_unavailable=false remote cluster E2: on skip_unavailable=true clusters: an error fails the query on that cluster, but not the entire query (data from other clusters still returned) **Notes on the rules** P1: this already happens, no new code needed in this PR P2: The reason we need to enforce rule 2 at planning time is that when there are no matching indices from field caps the EsIndex that is created (and passed into IndexResolution.valid) leaves that cluster out of the list, so at execution time it will not attempt to query that cluster at all, so execution time will not catch missing concrete indices. And even if it did get queried at execution time it wouldn't fail on wildcard only indices where none of them matched. P3: Right now `FROM remote:existent,nomatch` does NOT throw a failure (for same reason described in rule 2 above) so that needs to be enforced in this PR. This PR deals with enforcing and testing the planning time rules: P1, P2 and P3. A follow-on PR will address changes needed for handling the execution time rules. **Notes on PR scope** This PR covers nonsecured clusters (`xpack.security.enabled: false`) and security using certs ("RCS1). In my testing I've founding that api-key based security ("RCS2") is not behaving the same, so that work has been deferred to a follow-on PR. Partially addresses elastic#114531
…time other than disconnected exceptions (#120236) For ES|QL, we are moving to limit the scope of the skip_unavailable setting for remote clusters. Going forward, skip_unavailable will be considered for two scenarios: 1) inability to connect to a remote cluster ("unavailable") 2) whether to fail on execution time errors or not (inline with the upcoming allow_partial_search_results work for ES|QL). This PR reverses the special plan-time handling for skip_unavailable=true clusters that was added in #116348. Remote clusters, regardless of their skip_unavailable setting, will now use the same logic as the local cluster for index expression analysis at plan time, namely: 1) If any concrete index specified is missing from the cluster, a VerificationException will be thrown 2) If no matching index/alias/datastream was found on any cluster (even if all were specified with a wildcard), a VerificationException will be thrown Thus, we no longer require at least one matching index expression for skip_unavailable=false clusters either, as was done in the previous PR referenced above.
…time other than disconnected exceptions (elastic#120236) For ES|QL, we are moving to limit the scope of the skip_unavailable setting for remote clusters. Going forward, skip_unavailable will be considered for two scenarios: 1) inability to connect to a remote cluster ("unavailable") 2) whether to fail on execution time errors or not (inline with the upcoming allow_partial_search_results work for ES|QL). This PR reverses the special plan-time handling for skip_unavailable=true clusters that was added in elastic#116348. Remote clusters, regardless of their skip_unavailable setting, will now use the same logic as the local cluster for index expression analysis at plan time, namely: 1) If any concrete index specified is missing from the cluster, a VerificationException will be thrown 2) If no matching index/alias/datastream was found on any cluster (even if all were specified with a wildcard), a VerificationException will be thrown Thus, we no longer require at least one matching index expression for skip_unavailable=false clusters either, as was done in the previous PR referenced above.
…time other than disconnected exceptions (#120236) (#120628) For ES|QL, we are moving to limit the scope of the skip_unavailable setting for remote clusters. Going forward, skip_unavailable will be considered for two scenarios: 1) inability to connect to a remote cluster ("unavailable") 2) whether to fail on execution time errors or not (inline with the upcoming allow_partial_search_results work for ES|QL). This PR reverses the special plan-time handling for skip_unavailable=true clusters that was added in #116348. Remote clusters, regardless of their skip_unavailable setting, will now use the same logic as the local cluster for index expression analysis at plan time, namely: 1) If any concrete index specified is missing from the cluster, a VerificationException will be thrown 2) If no matching index/alias/datastream was found on any cluster (even if all were specified with a wildcard), a VerificationException will be thrown Thus, we no longer require at least one matching index expression for skip_unavailable=false clusters either, as was done in the previous PR referenced above.
Adds support to ES|QL planning time (EsqlSession code) for dealing with non-matching indices and
how that relates to the remote cluster skip_unavailable setting and also how to deal with missing
indices on the local cluster (if included in the query).
For clusters included in an ES|QL query:
•
skip_unavailable=truemeans:if no data is returned from that cluster (due to cluster-not-connected, no matching indices, a missing
concrete index or shard failures during searches), it is not a "fatal" error that causes the entire query to fail.
Instead it is just a failure on that particular cluster and partial data should be returned from other clusters.
•
skip_unavailable=falsemeans:if no data is returned from that cluster (for the same reasons enumerated above), then the whole query
should fail. This allows users to ensure that data is returned from a "required" cluster.
• For the local cluster, ES|QL assumes
allow_no_indices=trueand the skip_unavailable setting does not apply(in part because there is no way for a user to set skip_unavailable for the local cluster)
Based on discussions with ES|QL team members, we defined the following rules to be enforced with respect to non-matching index expressions:
Rules enforced at planning time
P1. fail the query if there are no matching indices on any cluster (VerificationException)
P2. fail the query if a skip_unavailable:false cluster has no matching indices (the local cluster already has this rule enforced at planning time)
P3. fail query if the local cluster has no matching indices and a concrete index was specified
Rules enforced at execution time
For missing concrete (no wildcards present) index expressions:
E1. fail the query when it was specified for the local cluster or a skip_unavailable=false remote cluster
E2: on skip_unavailable=true clusters: an error fails the query on that cluster, but not the entire query
(data from other clusters still returned)
Notes on the rules
P1: this already happens, no new code needed in this PR
P2: The reason we need to enforce rule 2 at planning time is that when there are no matching indices from
field caps the EsIndex that is created (and passed into IndexResolution.valid) leaves that cluster out of the
list, so at execution time it will not attempt to query that cluster at all, so execution time will not catch
missing concrete indices. And even if it did get queried at execution time it wouldn't fail on wildcard only
indices where none of them matched.
P3: Right now
FROM remote:existent,nomatchdoes NOT throw a failure (for same reason described in rule 2 above)so that needs to be enforced in this PR.
This PR deals with enforcing and testing the planning time rules: P1, P2 and P3. A follow-on PR will address changes needed for handling the execution time rules.
Notes on PR scope
This PR covers nonsecured clusters (
xpack.security.enabled: false) and security using certs ("RCS1).In my testing I've founding that api-key based security ("RCS2") is not behaving as expected, so that
work has been deferred to a follow-on PR.
Partially addresses #114531