-
Notifications
You must be signed in to change notification settings - Fork 25.6k
ES|QL: Skip CASE function from InferIsNotNull rule checks #113123
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
af3bfe6
8137841
715036e
b50ec87
ca7061e
aecd148
4bdd904
0191ccd
92b6a93
4facf83
b55b2eb
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,6 @@ | ||
| pr: 113123 | ||
| summary: "ES|QL: Skip CASE function from `InferIsNotNull` rule checks" | ||
| area: ES|QL | ||
| type: bug | ||
| issues: | ||
| - 112704 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -15,6 +15,7 @@ | |
| import org.elasticsearch.xpack.esql.core.expression.predicate.nulls.IsNotNull; | ||
| import org.elasticsearch.xpack.esql.core.rule.Rule; | ||
| import org.elasticsearch.xpack.esql.core.util.CollectionUtils; | ||
| import org.elasticsearch.xpack.esql.expression.function.scalar.conditional.Case; | ||
| import org.elasticsearch.xpack.esql.expression.function.scalar.nulls.Coalesce; | ||
| import org.elasticsearch.xpack.esql.plan.logical.LogicalPlan; | ||
|
|
||
|
|
@@ -24,8 +25,7 @@ | |
| import static java.util.Collections.emptySet; | ||
|
|
||
| /** | ||
| * Simplify IsNotNull targets by resolving the underlying expression to its root fields with unknown | ||
| * nullability. | ||
| * Simplify IsNotNull targets by resolving the underlying expression to its root fields. | ||
| * e.g. | ||
| * (x + 1) / 2 IS NOT NULL --> x IS NOT NULL AND (x+1) / 2 IS NOT NULL | ||
| * SUBSTRING(x, 3) > 4 IS NOT NULL --> x IS NOT NULL AND SUBSTRING(x, 3) > 4 IS NOT NULL | ||
|
|
@@ -85,7 +85,7 @@ protected Set<Expression> resolveExpressionAsRootAttributes(Expression exp, Attr | |
|
|
||
| private boolean doResolve(Expression exp, AttributeMap<Expression> aliases, Set<Expression> resolvedExpressions) { | ||
| boolean changed = false; | ||
| // check if the expression can be skipped or is not nullabe | ||
| // check if the expression can be skipped | ||
| if (skipExpression(exp)) { | ||
| resolvedExpressions.add(exp); | ||
| } else { | ||
|
|
@@ -106,6 +106,13 @@ private boolean doResolve(Expression exp, AttributeMap<Expression> aliases, Set< | |
| } | ||
|
|
||
| private static boolean skipExpression(Expression e) { | ||
| return e instanceof Coalesce; | ||
| // These two functions can have a complex set of expressions as arguments that can mess up the simplification we are trying to add. | ||
| // If there is a "case(f is null, null, ...) is not null" expression, | ||
| // assuming that "case(f is null.....) is not null AND f is not null" (what this rule is doing) is a wrong assumption because | ||
| // the "case" function will want both null "f" and not null "f". Doing it like this contradicts the condition inside case, so we | ||
| // must avoid these cases. | ||
| // We could be smarter and look inside "case" and "coalesce" to see if there is any comparison of fields with "null" but, | ||
| // the complexity is too high to warrant an attempt _now_. | ||
| return e instanceof Coalesce || e instanceof Case; | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It would be nice to understand why these two functions are special, at least with a comment, and eventually extract a general rule (a flag interface or a method in the Expression class to identify the functions that fall into this category).
Can you please elaborate a bit on this?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. After spending some time on this issue when it was initially reported, my thinking is the following:
I will add a comment in the PR for this method, @bpintea also mentioned this.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I don't completely agree with this statement. These two functions mess up with this rule, not with
This is closer to the point IMHO: the real peculiarity is that these functions have a custom behavior with null inputs, ie. even if one input is null, the function can still be not null, so The danger comes from the fact that this is an implementation detail, peculiar to each function.
My concern is that a contributor could introduce a new function with a similar behavior, and will unintentionally break this rule.
|
||
| } | ||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -29,6 +29,7 @@ | |
| import org.elasticsearch.xpack.esql.core.type.DataType; | ||
| import org.elasticsearch.xpack.esql.core.type.EsField; | ||
| import org.elasticsearch.xpack.esql.expression.function.EsqlFunctionRegistry; | ||
| import org.elasticsearch.xpack.esql.expression.function.scalar.conditional.Case; | ||
| import org.elasticsearch.xpack.esql.expression.function.scalar.nulls.Coalesce; | ||
| import org.elasticsearch.xpack.esql.expression.function.scalar.string.StartsWith; | ||
| import org.elasticsearch.xpack.esql.expression.predicate.operator.arithmetic.Add; | ||
|
|
@@ -58,8 +59,11 @@ | |
|
|
||
| import static java.util.Collections.emptyMap; | ||
| import static org.elasticsearch.xpack.esql.EsqlTestUtils.L; | ||
| import static org.elasticsearch.xpack.esql.EsqlTestUtils.ONE; | ||
| import static org.elasticsearch.xpack.esql.EsqlTestUtils.TEST_SEARCH_STATS; | ||
| import static org.elasticsearch.xpack.esql.EsqlTestUtils.TEST_VERIFIER; | ||
| import static org.elasticsearch.xpack.esql.EsqlTestUtils.THREE; | ||
| import static org.elasticsearch.xpack.esql.EsqlTestUtils.TWO; | ||
| import static org.elasticsearch.xpack.esql.EsqlTestUtils.as; | ||
| import static org.elasticsearch.xpack.esql.EsqlTestUtils.getFieldAttribute; | ||
| import static org.elasticsearch.xpack.esql.EsqlTestUtils.greaterThanOf; | ||
|
|
@@ -81,10 +85,6 @@ public class LocalLogicalPlanOptimizerTests extends ESTestCase { | |
| private static LogicalPlanOptimizer logicalOptimizer; | ||
| private static Map<String, EsField> mapping; | ||
|
|
||
| private static final Literal ONE = L(1); | ||
| private static final Literal TWO = L(2); | ||
| private static final Literal THREE = L(3); | ||
|
|
||
| @BeforeClass | ||
| public static void init() { | ||
| parser = new EsqlParser(); | ||
|
|
@@ -386,38 +386,6 @@ public void testMissingFieldInFilterNoProjection() { | |
| ); | ||
| } | ||
|
|
||
| public void testIsNotNullOnCoalesce() { | ||
| var plan = localPlan(""" | ||
| from test | ||
| | where coalesce(emp_no, salary) is not null | ||
| """); | ||
|
|
||
| var limit = as(plan, Limit.class); | ||
| var filter = as(limit.child(), Filter.class); | ||
| var inn = as(filter.condition(), IsNotNull.class); | ||
| var coalesce = as(inn.children().get(0), Coalesce.class); | ||
| assertThat(Expressions.names(coalesce.children()), contains("emp_no", "salary")); | ||
| var source = as(filter.child(), EsRelation.class); | ||
| } | ||
|
|
||
| public void testIsNotNullOnExpression() { | ||
| var plan = localPlan(""" | ||
| from test | ||
| | eval x = emp_no + 1 | ||
| | where x is not null | ||
| """); | ||
|
|
||
| var limit = as(plan, Limit.class); | ||
| var filter = as(limit.child(), Filter.class); | ||
| var inn = as(filter.condition(), IsNotNull.class); | ||
| assertThat(Expressions.names(inn.children()), contains("x")); | ||
| var eval = as(filter.child(), Eval.class); | ||
| filter = as(eval.child(), Filter.class); | ||
| inn = as(filter.condition(), IsNotNull.class); | ||
| assertThat(Expressions.names(inn.children()), contains("emp_no")); | ||
| var source = as(filter.child(), EsRelation.class); | ||
| } | ||
|
|
||
| public void testSparseDocument() throws Exception { | ||
| var query = """ | ||
| from large | ||
|
|
@@ -516,6 +484,66 @@ public void testIsNotNullOnFunctionWithTwoFields() { | |
| assertEquals(expected, new InferIsNotNull().apply(f)); | ||
| } | ||
|
|
||
| public void testIsNotNullOnCoalesce() { | ||
| var plan = localPlan(""" | ||
| from test | ||
| | where coalesce(emp_no, salary) is not null | ||
| """); | ||
|
|
||
| var limit = as(plan, Limit.class); | ||
| var filter = as(limit.child(), Filter.class); | ||
| var inn = as(filter.condition(), IsNotNull.class); | ||
| var coalesce = as(inn.children().get(0), Coalesce.class); | ||
| assertThat(Expressions.names(coalesce.children()), contains("emp_no", "salary")); | ||
| var source = as(filter.child(), EsRelation.class); | ||
| } | ||
|
|
||
| public void testIsNotNullOnExpression() { | ||
| var plan = localPlan(""" | ||
| from test | ||
| | eval x = emp_no + 1 | ||
| | where x is not null | ||
| """); | ||
|
|
||
| var limit = as(plan, Limit.class); | ||
| var filter = as(limit.child(), Filter.class); | ||
| var inn = as(filter.condition(), IsNotNull.class); | ||
| assertThat(Expressions.names(inn.children()), contains("x")); | ||
| var eval = as(filter.child(), Eval.class); | ||
| filter = as(eval.child(), Filter.class); | ||
| inn = as(filter.condition(), IsNotNull.class); | ||
| assertThat(Expressions.names(inn.children()), contains("emp_no")); | ||
| var source = as(filter.child(), EsRelation.class); | ||
| } | ||
|
Comment on lines
+487
to
+517
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. (just checking: these are moved only, right?)
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, sorry. I forgot to add a comment after I created the PR. |
||
|
|
||
| public void testIsNotNullOnCase() { | ||
| var plan = localPlan(""" | ||
| from test | ||
| | where case(emp_no > 10000, "1", salary < 50000, "2", first_name) is not null | ||
| """); | ||
|
|
||
| var limit = as(plan, Limit.class); | ||
| var filter = as(limit.child(), Filter.class); | ||
| var inn = as(filter.condition(), IsNotNull.class); | ||
| var caseF = as(inn.children().get(0), Case.class); | ||
| assertThat(Expressions.names(caseF.children()), contains("emp_no > 10000", "\"1\"", "salary < 50000", "\"2\"", "first_name")); | ||
| var source = as(filter.child(), EsRelation.class); | ||
| } | ||
|
|
||
| public void testIsNotNullOnCase_With_IS_NULL() { | ||
| var plan = localPlan(""" | ||
| from test | ||
| | where case(emp_no IS NULL, "1", salary IS NOT NULL, "2", first_name) is not null | ||
| """); | ||
|
|
||
| var limit = as(plan, Limit.class); | ||
| var filter = as(limit.child(), Filter.class); | ||
| var inn = as(filter.condition(), IsNotNull.class); | ||
| var caseF = as(inn.children().get(0), Case.class); | ||
| assertThat(Expressions.names(caseF.children()), contains("emp_no IS NULL", "\"1\"", "salary IS NOT NULL", "\"2\"", "first_name")); | ||
| var source = as(filter.child(), EsRelation.class); | ||
| } | ||
|
|
||
| private IsNotNull isNotNull(Expression field) { | ||
| return new IsNotNull(EMPTY, field); | ||
| } | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: in case any other change is required to the PR, maybe move the comment from the call site as function javadoc/comment.