ESQL: Nested expressions inside stats command #104387
...k/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizer.java
Pinging @elastic/es-analytics-geo (Team:Analytics)
Hi @costin, I've created a changelog YAML for you.
Allow nested expressions to be used both for grouping and inside aggregate functions in the stats command. As such, the grammar has been tweaked to allow the stats group to have optional aliasing. Fix elastic#99828
There's a small oversight in how the groupings are parsed. The fix is simple, but the query results are heavily affected.
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/parser/ExpressionBuilder.java
Looks good in general.
There is only a small regression in the validation process:
row a = 1, b = 2 | stats max(max(a)) by b
on the main branch returns
{
  "error": {
    "root_cause": [
      {
        "type": "verification_exception",
        "reason": "Found 1 problem\nline 1:26: aggregate function's field must be an attribute or literal; found [max(a)] of type [Max]"
      }
    ],
    "type": "verification_exception",
    "reason": "Found 1 problem\nline 1:26: aggregate function's field must be an attribute or literal; found [max(a)] of type [Max]"
  },
  "status": 400
}
while with this PR it returns
{
  "error": {
    "root_cause": [
      {
        "type": "ql_illegal_argument_exception",
        "reason": "Unsupported expression [max(a)]"
      }
    ],
    "type": "ql_illegal_argument_exception",
    "reason": "Unsupported expression [max(a)]"
  },
  "status": 500
}
It's a different error message, but most importantly, it's a different error code (500 vs 400).
Even though the root problem is not strictly related to this PR (it stems from the fact that the EVAL command does not properly validate the usage of aggregate functions), IMHO it's worth fixing now, both because of the regression and because it's a 500 error.
Notice that the same problem happens with the following queries:
row a = 1, b = 2 | stats max(a) by max(b)
row a = 1, b = 2 | eval x = max(a)
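To make the expected behavior concrete, here is a minimal sketch of a check that reports these cases as verification problems (a 400) instead of letting them fail later as unsupported expressions (a 500). The `Expr`, `Attr`, and `Call` types, the `verify` method, and the message text are all hypothetical stand-ins for illustration, not the actual ESQL verifier classes; it assumes Java 17.

```java
import java.util.ArrayList;
import java.util.List;

public class AggValidationSketch {
    // Hypothetical stand-ins for the plan's expression nodes (not the ESQL classes).
    interface Expr {}

    record Attr(String name) implements Expr {
        @Override public String toString() { return name; }
    }

    record Call(String name, boolean aggregate, List<Expr> args) implements Expr {
        @Override public String toString() {
            List<String> parts = new ArrayList<>();
            for (Expr arg : args) parts.add(arg.toString());
            return name + "(" + String.join(", ", parts) + ")";
        }
    }

    static Call max(Expr arg) { return new Call("max", true, List.of(arg)); }

    // aggAllowed is true when verifying the aggregate list of STATS;
    // EVAL fields and STATS groupings are verified with aggAllowed = false.
    static List<String> verify(Expr root, boolean aggAllowed) {
        List<String> problems = new ArrayList<>();
        collect(root, aggAllowed, problems);
        return problems;
    }

    private static void collect(Expr e, boolean aggAllowed, List<String> problems) {
        if (e instanceof Call c) {
            if (c.aggregate() && !aggAllowed) {
                problems.add("aggregate function [" + c + "] is not allowed here");
            }
            // Once inside an aggregate, no further aggregate may appear below it.
            boolean allowedBelow = aggAllowed && !c.aggregate();
            for (Expr arg : c.args()) collect(arg, allowedBelow, problems);
        }
    }

    public static void main(String[] args) {
        Expr a = new Attr("a");
        System.out.println(verify(max(max(a)), true));  // stats max(max(a)): one problem
        System.out.println(verify(max(a), false));      // eval x = max(a): one problem
        System.out.println(verify(max(a), true));       // stats max(a): no problems
    }
}
```

A caller would turn a non-empty problem list into a single verification error, so all three queries above would be rejected (or accepted) before planning starts.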
grouping
: qualifiedName (COMMA qualifiedName)*
: INLINESTATS stats=fields (BY grouping=fields)?
Unrelated to this PR: this was never implemented; we should probably remove it.
@luigi, trying to understand your note:
Do you mean we should make it clearer that the outer
Wondering too why we didn't verify aggs in EVAL. But is this issue the same as how we verify the arguments of aggs themselves?
@bpintea yeah, I removed that part of the comment already, I realized the message is actually good enough in the context of aggs validation.
Until now they were two different problems, but they could become the same if we rely on EVAL checks after expression extraction from STATS.
Pinging @elastic/es-analytical-engine (Team:Analytics)
I've added the validation for eval and stats; the previous validation was checking for something else and caught the case of aggs inside aggs.
astefan
left a comment
LGTM
/cc @abdonpijpelink please update the stats docs with the expanded grammar - thanks!
## Summary

Add support for the validation and autocomplete engine for the feature introduced in elastic/elasticsearch#104387.

The validation engine should no longer mark the syntax as invalid. The autocomplete changes are a little more subtle:

* after the `by` option, a `varX` and functions are now suggested in addition to `[columns]`
* when typing an expression within the `by` scope, the autocomplete should now understand and help with that, without promoting it when not needed.

### Checklist

- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios

---------

Co-authored-by: Kibana Machine <[email protected]>
Allow nested expressions to be used both for grouping and inside
aggregate functions in the stats command.
This allows the following stats command:
stats c = count( a / 2 ) by x + 1
Since the grouping can now be an expression, the grammar has
been tweaked to allow the stats group to have optional aliasing,
similar to aggregate functions:
stats c = count( a / 2 ) by z = b + 1
Fix #99828
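
The conversation mentions "expression extraction from STATS" followed by EVAL checks. As a rough illustration of that idea only, the sketch below rewrites a stats command with nested expressions into an eval that computes them, followed by a stats over plain fields. Everything here is an assumption made for the example: the `Agg` and `Group` records, the `tmpN` naming scheme, and the string-based representation are hypothetical and do not reflect how `LogicalPlanOptimizer` is actually implemented. It assumes Java 17.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class StatsExtractionSketch {
    // Hypothetical, string-based model of "alias = function(argument)" and "alias = expression".
    record Agg(String alias, String function, String argument) {}
    record Group(String alias, String expression) {}  // alias may be null

    // Treats a bare identifier as a plain field; anything else is a nested expression.
    static boolean isPlainField(String expr) {
        return expr.matches("[A-Za-z_][A-Za-z0-9_.]*");
    }

    static String rewrite(List<Agg> aggs, List<Group> groups) {
        Map<String, String> evals = new LinkedHashMap<>();  // synthetic name -> expression
        List<String> aggParts = new ArrayList<>();
        List<String> groupParts = new ArrayList<>();
        int counter = 0;

        for (Agg agg : aggs) {
            String arg = agg.argument();
            if (!isPlainField(arg)) {
                String tmp = "tmp" + counter++;  // hypothetical naming scheme
                evals.put(tmp, arg);
                arg = tmp;
            }
            aggParts.add(agg.alias() + " = " + agg.function() + "(" + arg + ")");
        }

        for (Group group : groups) {
            String name = group.alias();
            if (name == null && isPlainField(group.expression())) {
                groupParts.add(group.expression());  // nothing to extract
                continue;
            }
            if (name == null) name = "tmp" + counter++;
            evals.put(name, group.expression());
            groupParts.add(name);
        }

        StringBuilder query = new StringBuilder();
        if (!evals.isEmpty()) {
            List<String> evalParts = new ArrayList<>();
            evals.forEach((name, expr) -> evalParts.add(name + " = " + expr));
            query.append("eval ").append(String.join(", ", evalParts)).append(" | ");
        }
        query.append("stats ").append(String.join(", ", aggParts));
        if (!groupParts.isEmpty()) query.append(" by ").append(String.join(", ", groupParts));
        return query.toString();
    }

    public static void main(String[] args) {
        // stats c = count( a / 2 ) by z = b + 1
        System.out.println(rewrite(
            List.of(new Agg("c", "count", "a / 2")),
            List.of(new Group("z", "b + 1"))));
        // prints: eval tmp0 = a / 2, z = b + 1 | stats c = count(tmp0) by z
    }
}
```

One simplification to note: an unaliased grouping such as `by x + 1` gets a synthetic `tmpN` name here, so a real implementation would also need to preserve the user-visible output column name.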