Skip to content

Conversation

@costin
Copy link
Member

@costin costin commented Jan 16, 2024

Allow nested expressions to be used both for grouping or inside
aggregate functions inside the stats command.
This allows the following stats command:

stats c = count( a / 2 ) by x + 1

Since the grouping now can be an expression, the grammar has
been tweaked to allow the stats group to have optional aliasing
similar to aggregate functions

stats c = count( a / 2 ) by z = b + 1

Fix #99828

@costin costin added :Analytics/ES|QL AKA ESQL ES|QL-ui Impacts ES|QL UI labels Jan 16, 2024
@costin costin requested a review from astefan January 16, 2024 06:56
@costin costin marked this pull request as ready for review January 17, 2024 02:31
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Jan 17, 2024
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

@elasticsearchmachine
Copy link
Collaborator

Hi @costin, I've created a changelog YAML for you.

@costin costin requested review from a team as code owners January 17, 2024 03:28
@costin costin force-pushed the esql/synthetic-evals branch 2 times, most recently from 4170030 to 87c5640 Compare January 17, 2024 06:26
costin and others added 5 commits January 16, 2024 23:06
Allow nested expressions to be used both for grouping or inside
 aggregate functions inside the stats command.
As such the grammar has been tweaked to allow the stats group to have
 optional aliasing.

Fix elastic#99828
@costin costin force-pushed the esql/synthetic-evals branch from 87c5640 to a69bf9b Compare January 17, 2024 07:07
@costin costin requested review from bpintea and removed request for a team January 17, 2024 07:07
Copy link
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a small overlook while parsing the groupings, the fix is simple, but the results of the queries is highly impacted.

Copy link
Contributor

@luigidellaquila luigidellaquila left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good in general.

There is only a small regression in the validation process:

row a = 1, b = 2 | stats max(max(a)) by b

in main branch returns

{
    "error": {
        "root_cause": [
            {
                "type": "verification_exception",
                "reason": "Found 1 problem\nline 1:26: aggregate function's field must be an attribute or literal; found [max(a)] of type [Max]"
            }
        ],
        "type": "verification_exception",
        "reason": "Found 1 problem\nline 1:26: aggregate function's field must be an attribute or literal; found [max(a)] of type [Max]"
    },
    "status": 400
}

while with this PR returns

{
    "error": {
        "root_cause": [
            {
                "type": "ql_illegal_argument_exception",
                "reason": "Unsupported expression [max(a)]"
            }
        ],
        "type": "ql_illegal_argument_exception",
        "reason": "Unsupported expression [max(a)]"
    },
    "status": 500
}

It's a different error message, but most importantly, it's a different error code (500 vs 400).

Even though the root problem is not strictly related to this PR (it depends on the fact that EVAL command does not properly validate the usage of aggregate functions), IMHO it's worth fixing now, both because of the regression and because it's a 500 error.

Notice that the same problem happens with the following queries:

row a = 1, b = 2 | stats max(a) by max(b)
row a = 1, b = 2 | eval x = max(a)


grouping
: qualifiedName (COMMA qualifiedName)*
: INLINESTATS stats=fields (BY grouping=fields)?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated to this PR: this was never implemented, we should remove it probably.

@astefan astefan self-requested a review January 17, 2024 09:31
@bpintea
Copy link
Contributor

bpintea commented Jan 17, 2024

@luigi, trying to understand your note:

The error is not so clear (aggregate function's field must be an attribute or literal: which field? And why is that expression unsupported?),

Do you mean we should make it clearer that the outer max is the one the failure is about? As in, the existing explanation found [max(a)] of type [Max] isn't enough?

the root problem is not strictly related with this PR (it depends on the fact that EVAL command does not properly validate the usage of aggregate functions)

Wondering too why we didn't verify aggs in EVAL. But is this issue the same as how we verify the arguments of aggs themselves?

@luigidellaquila
Copy link
Contributor

Do you mean we should make it clearer that the outer max is the one the failure is about? As in, the existing explanation found [max(a)] of type [Max] isn't enough?

@bpintea yeah, I removed that part of the comment already, I realized the message is actually good enough in the context of aggs validation.

Wondering too why we didn't verify aggs in EVAL. But is this issue the same as how we verify the arguments of aggs themselves?

until now they were two different problems, but they could become the same if we rely on EVAL checks after expression extraction from STATS.
IMHO the best solution is to do the aggs validation before the extraction, and in case fail fast.
EVAL validation can be fixed separately probably.

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@costin
Copy link
Member Author

costin commented Jan 17, 2024

I've added the validation for eval and stats; the previous validation was checking for something else and caught the case of aggs inside aggs.

Copy link
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@costin costin force-pushed the esql/synthetic-evals branch from 3e4c504 to add8e44 Compare January 17, 2024 22:38
@costin costin merged commit 607185b into elastic:main Jan 17, 2024
@costin costin deleted the esql/synthetic-evals branch January 17, 2024 23:49
@costin
Copy link
Member Author

costin commented Jan 18, 2024

/cc @abdonpijpelink please update the stats docs with the expanded grammar - thanks!

dej611 added a commit to elastic/kibana that referenced this pull request Jan 24, 2024
## Summary

Add support for the validation and autocomplete engine for the feature
introduced in elastic/elasticsearch#104387

The validation engine should not mark the syntax as invalid now.
The autocomplete changes are a little bit more subtle:
* after the `by` option command a `varX` and functions are now suggested
in additions to `[columns]`
* when typing an expression within the `by` scope the autocomplete
should now understand and help with that, without promoting it when not
needed.


![by_suggestions](https://github.com/elastic/kibana/assets/924948/266c403b-34a6-436f-a5aa-43353b77269e)


### Checklist

- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios

---------

Co-authored-by: Kibana Machine <[email protected]>
CoenWarmer pushed a commit to CoenWarmer/kibana that referenced this pull request Feb 15, 2024
## Summary

Add support for the validation and autocomplete engine for the feature
introduced in elastic/elasticsearch#104387

The validation engine should not mark the syntax as invalid now.
The autocomplete changes are a little bit more subtle:
* after the `by` option command a `varX` and functions are now suggested
in additions to `[columns]`
* when typing an expression within the `by` scope the autocomplete
should now understand and help with that, without promoting it when not
needed.


![by_suggestions](https://github.com/elastic/kibana/assets/924948/266c403b-34a6-436f-a5aa-43353b77269e)


### Checklist

- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios

---------

Co-authored-by: Kibana Machine <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/ES|QL AKA ESQL >enhancement ES|QL-ui Impacts ES|QL UI Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.13.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ESQL: Expressions in aggregate functions

5 participants