-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-30940][SQL] Remove attributeId in auto-generated arguments when Explain SQL query #27685
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Test build #118876 has finished for PR 27685 at commit
|
|
Test build #118880 has finished for PR 27685 at commit
|
3cf84d2 to
8238b2a
Compare
|
Test build #119063 has finished for PR 27685 at commit
|
|
@cloud-fan Would you please help review this? Thanks so much! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's still useful to have the attr id as the name can be duplicated
scala> sql("select 1 as a, 2 as a").explain
== Physical Plan ==
*(1) Project [1 AS a#53, 2 AS a#54]
+- Scan OneRowRelation[]
I think we should only remove the attr id from the auto-generated alias name. e.g. this should be [max(val)#x]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Thanks for the review :-)
The common modification in AttributeReference will also remove the # in [max(val)#x]. I'm trying to pin point to the exact creation place of auto-generated AttributeReference name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cloud-fan I've re-implemented to only remove the auto-generated Attrids I think. Would you please help review again? Thanks so much!
|
Test build #119179 has finished for PR 27685 at commit
|
|
Test build #119178 has finished for PR 27685 at commit
|
|
Test build #119198 has finished for PR 27685 at commit
|
|
Test build #119203 has finished for PR 27685 at commit
|
|
retest this please |
|
Test build #119220 has finished for PR 27685 at commit
|
|
Test build #119547 has finished for PR 27685 at commit
|
|
cc @cloud-fan @gatorsmile @maropu @maryannxue , thanks! |
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala
Outdated
Show resolved
Hide resolved
| -- !query output | ||
| org.apache.spark.sql.AnalysisException | ||
| grouping expressions sequence is empty, and 'spark_catalog.default.test_having.`a`' is not an aggregate function. Wrap '(min(spark_catalog.default.test_having.`a`) AS `min(a#x)`, max(spark_catalog.default.test_having.`a`) AS `max(a#x)`)' in windowing function(s) or wrap 'spark_catalog.default.test_having.`a`' in first() (or first_value) if you don't care which value you get.; | ||
| grouping expressions sequence is empty, and 'spark_catalog.default.test_having.`a`' is not an aggregate function. Wrap '(min(spark_catalog.default.test_having.`a`) AS `min(a)`, max(spark_catalog.default.test_having.`a`) AS `max(a)`)' in windowing function(s) or wrap 'spark_catalog.default.test_having.`a`' in first() (or first_value) if you don't care which value you get.; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to hide ids even in error messages, too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the flatArgument/flatArgumentsString is tightly bind with toString of Expression/AggregateExpression, which will commonly affect all of them. I'll try more to eliminate the impact, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message explicitly called map(_.sql) instead of default toString. The Alias.sql is using name field which has already been formatted by flatArguments when constructing Alias. So this is also following the flatArgument framework.
As the error message is intended to suggest user with a sql snippet, maybe it's better to not includ #exprId anyway? Thanks.
| flatArguments.toSeq, "(", ", ", ")", SQLConf.get.maxToStringFields) | ||
| flatArgumentStrings.toSeq, "(", ", ", ")", SQLConf.get.maxToStringFields) | ||
|
|
||
| def argumentString: String = toString |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it possible to only add one method? I'm worried about adding to many methods to the framework.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, the argumentString is needed because AttributeReference already overwrite toString, thus we need the new abstract string function to switch to non-exprid format. For flatArgumentStrings, it only have two callers. I refactored the toAggString of AggregateFunction, then we don't need to add the method flatArgumentStrings in Expression but just implement it within toString. See commit b74c500.
|
Test build #119570 has finished for PR 27685 at commit
|
|
Test build #119573 has finished for PR 27685 at commit
|
|
Test build #119597 has finished for PR 27685 at commit
|
|
Test build #119607 has finished for PR 27685 at commit
|
|
Test build #119916 has finished for PR 27685 at commit
|
|
Test build #119928 has finished for PR 27685 at commit
|
|
@cloud-fan Would you please help review the latest change? Thanks so much :-) |
|
@cloud-fan @gatorsmile @maryannxue Would you please help review this PR? Thanks so much :-) |
|
@Eric5553 Could you resolve the conflict? |
|
Updated. Thanks so much for helping maintain the PR ! @maropu |
|
Test build #121037 has finished for PR 27685 at commit
|
|
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. |
What changes were proposed in this pull request?
When EXPLAIN sql query, the auto-generated argument alias shouldn't include expr/attribute id. This will provide better readability of Explain results. This is a follow-up to address #27368 (comment).
Before
After
Why are the changes needed?
Provide better readability for Explain result
Does this PR introduce any user-facing change?
Update Explain result to a better format
How was this patch tested?
Update existing tests