-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-34086][SQL] RaiseError generates too much code and may fails codegen in length check for char varchar #31150
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…odegen in length check for char varchar
|
cc @cloud-fan @maropu @HyukjinKwon @gatorsmile @dongjoon-hyun, thanks for reviewing |
| override def foldable: Boolean = false | ||
| override def nullable: Boolean = true | ||
| override def dataType: DataType = NullType | ||
| override def dataType: DataType = returnType |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: the constructor parameter can be dataType directly, then we don't need this override.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sgtm, updated
|
Kubernetes integration test starting |
cloud-fan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense to me.
|
Kubernetes integration test status success |
|
Kubernetes integration test starting |
|
Kubernetes integration test status success |
|
Test build #133968 has finished for PR 31150 at commit
|
| since = "3.1.0", | ||
| group = "misc_funcs") | ||
| case class RaiseError(child: Expression) extends UnaryExpression with ImplicitCastInputTypes { | ||
| case class RaiseError private[spark] (child: Expression, dataType: DataType) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
catalyst is already a private module. We can just remove private[spark].
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
|
Test build #133970 has finished for PR 31150 at commit
|
|
Kubernetes integration test starting |
|
Kubernetes integration test status success |
|
Test build #133990 has finished for PR 31150 at commit
|
|
thanks, merging to master! |
|
it conflicts with 3.1, @yaooqinn can you send a new PR? |
|
w/ pleasure~ |
…ils codegen in length check for char varchar A backport for #31150 to branch 3.1 ### What changes were proposed in this pull request? https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133928/testReport/org.apache.spark.sql.execution/LogicalPlanTagInSparkPlanSuite/q41/ We can reduce more than 8000 bytes by removing the unnecessary CONCAT expression. W/ this fix, for q41 in TPCDS with [Using TPCDS original definitions for char/varchar columns](#31012) applied, we can reduce the stage code-gen size from 22523 to 14369 ``` 14369 - 22523 = - 8154 ``` ### Why are the changes needed? fix the perf regression(we need other improvements for q41 works), there will be a huge performance regression if codegen fails ### Does this PR introduce _any_ user-facing change? no ### How was this patch tested? modified uts Closes #31168 from yaooqinn/SPARK-34086-31. Authored-by: Kent Yao <[email protected]> Signed-off-by: Wenchen Fan <[email protected]>
What changes were proposed in this pull request?
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133928/testReport/org.apache.spark.sql.execution/LogicalPlanTagInSparkPlanSuite/q41/
We can reduce more than 8000 bytes by removing the unnecessary CONCAT expression.
W/ this fix, for q41 in TPCDS with Using TPCDS original definitions for char/varchar columns applied, we can reduce the stage code-gen size from 22523 to 14369
Why are the changes needed?
fix the perf regression(we need other improvements for q41 works), there will be a huge performance regression if codegen fails
Does this PR introduce any user-facing change?
no
How was this patch tested?
modified uts