[SPARK-29096][SQL] The exact math method should be called only when there is a corresponding function in Math #25804
localSparkSession.conf.set(SQLConf.CROSS_JOINS_ENABLED.key, true)
localSparkSession.conf.set(SQLConf.ANSI_SQL_PARSER.key, true)
localSparkSession.conf.set(SQLConf.PREFER_INTEGRAL_DIVISION.key, true)
localSparkSession.conf.set(SQLConf.FAIL_ON_INTEGRAL_TYPE_OVERFLOW.key, true)
Instead of always setting it to true, I think we should set it to both true and false in the relevant tests.
I prefer to enable it for all pgSQL tests:
- If we are going to have one flag for ANSI mode such as [SPARK-28989][SQL] Introduce ANSI SQL Dialect #25693, then we should enable all the ANSI features here.
- If we have to isolate this flag, then we need to isolate SQLConf.ANSI_SQL_PARSER as well.
This way we run all our tests in only one of the two modes (moreover, the non-default one), so we are not verifying the behavior in both modes. I don't think this is a good idea. Once we have the ANSI feature, I think we should run these tests both with ANSI enabled and with it disabled. Otherwise we would not provide proper coverage.
Makes sense. I think the cause of the issue is test coverage.
Testing both modes for pgSQL tests seems great. The total time of pgSQL tests is around 7 minutes in my local setup, and org.apache.spark.sql.SQLQueryTestSuite is executed in parallel (see SparkBuild.scala).
I have created a follow-up https://issues.apache.org/jira/browse/SPARK-29098 for this.
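The idea of covering both modes, as discussed in this thread, can be sketched without Spark at all. The following is only an illustration under stated assumptions: `BothModes` and `addInts` are hypothetical names, and the boolean parameter merely stands in for a flag such as FAIL_ON_INTEGRAL_TYPE_OVERFLOW. It is not how SQLQueryTestSuite is implemented.

```scala
import scala.util.Try

// Sketch only: `BothModes` and `addInts` are hypothetical names, not Spark code.
object BothModes {
  // Toy stand-in for a query whose overflow behavior depends on a boolean flag.
  def addInts(a: Int, b: Int, failOnOverflow: Boolean): Int =
    if (failOnOverflow) java.lang.Math.addExact(a, b) // throws ArithmeticException on overflow
    else a + b                                        // wraps around silently

  // Evaluate the same input once per flag value so both modes are exercised.
  val results: Map[Boolean, Try[Int]] =
    Seq(true, false).map(flag => flag -> Try(addInts(Int.MaxValue, 1, flag))).toMap
}
```

With the flag on, the entry for `true` holds a failure; with the flag off, the entry for `false` holds the wrapped value `Int.MinValue`. Running only one of the two would leave the other behavior unchecked, which is the coverage concern raised above.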
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
Test build #110630 has finished for PR 25804 at commit
localSparkSession.conf.set(SQLConf.FAIL_ON_INTEGRAL_TYPE_OVERFLOW.key, true)
// Propagate the SQL conf FAIL_ON_INTEGRAL_TYPE_OVERFLOW to executor.
// TODO: remove this after SPARK-29122 is resolved.
localSparkSession.sparkContext.setLocalProperty(
Until https://issues.apache.org/jira/browse/SPARK-29122 is resolved, let's propagate the conf FAIL_ON_INTEGRAL_TYPE_OVERFLOW to the executor.
The issue affects tests only.
Test build #110794 has started for PR 25804 at commit
test this please
Test build #110804 has finished for PR 25804 at commit
HyukjinKwon left a comment
Took a quick look and seems fine to me.
LGTM. I think for pgsql tests, we should always set dialect to pgsql (once we have that config). For other golden file tests, we should test with ansi mode on and off. We can do it later.
thanks, merging to master!
What changes were proposed in this pull request?
However, only Add/Subtract/Multiply have a corresponding exact function in java.lang.Math. When the option "spark.sql.failOnIntegralTypeOverflow" is enabled, a runtime exception "BinaryArithmetics must override either exactMathMethod or genCode" is thrown if other binary arithmetic operators, such as "Divide" or "Remainder", are used.
The exact math method should be called only when there is a corresponding function in java.lang.Math.
- Int/Short
- Enable spark.sql.failOnIntegralTypeOverflow for pgSQL tests in SQLQueryTestSuite.
Why are the changes needed?
- spark.sql.failOnIntegralTypeOverflow.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Unit test.
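To make the rule from the description concrete, here is a minimal sketch of "use the exact method only when one exists". It is not the code from this PR: `ExactOps`, `exactInt`, `plainInt`, and `eval` are hypothetical names, and operators are keyed by plain strings for brevity. The only library calls used are `java.lang.Math.addExact`, `subtractExact`, and `multiplyExact`.

```scala
// Sketch only: `ExactOps` and its members are hypothetical, not Spark code.
object ExactOps {
  // Operators that, per the PR description, have an exact variant in java.lang.Math.
  private val exactInt: Map[String, (Int, Int) => Int] = Map(
    "Add"      -> ((a: Int, b: Int) => java.lang.Math.addExact(a, b)),
    "Subtract" -> ((a: Int, b: Int) => java.lang.Math.subtractExact(a, b)),
    "Multiply" -> ((a: Int, b: Int) => java.lang.Math.multiplyExact(a, b))
  )

  // Plain implementations for every operator; integer overflow wraps silently.
  private val plainInt: Map[String, (Int, Int) => Int] = Map(
    "Add"       -> ((a: Int, b: Int) => a + b),
    "Subtract"  -> ((a: Int, b: Int) => a - b),
    "Multiply"  -> ((a: Int, b: Int) => a * b),
    "Divide"    -> ((a: Int, b: Int) => a / b),
    "Remainder" -> ((a: Int, b: Int) => a % b)
  )

  // Use the exact variant only when the flag is on AND such a variant exists;
  // otherwise fall back to the plain operator instead of failing.
  def eval(op: String, a: Int, b: Int, failOnOverflow: Boolean): Int =
    if (failOnOverflow && exactInt.contains(op)) exactInt(op)(a, b)
    else plainInt(op)(a, b)
}
```

Under these assumptions, `ExactOps.eval("Add", Int.MaxValue, 1, failOnOverflow = true)` throws an `ArithmeticException`, while `ExactOps.eval("Divide", 7, 2, failOnOverflow = true)` simply evaluates the plain division rather than failing for lack of an exact method.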