-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-49695][SQL] Postgres fix xor push-down #48144
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Can we add some tests? |
MaxGekk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice if you add an integration test, for instance here *PostgresExpressionPushdownSuite*
|
I wanted to add tests, but didn't know where to put them... Don't know how oss tests postgre. |
|
dongjoon-hyun
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, @andrej-db . +1 for adding a test case definitely.
cc @huaxingao , too.
|
Added the test, let me know if this is in order. |
| testDatetime(s"$catalogAndNamespace.${caseConvert("datetime")}") | ||
| } | ||
|
|
||
| test("xor operator push-down") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you can do explain formatted and check whether string contains "id" # 3, that will add a little bit of robustness.
Another thing we can do is to make unit test, and just invoke compilation of XOR expression and check whether col # constant is result of compilation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
neat, I like it
urosstan-db
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving latest iteration (with tests)
PostgresIntegrationSuite: add test
MaxGekk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, fix your test:
[info] - SPARK-49695: Postgres fix xor push-down *** FAILED *** (10 milliseconds)
[info] org.apache.spark.sql.catalyst.ExtendedAnalysisException: [TABLE_OR_VIEW_NOT_FOUND] The table or view `bar` cannot be found. Verify the spelling and correctness of the schema and catalog.
|
|
||
| override def compileExpression(expr: Expression): Option[String] = { | ||
| val builder = new PostgresSQLBuilder() | ||
| try { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps its better to only override visitBinaryArithmetics?
We have similar problem with some of the functions and we use dialectFunctionName to translate from Spark to local dialect
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 here, let's override last possible method in chain of execution
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
milastdbx
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider refactoring this
...integration-tests/src/test/scala/org/apache/spark/sql/jdbc/v2/PostgresIntegrationSuite.scala
Outdated
Show resolved
Hide resolved
urosstan-db
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but please make test more assertive.
MaxGekk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andrej-db Could you fix builds and retrigger intergration tests:
[error] /home/runner/work/spark/spark/connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/v2/PostgresIntegrationSuite.scala:237:25: not found: type Filter
[error] plan.isInstanceOf[Filter]
[error] ^
[error] one error found
| assert(rows.length == 1) | ||
| assert(rows(0).getInt(0) === 6) | ||
| assert(rows(0).getString(1) === "jen") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you use checkAnswer, please.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file has mainly these asserts, wanted to make it in line with other tests
...integration-tests/src/test/scala/org/apache/spark/sql/jdbc/v2/PostgresIntegrationSuite.scala
Outdated
Show resolved
Hide resolved
…park/sql/jdbc/v2/PostgresIntegrationSuite.scala Co-authored-by: Maxim Gekk <[email protected]>
|
+1, LGTM. Merging to master/3.5/3.4. |
|
@andrej-db Could you open separate PRs for 3.4 and 3.5 because your changes cause conflicts in the branches. |
|
To @MaxGekk and @andrej-db , Apache Spark 3.4 reached the End-Of-Support. Only Also, cc @LuciferYang since he is the release manager for Apache Spark 3.5.4. |
|
Thanks @dongjoon-hyun |
### What changes were proposed in this pull request? Backport of the #48144 This PR fixes the pushdown of ^ operator (XOR operator) for Postgres. Those two databases use this as exponent, rather then bitwise xor. Fix is consisted of overriding the SQLExpressionBuilder to replace the '^' character with '#'. ### Why are the changes needed? Result is incorrect. ### Does this PR introduce _any_ user-facing change? Yes. The user will now have a proper translation of the ^ operator. ### How was this patch tested? ### Was this patch authored or co-authored using generative AI tooling? No. Closes #49071 from andrej-db/PGXORBackport. Lead-authored-by: Andrej Gobeljić <[email protected]> Co-authored-by: andrej-gobeljic_data <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]>
What changes were proposed in this pull request?
This PR fixes the pushdown of ^ operator (XOR operator) for Postgres. Those two databases use this as exponent, rather then bitwise xor.
Fix is consisted of overriding the SQLExpressionBuilder to replace the '^' character with '#'.
Why are the changes needed?
Result is incorrect.
Does this PR introduce any user-facing change?
Yes. The user will now have a proper translation of the ^ operator.
How was this patch tested?
Was this patch authored or co-authored using generative AI tooling?
No.