-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-18872][SQL][TESTS] New test cases for EXISTS subquery (Aggregate, Having, Orderby, Limit) #16760
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…te , Having, Orderby, Limit)
| struct<emp_name:string,bonus_amt:double> | ||
| -- !query 10 output | ||
| emp 5 1000.0 | ||
| emp 6 - no dept 500.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have compared the result set matched with the result from DB2.
| -- !query 7 schema | ||
| struct<dept_id:int,dept_name:string,state:string> | ||
| -- !query 7 output | ||
| 10 dept 1 CA |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have compared the result set matched with the result from DB2.
| 500 emp 5 2001-01-01 400.0 NULL | ||
| 600 emp 6 - no dept 2001-01-01 400.0 100 | ||
| 700 emp 7 2010-01-01 400.0 100 | ||
| 800 emp 8 2016-01-01 150.0 70 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have compared the result set matched with the result from DB2.
|
Test build #72208 has finished for PR 16760 at commit
|
|
cc @hvanhovell @gatorsmile @nsyca |
gatorsmile
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did the comparison. The results match what DB2 returned. Do you think we can remove some duplicate test cases in SubquerySuite.scala?
|
@gatorsmile Thanks a lot for reviewing. @nsyca and i had a brief discussion about this. Yeah, we will remove the duplicate test cases from SubquerySuite.scala. However , we wanted to keep all the tests till the phase1 work is committed to ensure the full coverage i.e we don't inadvertently remove any tests for which we may have missed to add a corresponding test in sqltestquerysuite. |
|
@dilipbiswal Does that mean your test cases cover all the EXISTS scenarios in |
|
@gatorsmile Currently for exists we are missing 1) Negative scenarios 2) Generators (there is one test case in subquerysuite for it) |
|
@dilipbiswal Are you planning to submit another PR for |
|
@gatorsmile Yeah Sean. Actually most likely i will need to work out a different schema than what i have currently for the generator tests. So i was planning to add the negative scenarios and generator tests in another PR. |
|
ok. LGTM |
|
retest this please |
|
Test build #72558 has finished for PR 16760 at commit
|
|
retest this please |
|
Test build #72572 has finished for PR 16760 at commit
|
|
retest this please |
|
Test build #72585 has finished for PR 16760 at commit
|
|
retest this please |
|
Test build #72602 has finished for PR 16760 at commit
|
|
retest this please |
|
Test build #72621 has finished for PR 16760 at commit
|
|
Thanks! Merging to master. |
|
Many thanks @gatorsmile |
…te, Having, Orderby, Limit) ## What changes were proposed in this pull request? This PR adds the second set of tests for EXISTS subquery. File name | Brief description ------------------------| ----------------- exists-aggregate.sql |Tests aggregate expressions in outer query and EXISTS subquery. exists-having.sql|Tests HAVING clause in subquery. exists-orderby-limit.sql|Tests EXISTS subquery support with ORDER BY and LIMIT clauses. DB2 results are attached here as reference : [exists-aggregate-db2.txt](https://github.com/apache/spark/files/743287/exists-aggregate-db2.txt) [exists-having-db2.txt](https://github.com/apache/spark/files/743286/exists-having-db2.txt) [exists-orderby-limit-db2.txt](https://github.com/apache/spark/files/743288/exists-orderby-limit-db2.txt) ## How the patch was tested. The test result is compared with the result run from another SQL engine (in this case is IBM DB2). If the result are equivalent, we assume the result is correct. Author: Dilip Biswal <[email protected]> Closes apache#16760 from dilipbiswal/exists-pr2.
…ll up to Optimizer phase ## What changes were proposed in this pull request? Currently Analyzer as part of ResolveSubquery, pulls up the correlated predicates to its originating SubqueryExpression. The subquery plan is then transformed to remove the correlated predicates after they are moved up to the outer plan. In this PR, the task of pulling up correlated predicates is deferred to Optimizer. This is the initial work that will allow us to support the form of correlated subqueries that we don't support today. The design document from nsyca can be found in the following link : [DesignDoc](https://docs.google.com/document/d/1QDZ8JwU63RwGFS6KVF54Rjj9ZJyK33d49ZWbjFBaIgU/edit#) The brief description of code changes (hopefully to aid with code review) can be be found in the following link: [CodeChanges](https://docs.google.com/document/d/18mqjhL9V1An-tNta7aVE13HkALRZ5GZ24AATA-Vqqf0/edit#) ## How was this patch tested? The test case PRs were submitted earlier using. [16337](#16337) [16759](#16759) [16841](#16841) [16915](#16915) [16798](#16798) [16712](#16712) [16710](#16710) [16760](#16760) [16802](#16802) Author: Dilip Biswal <[email protected]> Closes #16954 from dilipbiswal/SPARK-18874.
What changes were proposed in this pull request?
This PR adds the second set of tests for EXISTS subquery.
DB2 results are attached here as reference :
exists-aggregate-db2.txt
exists-having-db2.txt
exists-orderby-limit-db2.txt
How the patch was tested.
The test result is compared with the result run from another SQL engine (in this case is IBM DB2). If the result are equivalent, we assume the result is correct.