[SPARK-18871][SQL][TESTS] New test cases for IN/NOT IN subquery 4th batch #16915
get latest code from upstream
adding trim characters support
get latest code for pr12646
merge latest code
merge upstream/master
ON t1c = t2c
LEFT JOIN t3
ON t2d = t3d ) AND
t1a = "val1b")
The style looks strange. Could you adjust it?
Sure, I have adjusted the style and resubmitted. Thanks.
ok to test
Test build #72846 has finished for PR 16915 at commit
@kevinyu98 @nsyca @dilipbiswal could someone confirm that these results match DB2? I also think that this PR is almost too large.
1 10 NULL 2014-08-04
1 10 NULL 2014-09-04
1 10 NULL 2015-05-04
1 10 NULL 2014-05-04
All the results are equivalent to the ones from DB2.
struct<t1a:string,t1b:smallint,t1c:int,t1d:bigint,t1h:timestamp>
-- !query 12 output
val1b 8 16 19 2014-05-04 01:01:00
val1c 8 16 19 2014-05-04 01:02:00.001
All the results are equivalent to the ones from DB2.
-- !query 8 schema
struct<count(DISTINCT t1a):bigint,t1b:smallint,t1c:int,t1d:bigint>
-- !query 8 output
1 6 8 10
All the results are equivalent to the ones from DB2.
It's larger than the typical test PRs we submitted for the subquery JIRA, but since it's the last test PR, we wanted to avoid an additional round of administrative work.
retest this please
Test build #72952 has finished for PR 16915 at commit
LGTM. Merging to master.
@gatorsmile thanks a lot.
…atch

## What changes were proposed in this pull request?

This is the 4th batch of test cases for IN/NOT IN subqueries. This PR adds these test files:

`in-set-operations.sql`
`in-with-cte.sql`
`not-in-joins.sql`

Here are the queries and results from running on DB2.

[in-set-operations DB2 version](https://github.com/apache/spark/files/772846/in-set-operations.sql.db2.txt)
[Output of in-set-operations](https://github.com/apache/spark/files/772848/in-set-operations.sql.db2.out.txt)
[in-with-cte DB2 version](https://github.com/apache/spark/files/772849/in-with-cte.sql.db2.txt)
[Output of in-with-cte](https://github.com/apache/spark/files/772856/in-with-cte.sql.db2.out.txt)
[not-in-joins DB2 version](https://github.com/apache/spark/files/772851/not-in-joins.sql.db2.txt)
[Output of not-in-joins](https://github.com/apache/spark/files/772852/not-in-joins.sql.db2.out.txt)

## How was this patch tested?

This PR adds new test cases. We compare the results from Spark with the results from another RDBMS (we used DB2 LUW). If the results are the same, we assume the result is correct.

Author: Kevin Yu <[email protected]>

Closes apache#16915 from kevinyu98/spark-18871-44.
…ll up to Optimizer phase

## What changes were proposed in this pull request?

Currently, the Analyzer, as part of ResolveSubquery, pulls up the correlated predicates to their originating SubqueryExpression. The subquery plan is then transformed to remove the correlated predicates after they are moved up to the outer plan. In this PR, the task of pulling up correlated predicates is deferred to the Optimizer. This is the initial work that will allow us to support the forms of correlated subqueries that we don't support today. The design document from nsyca can be found at the following link: [DesignDoc](https://docs.google.com/document/d/1QDZ8JwU63RwGFS6KVF54Rjj9ZJyK33d49ZWbjFBaIgU/edit#)

A brief description of the code changes (hopefully to aid with code review) can be found at the following link: [CodeChanges](https://docs.google.com/document/d/18mqjhL9V1An-tNta7aVE13HkALRZ5GZ24AATA-Vqqf0/edit#)

## How was this patch tested?

The test case PRs were submitted earlier: [16337](#16337) [16759](#16759) [16841](#16841) [16915](#16915) [16798](#16798) [16712](#16712) [16710](#16710) [16760](#16760) [16802](#16802)

Author: Dilip Biswal <[email protected]>

Closes #16954 from dilipbiswal/SPARK-18874.
What changes were proposed in this pull request?
This is the 4th batch of test cases for IN/NOT IN subqueries. This PR adds these test files:
in-set-operations.sql
in-with-cte.sql
not-in-joins.sql
Here are the queries and results from running on DB2.
in-set-operations DB2 version
Output of in-set-operations
in-with-cte DB2 version
Output of in-with-cte
not-in-joins DB2 version
Output of not-in-joins
How was this patch tested?
This PR adds new test cases. We compare the results from Spark with the results from another RDBMS (we used DB2 LUW). If the results are the same, we assume the result is correct.
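
The comparison described above could be automated along the following lines. This is only a sketch, not the procedure the authors actually used: the helper names `normalize_row` and `compare_results` are hypothetical, and it assumes each result row is a single whitespace-separated line with no empty-string values. The two sample rows are taken from the query 12 output quoted in the review; the extra spacing and row order in `db2_out` are made up for illustration, and real DB2 output (for example, its timestamp format) would likely need more normalization than shown here.

```python
from collections import Counter


def normalize_row(line):
    """Split a result line on runs of whitespace so spacing differences are ignored."""
    return tuple(line.split())


def compare_results(spark_lines, reference_lines):
    """Compare two result sets as multisets of rows (row order is ignored).

    Returns (only_in_spark, only_in_reference) as Counters; both are empty
    when the two result sets contain the same rows the same number of times.
    """
    spark = Counter(normalize_row(l) for l in spark_lines if l.strip())
    ref = Counter(normalize_row(l) for l in reference_lines if l.strip())
    return spark - ref, ref - spark


# Rows from the "query 12 output" quoted in the review above.
spark_out = [
    "val1b 8 16 19 2014-05-04 01:01:00",
    "val1c 8 16 19 2014-05-04 01:02:00.001",
]
# Hypothetical reference output: same rows, different order and spacing.
db2_out = [
    "val1c  8  16  19  2014-05-04 01:02:00.001",
    "val1b  8  16  19  2014-05-04 01:01:00",
]

only_spark, only_ref = compare_results(spark_out, db2_out)
print("match" if not only_spark and not only_ref else "mismatch")  # prints "match"
```

Comparing multisets rather than ordered lists avoids false mismatches for queries without an ORDER BY, while still catching a row that appears a different number of times in the two outputs.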