[SPARK-18871][SQL][TESTS] New test cases for IN/NOT IN subquery 4th batch #16915

kevinyu98 · 2017-02-13T19:07:56Z

What changes were proposed in this pull request?

This is the 4th batch of test cases for IN/NOT IN subqueries. This PR adds the following test files:

in-set-operations.sql
in-with-cte.sql
not-in-joins.sql
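
For readers unfamiliar with why NOT IN subqueries get their own test files, here is a minimal sketch of the NULL behavior such tests typically have to pin down. It is only an illustration: it uses Python's built-in `sqlite3` as a stand-in engine (not Spark or DB2), and the tables `t1`/`t2` and their rows are hypothetical, not taken from the test files in this PR.

```python
import sqlite3

# In-memory database; tables and rows are hypothetical illustration data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (t1a TEXT, t1b INTEGER);
CREATE TABLE t2 (t2a TEXT, t2b INTEGER);
INSERT INTO t1 VALUES ('val1a', 6), ('val1b', 8), ('val1c', NULL);
INSERT INTO t2 VALUES ('val2a', 6), ('val2b', NULL);
""")

def run(sql):
    return conn.execute(sql).fetchall()

# IN: only the row whose value actually matches is returned.
in_rows = run(
    "SELECT t1a FROM t1 WHERE t1b IN (SELECT t2b FROM t2) ORDER BY t1a")

# NOT IN: a NULL in the subquery result makes the predicate UNKNOWN for
# every non-matching row, so no row qualifies.
not_in_rows = run(
    "SELECT t1a FROM t1 WHERE t1b NOT IN (SELECT t2b FROM t2) ORDER BY t1a")

# NOT IN with NULLs filtered out of the subquery: the non-matching,
# non-NULL outer row is returned.
not_in_no_nulls = run(
    "SELECT t1a FROM t1 "
    "WHERE t1b NOT IN (SELECT t2b FROM t2 WHERE t2b IS NOT NULL) "
    "ORDER BY t1a")

print(in_rows)          # [('val1a',)]
print(not_in_rows)      # []
print(not_in_no_nulls)  # [('val1b',)]
```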

Here are the queries and results from running on DB2.

in-set-operations DB2 version
Output of in-set-operations
in-with-cte DB2 version
Output of in-with-cte
not-in-joins DB2 version
Output of not-in-joins

How was this patch tested?

This PR adds new test cases. We compare the results from Spark with the results from another RDBMS (we used DB2 LUW). If the results are the same, we assume the result is correct.
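
The comparison described above can be sketched roughly as follows. This is not the tooling used for this PR (the description does not say how the comparison was done); it is a hypothetical helper that treats two result sets as multisets, so row order is ignored but duplicate rows still count. The sample rows are illustrative only.

```python
from collections import Counter

def normalize_row(row):
    # Render None as the literal string NULL and everything else via str(),
    # so rows from different drivers can be compared as plain tuples of text.
    return tuple("NULL" if v is None else str(v) for v in row)

def same_result(rows_a, rows_b):
    # Multiset comparison: order-insensitive, but duplicate counts must match.
    return Counter(map(normalize_row, rows_a)) == Counter(map(normalize_row, rows_b))

# Illustrative rows only; the same two rows in a different order.
spark_rows = [(1, 10, None, "2014-08-04"), (1, 10, None, "2014-09-04")]
db2_rows = [(1, 10, None, "2014-09-04"), (1, 10, None, "2014-08-04")]

print(same_result(spark_rows, db2_rows))  # True
```

A multiset comparison is a deliberate choice here: queries without an ORDER BY may return rows in any order, while a differing number of duplicate rows is a real mismatch.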


gatorsmile · 2017-02-13T21:41:44Z

sql/core/src/test/resources/sql-tests/results/subquery/in-subquery/in-with-cte.sql.out

+                         ON         t1c = t2c
+                         LEFT JOIN  t3
+                         ON         t2d = t3d ) AND
+              t1a = "val1b")


The style looks strange. Could you adjust it?

Sure, I have adjusted the style and resubmitted. Thanks.

gatorsmile · 2017-02-14T03:37:32Z

ok to test

SparkQA · 2017-02-14T05:52:08Z

Test build #72846 has finished for PR 16915 at commit 3dd57fd.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

hvanhovell · 2017-02-15T20:27:38Z

@kevinyu98 @nsyca @dilipbiswal could someone confirm that these results match DB2?

I also think that this PR is almost too large.

nsyca · 2017-02-15T20:25:06Z

sql/core/src/test/resources/sql-tests/results/subquery/in-subquery/in-set-operations.sql.out

+1	10	NULL	2014-08-04
+1	10	NULL	2014-09-04
+1	10	NULL	2015-05-04
+1	10	NULL	2014-05-04


All the results are equivalent with the ones from DB2.

nsyca · 2017-02-15T20:27:30Z

sql/core/src/test/resources/sql-tests/results/subquery/in-subquery/in-with-cte.sql.out

+struct<t1a:string,t1b:smallint,t1c:int,t1d:bigint,t1h:timestamp>
+-- !query 12 output
+val1b	8	16	19	2014-05-04 01:01:00
+val1c	8	16	19	2014-05-04 01:02:00.001


All the results are equivalent with the ones from DB2.

nsyca · 2017-02-15T20:29:45Z

sql/core/src/test/resources/sql-tests/results/subquery/in-subquery/not-in-joins.sql.out

+-- !query 8 schema
+struct<count(DISTINCT t1a):bigint,t1b:smallint,t1c:int,t1d:bigint>
+-- !query 8 output
+1	6	8	10


All the results are equivalent with the ones from DB2.

nsyca · 2017-02-15T20:32:45Z

It's larger than the typical test PRs we submitted for the subquery JIRA, but since it's the last test PR, we wanted to avoid an additional round of administrative work.

gatorsmile · 2017-02-15T20:52:43Z

retest this please

SparkQA · 2017-02-15T23:20:56Z

Test build #72952 has finished for PR 16915 at commit 3dd57fd.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

gatorsmile · 2017-02-16T05:29:42Z

LGTM. Merging to master.

kevinyu98 · 2017-02-16T06:18:57Z

@gatorsmile thanks a lot.

…atch

## What changes were proposed in this pull request?

This is the 4th batch of test cases for IN/NOT IN subqueries. This PR adds the following test files:

`in-set-operations.sql`
`in-with-cte.sql`
`not-in-joins.sql`

Here are the queries and results from running on DB2.

[in-set-operations DB2 version](https://github.com/apache/spark/files/772846/in-set-operations.sql.db2.txt)
[Output of in-set-operations](https://github.com/apache/spark/files/772848/in-set-operations.sql.db2.out.txt)
[in-with-cte DB2 version](https://github.com/apache/spark/files/772849/in-with-cte.sql.db2.txt)
[Output of in-with-cte](https://github.com/apache/spark/files/772856/in-with-cte.sql.db2.out.txt)
[not-in-joins DB2 version](https://github.com/apache/spark/files/772851/not-in-joins.sql.db2.txt)
[Output of not-in-joins](https://github.com/apache/spark/files/772852/not-in-joins.sql.db2.out.txt)

## How was this patch tested?

This PR adds new test cases. We compare the results from Spark with the results from another RDBMS (we used DB2 LUW). If the results are the same, we assume the result is correct.

Author: Kevin Yu <[email protected]>

Closes apache#16915 from kevinyu98/spark-18871-44.

…ll up to Optimizer phase

## What changes were proposed in this pull request?

Currently the Analyzer, as part of ResolveSubquery, pulls up the correlated predicates to the originating SubqueryExpression. The subquery plan is then transformed to remove the correlated predicates after they are moved up to the outer plan. In this PR, the task of pulling up correlated predicates is deferred to the Optimizer. This is the initial work that will allow us to support the forms of correlated subqueries that we don't support today. The design document from nsyca can be found at the following link: [DesignDoc](https://docs.google.com/document/d/1QDZ8JwU63RwGFS6KVF54Rjj9ZJyK33d49ZWbjFBaIgU/edit#)

A brief description of the code changes (hopefully to aid with code review) can be found at the following link: [CodeChanges](https://docs.google.com/document/d/18mqjhL9V1An-tNta7aVE13HkALRZ5GZ24AATA-Vqqf0/edit#)

## How was this patch tested?

The test case PRs were submitted earlier: [16337](#16337) [16759](#16759) [16841](#16841) [16915](#16915) [16798](#16798) [16712](#16712) [16710](#16710) [16760](#16760) [16802](#16802)

Author: Dilip Biswal <[email protected]>

Closes #16954 from dilipbiswal/SPARK-18874.
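
To make "pulling up a correlated predicate" concrete, here is a small sketch of the equivalence such a rewrite relies on. It is not Spark's Analyzer or Optimizer code. It uses Python's built-in `sqlite3` as a stand-in engine, with hypothetical tables and data, and emulates the pulled-up form as a join against a de-duplicated, uncorrelated subquery (which matches a semi join here because the outer rows are unique).

```python
import sqlite3

# Hypothetical tables and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (t1a INTEGER, t1b INTEGER);
CREATE TABLE t2 (t2a INTEGER, t2b INTEGER);
INSERT INTO t1 VALUES (1, 10), (2, 20), (3, 30);
INSERT INTO t2 VALUES (1, 10), (2, 99), (3, 30), (3, 30);
""")

# Original form: the predicate t2b = t1b inside the subquery refers to
# the outer table, so the subquery is correlated.
correlated = conn.execute("""
    SELECT t1a FROM t1
    WHERE t1a IN (SELECT t2a FROM t2 WHERE t2b = t1b)
    ORDER BY t1a
""").fetchall()

# Pulled-up form: the subquery no longer refers to t1; the former
# correlated predicate becomes part of the join condition in the outer plan.
# DISTINCT prevents duplicate t2 rows from multiplying the output.
pulled_up = conn.execute("""
    SELECT t1a FROM t1
    JOIN (SELECT DISTINCT t2a, t2b FROM t2) s
      ON t1a = s.t2a AND t1b = s.t2b
    ORDER BY t1a
""").fetchall()

print(correlated)  # [(1,), (3,)]
print(pulled_up)   # [(1,), (3,)]
```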

kevinyu98 added 30 commits April 20, 2016 11:06

adding testcase

3b44c59

Merge remote-tracking branch 'upstream/master'

18b4a31

Merge remote-tracking branch 'upstream/master'

4f4d1c8

get latest code from upstream

Merge remote-tracking branch 'upstream/master'

f5f0cbe

adding trim characters support

Merge remote-tracking branch 'upstream/master'

d8b2edb

get latest code for pr12646

Merge remote-tracking branch 'upstream/master'

196b6c6

merge latest code

Merge remote-tracking branch 'upstream/master'

f37a01e

merge upstream/master

Merge remote-tracking branch 'upstream/master'

bb5b01f

Merge remote-tracking branch 'upstream/master'

bde5820

Merge remote-tracking branch 'upstream/master'

5f7cd96

Merge remote-tracking branch 'upstream/master'

893a49a

Merge remote-tracking branch 'upstream/master'

4bbe1fd

Merge remote-tracking branch 'upstream/master'

b2dd795

Merge remote-tracking branch 'upstream/master'

8c3e5da

Merge remote-tracking branch 'upstream/master'

a0eaa40

Merge remote-tracking branch 'upstream/master'

d03c940

Merge remote-tracking branch 'upstream/master'

d728d5e

Merge remote-tracking branch 'upstream/master'

ea104dd

Merge remote-tracking branch 'upstream/master'

6ab1215

Merge remote-tracking branch 'upstream/master'

0c56653

Merge remote-tracking branch 'upstream/master'

d7a1874

Merge remote-tracking branch 'upstream/master'

85d3500

Merge remote-tracking branch 'upstream/master'

c056f91

Merge remote-tracking branch 'upstream/master'

0b8189d

Merge remote-tracking branch 'upstream/master'

c2ea31d

Merge remote-tracking branch 'upstream/master'

a2d3056

Merge remote-tracking branch 'upstream/master'

39e5648

Merge remote-tracking branch 'upstream/master'

b9370a3

Merge remote-tracking branch 'upstream/master'

01224a4

Merge remote-tracking branch 'upstream/master'

d05d39a

kevinyu98 added 15 commits September 1, 2016 11:03

Merge remote-tracking branch 'upstream/master'

2e399d9

Merge remote-tracking branch 'upstream/master'

0ef59bc

Merge remote-tracking branch 'upstream/master'

6fad85f

Merge remote-tracking branch 'upstream/master'

5525dff

Merge remote-tracking branch 'upstream/master'

63715e4

Merge remote-tracking branch 'upstream/master'

a084410

Merge remote-tracking branch 'upstream/master'

b6e5b97

Merge remote-tracking branch 'upstream/master'

bdd5423

Merge remote-tracking branch 'upstream/master'

6638336

Merge remote-tracking branch 'upstream/master'

f89863e

Merge remote-tracking branch 'upstream/master'

b48993c

Merge remote-tracking branch 'upstream/master'

9324514

Merge remote-tracking branch 'upstream/master'

e2b78b8

add test cases

5eb69cb

generate result files

ebfb22d

gatorsmile reviewed Feb 13, 2017


adjust test case style

3dd57fd

nsyca reviewed Feb 15, 2017


asfgit closed this in 8487902 Feb 16, 2017

dilipbiswal mentioned this pull request Feb 16, 2017

[SPARK-18874][SQL] First phase: Deferring the correlated predicate pull up to Optimizer phase #16954

Closed
