Skip to content

Conversation

@dilipbiswal
Copy link
Contributor

@dilipbiswal dilipbiswal commented Jan 31, 2017

What changes were proposed in this pull request?

This PR adds the second set of tests for EXISTS subquery.

File name Brief description
exists-aggregate.sql Tests aggregate expressions in outer query and EXISTS subquery.
exists-having.sql Tests HAVING clause in subquery.
exists-orderby-limit.sql Tests EXISTS subquery support with ORDER BY and LIMIT clauses.

DB2 results are attached here as reference :

exists-aggregate-db2.txt
exists-having-db2.txt
exists-orderby-limit-db2.txt

How the patch was tested.

The test result is compared with the result run from another SQL engine (in this case is IBM DB2). If the result are equivalent, we assume the result is correct.

struct<emp_name:string,bonus_amt:double>
-- !query 10 output
emp 5 1000.0
emp 6 - no dept 500.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have compared the result set matched with the result from DB2.

-- !query 7 schema
struct<dept_id:int,dept_name:string,state:string>
-- !query 7 output
10 dept 1 CA
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have compared the result set matched with the result from DB2.

500 emp 5 2001-01-01 400.0 NULL
600 emp 6 - no dept 2001-01-01 400.0 100
700 emp 7 2010-01-01 400.0 100
800 emp 8 2016-01-01 150.0 70
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have compared the result set matched with the result from DB2.

@SparkQA
Copy link

SparkQA commented Feb 1, 2017

Test build #72208 has finished for PR 16760 at commit 2473e0c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dilipbiswal
Copy link
Contributor Author

cc @hvanhovell @gatorsmile @nsyca
@nsyca Thanks for reviewing.

Copy link
Member

@gatorsmile gatorsmile left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did the comparison. The results match what DB2 returned. Do you think we can remove some duplicate test cases in SubquerySuite.scala?

@dilipbiswal
Copy link
Contributor Author

@gatorsmile Thanks a lot for reviewing. @nsyca and i had a brief discussion about this. Yeah, we will remove the duplicate test cases from SubquerySuite.scala. However , we wanted to keep all the tests till the phase1 work is committed to ensure the full coverage i.e we don't inadvertently remove any tests for which we may have missed to add a corresponding test in sqltestquerysuite.

@gatorsmile
Copy link
Member

@dilipbiswal Does that mean your test cases cover all the EXISTS scenarios in SubquerySuite.scala?

@dilipbiswal
Copy link
Contributor Author

dilipbiswal commented Feb 6, 2017

@gatorsmile Currently for exists we are missing 1) Negative scenarios 2) Generators (there is one test case in subquerysuite for it)

@gatorsmile
Copy link
Member

@dilipbiswal Are you planning to submit another PR for Generators or do it in this PR?

@dilipbiswal
Copy link
Contributor Author

@gatorsmile Yeah Sean. Actually most likely i will need to work out a different schema than what i have currently for the generator tests. So i was planning to add the negative scenarios and generator tests in another PR.

@gatorsmile
Copy link
Member

ok. LGTM

@gatorsmile
Copy link
Member

retest this please

@SparkQA
Copy link

SparkQA commented Feb 8, 2017

Test build #72558 has finished for PR 16760 at commit 2473e0c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dilipbiswal
Copy link
Contributor Author

retest this please

@SparkQA
Copy link

SparkQA commented Feb 8, 2017

Test build #72572 has finished for PR 16760 at commit 2473e0c.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@nsyca
Copy link
Contributor

nsyca commented Feb 8, 2017

retest this please

@SparkQA
Copy link

SparkQA commented Feb 8, 2017

Test build #72585 has finished for PR 16760 at commit 2473e0c.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dilipbiswal
Copy link
Contributor Author

retest this please

@SparkQA
Copy link

SparkQA commented Feb 8, 2017

Test build #72602 has finished for PR 16760 at commit 2473e0c.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dilipbiswal
Copy link
Contributor Author

retest this please

@SparkQA
Copy link

SparkQA commented Feb 9, 2017

Test build #72621 has finished for PR 16760 at commit 2473e0c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Copy link
Member

Thanks! Merging to master.

@asfgit asfgit closed this in 64cae22 Feb 9, 2017
@dilipbiswal
Copy link
Contributor Author

Many thanks @gatorsmile

cmonkey pushed a commit to cmonkey/spark that referenced this pull request Feb 15, 2017
…te, Having, Orderby, Limit)

## What changes were proposed in this pull request?
This PR adds the second set of tests for EXISTS subquery.

File name                        | Brief description
------------------------| -----------------
exists-aggregate.sql              |Tests aggregate expressions in outer query and EXISTS subquery.
exists-having.sql|Tests HAVING clause in subquery.
exists-orderby-limit.sql|Tests EXISTS subquery support with ORDER BY and LIMIT clauses.

DB2 results are attached here as reference :

[exists-aggregate-db2.txt](https://github.com/apache/spark/files/743287/exists-aggregate-db2.txt)
[exists-having-db2.txt](https://github.com/apache/spark/files/743286/exists-having-db2.txt)
[exists-orderby-limit-db2.txt](https://github.com/apache/spark/files/743288/exists-orderby-limit-db2.txt)

##  How the patch was tested.
The test result is compared with the result run from another SQL engine (in this case is IBM DB2). If the result are equivalent, we assume the result is correct.

Author: Dilip Biswal <[email protected]>

Closes apache#16760 from dilipbiswal/exists-pr2.
asfgit pushed a commit that referenced this pull request Mar 14, 2017
…ll up to Optimizer phase

## What changes were proposed in this pull request?
Currently Analyzer as part of ResolveSubquery, pulls up the correlated predicates to its
originating SubqueryExpression. The subquery plan is then transformed to remove the correlated
predicates after they are moved up to the outer plan. In this PR, the task of pulling up
correlated predicates is deferred to Optimizer. This is the initial work that will allow us to
support the form of correlated subqueries that we don't support today. The design document
from nsyca can be found in the following link :
[DesignDoc](https://docs.google.com/document/d/1QDZ8JwU63RwGFS6KVF54Rjj9ZJyK33d49ZWbjFBaIgU/edit#)

The brief description of code changes (hopefully to aid with code review) can be be found in the
following link:
[CodeChanges](https://docs.google.com/document/d/18mqjhL9V1An-tNta7aVE13HkALRZ5GZ24AATA-Vqqf0/edit#)

## How was this patch tested?
The test case PRs were submitted earlier using.
[16337](#16337) [16759](#16759) [16841](#16841) [16915](#16915) [16798](#16798) [16712](#16712) [16710](#16710) [16760](#16760) [16802](#16802)

Author: Dilip Biswal <[email protected]>

Closes #16954 from dilipbiswal/SPARK-18874.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants