Skip to content

Conversation

@mauropalsgraaf
Copy link
Contributor

What changes were proposed in this pull request?

It proposes a version in which nullable expressions are not valid in the limit clause

How was this patch tested?

It was tested with unit and e2e tests.

Please review http://spark.apache.org/contributing.html before opening a pull request.

@hvanhovell
Copy link
Contributor

Ok to test

Copy link
Contributor

@hvanhovell hvanhovell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@NiharS
Copy link
Contributor

NiharS commented Jul 18, 2018

New to SQL, but it seems like the query

SELECT 1 LIMIT CAST('1' AS INT)

should work, right? I tried both on Spark without to your change and the W3Schools SQL tester and it's accepted in both, but I tried on your PR and it hits the new AnalysisException.

If this is indeed an issue, it can be avoided by having the case be e.eval() == null instead of e.nullable, although there's some duplicated work since the previous case also calls e.eval()

@hvanhovell
Copy link
Contributor

@NiharS yeah that makes sense. @mauropalsgraaf we missed this today (sorry about that). Can you add the null check (bonus points if you call eval() only once), add a test for this case?

@NiharS
Copy link
Contributor

NiharS commented Jul 25, 2018

Hey @mauropalsgraaf just wanted to check in on this. Have you run into any additional issues or have any questions for this fix?

@mauropalsgraaf
Copy link
Contributor Author

Sorry for my absence, will look into it now

@mauropalsgraaf
Copy link
Contributor Author

mauropalsgraaf commented Jul 28, 2018

@hvanhovell @NiharS I've added a test case and now it works properly for the given example in the description. Regarding the comment: "make sure eval is only called once", is this what you ment?

@HyukjinKwon
Copy link
Member

ok to test

case e if e.dataType != IntegerType => failAnalysis(
s"The limit expression must be integer type, but got " +
e.dataType.catalogString)
case e if e.eval().asInstanceOf[Int] < 0 => failAnalysis(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about this:

case e => e.eval() match {
  case null => failAnalysis(
    s"The evaluated limit expression must not be null, but got ${limitExpr.sql}")
  case v: Int if v < 0 => failAnalysis(
    s"The limit expression must be equal to or greater than 0, but got $v")
  case _ => // OK
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm personally not a fan of this because of the nesting, but I guess it's a matter of taste. Is there any guidelines for this regarding to styling? I don't have a strong opinion about this

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nope but it's shorter and I don't think this code makes less readable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@SparkQA
Copy link

SparkQA commented Jul 30, 2018

Test build #93773 has finished for PR 21807 at commit c0eb690.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mauropalsgraaf
Copy link
Contributor Author

@hvanhovell Do you know what this unknown error code, -9 means?

@HyukjinKwon
Copy link
Member

that means Jenkins build was halted unexpectedly. usually unrelated with some changes in a pr.

@HyukjinKwon
Copy link
Member

retest this please

@SparkQA
Copy link

SparkQA commented Jul 30, 2018

Test build #93787 has finished for PR 21807 at commit c0eb690.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jul 30, 2018

Test build #93805 has finished for PR 21807 at commit 60b9af3.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member

retest this please

@SparkQA
Copy link

SparkQA commented Jul 31, 2018

Test build #93817 has finished for PR 21807 at commit 60b9af3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Copy link
Member

@mauropalsgraaf Could you fix the PR title?

@mauropalsgraaf mauropalsgraaf changed the title [SPARK-24536] Validate that limit clause cannot have a nullable expression [SPARK-24536] Validate that an evaluated limit clause cannot be null Jul 31, 2018
@asfgit asfgit closed this in 4ac2126 Jul 31, 2018
@gatorsmile
Copy link
Member

LGTM

Thanks! Merged to master/2.3

asfgit pushed a commit that referenced this pull request Jul 31, 2018
It proposes a version in which nullable expressions are not valid in the limit clause

It was tested with unit and e2e tests.

Please review http://spark.apache.org/contributing.html before opening a pull request.

Author: Mauro Palsgraaf <[email protected]>

Closes #21807 from mauropalsgraaf/SPARK-24536.

(cherry picked from commit 4ac2126)
Signed-off-by: Xiao Li <[email protected]>
@gatorsmile
Copy link
Member

@mauropalsgraaf Do you have a JIRA ID? I can assign the ticket to you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants