SPARK-8336 Fix NullPointerException with functions.rand() #6793
Conversation
|
Can you file a JIRA for this? |
|
Jenkins, retest this please. |
|
can you please add a test case to prevent regression |
|
Mind telling me which suite the new test should be added to? Thanks. |
|
At first glance, none of the test suites under sql/catalyst/src/test/scala/org/apache/spark/sql seems suitable for the new test. |
|
We should create a RandomSuite.scala in expressions, and add tests for that. Take a look at other suites in that package. |
|
I looked at UnsafeFixedWidthAggregationMapSuite.scala in the expressions package. Is RandomSuite.scala going to test Rand and Randn only? A bit more of a hint would be appreciated. |
|
OK, you managed to pick the one suite that isn't a good example. Take a look at https://github.com/apache/spark/blob/master/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ConditionalExpressionSuite.scala. Basically, use the checkEvaluation function. |
|
I am trying to figure out how checkEvaluation should be used for the new test. protected def checkEvaluation( w.r.t. Rand(), the expected value is not deterministic. |
|
Looking at ArithmeticExpressionSuite.scala, it has some checks in the following form: This seems to be a better fit for checking the return value from Rand(). |
|
Don't we have some way of setting the RNG seed for testing? |
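The seeding idea raised here can be sketched without Spark. The snippet below is a minimal illustration, not the test that was added in this PR: `SeededRandCheck` and `draws` are hypothetical names, and `scala.util.Random` stands in for whatever generator `Rand` uses internally. It shows the two properties a test could rely on: the same seed reproduces the same values, and every value falls in [0, 1).

```scala
import scala.util.Random

object SeededRandCheck {
  // Hypothetical stand-in for a seeded rand expression:
  // draws `n` doubles from a generator created from `seed`.
  def draws(seed: Long, n: Int): Seq[Double] = {
    val rng = new Random(seed)
    Seq.fill(n)(rng.nextDouble())
  }

  def main(args: Array[String]): Unit = {
    // Seed 30 mirrors the rand(30) call in the PR description.
    val first = draws(30L, 5)
    val second = draws(30L, 5)

    // Same seed, same sequence: the expected value becomes deterministic.
    assert(first == second)

    // Range check that holds for any seed.
    assert(first.forall(d => d >= 0.0 && d < 1.0))
  }
}
```

With a fixed seed, a checkEvaluation-style comparison against a known value becomes possible; without one, only the range check is meaningful.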
|
Test build #34839 has finished for PR 6793 at commit
|
Can we create a new test case instead of adding it to the existing one?
I've been meaning to take the existing one apart for a while.
Also, we should have a case where we explicitly set TaskContext.
Looking at the tests under sql, I don't see how TaskContext is explicitly set.
Creating a new test is fine. The new test would contain a method containing one line.
Just want to make sure this is fine.
I am in Beijing now.
Besides the difficulty of accessing Gmail, GitHub is quite slow as well.
|
Test build #34961 has finished for PR 6793 at commit
|
|
Thanks. Merging in master & branch-1.4. |
This PR fixes the problem reported by Justin Yip in the thread 'NullPointerException with functions.rand()'
Tested using spark-shell and verified that the following works:
sqlContext.createDataFrame(Seq((1,2), (3, 100))).withColumn("index", rand(30)).show()
Author: tedyu <[email protected]>
Closes #6793 from tedyu/master and squashes the following commits:
62fd97b [tedyu] Create RandomSuite
750f92c [tedyu] Add test for Rand() with seed
a1d66c5 [tedyu] Fix NullPointerException with functions.rand()
(cherry picked from commit 1a62d61)
Signed-off-by: Reynold Xin <[email protected]>
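The thread does not show the patch itself. Assuming the NullPointerException came from reading a partition id off a task context that is absent when the expression is evaluated outside a running task (the review comment about explicitly setting TaskContext points that way), the general null-safe pattern can be sketched as follows. Everything here is hypothetical: `FakeTaskContext` is a stand-in and not Spark's `TaskContext`, and the fallback value of 0 is an illustrative choice.

```scala
object NullSafePartitionId {
  // Hypothetical stand-in for a per-task context; not Spark's TaskContext.
  final case class FakeTaskContext(partitionId: Int)

  // Thread-local holder; get() returns null when no task is running.
  private val current = new ThreadLocal[FakeTaskContext]

  def set(ctx: FakeTaskContext): Unit = current.set(ctx)
  def unset(): Unit = current.remove()

  // Unsafe lookup: dereferences the context directly and throws a
  // NullPointerException when no context has been set.
  def unsafePartitionId(): Int = current.get().partitionId

  // Null-safe lookup: falls back to 0 (an assumed default) when no
  // context has been set.
  def safePartitionId(): Int =
    Option(current.get()).map(_.partitionId).getOrElse(0)
}
```

A test along the lines requested in review would exercise both paths: once with no context set, and once after calling `set` with an explicit context.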
|
@rxin it looks like the branch-1.4 cherry-pick of this commit broke a unit test, because it relies on |
|
Here's the relevant bit of the log (from testing #6842): |
|
Jenkins only runs against master. Do you mind submitting a fix against branch-1.4 for this? I will merge it. |
rxin this is the fix you requested for the break introduced by backporting #6793
Author: Punya Biswal <[email protected]>
Closes #6850 from punya/feature/fix-backport-break and squashes the following commits:
fdc3693 [Punya Biswal] Fix break introduced by backport