-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-21746][SQL]there is an java.lang.IllegalArgumentException when the filter contains nondeterminate expressions #18961
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
ok to test |
|
Test build #80810 has finished for PR 18961 at commit
|
|
@cloud-fan @gatorsmile |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to turn off this? It looks irrelevant to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aha,you are right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
0 is not a partitionIndex and 0 will go to n.initialize(0) in line 46. Is it correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we not need InterpretedPredicate.initialize ? and modify object InterpretedPredicate.create
def create(expression: Expression): InterpretedPredicate = {
expression.foreach {
case n: Nondeterministic => n.initialize(0)
case _ =>
}
new InterpretedPredicate(expression)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The consideration was the consistency of class InterpretedPredicate, so add initialize method of the class InterpretedPredicate.
|
Will review it tonight. |
|
Test build #81011 has finished for PR 18961 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why this test case will trigger InterpretedPredicate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
predicates is not empty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition, I tried to validate the spark 2.0.2 version, and it won't trigger InterpretedPredicate Exception.
spark 2.0.2 InterpretedPredicate.create
def create(expression: Expression): (InternalRow => Boolean) = {
expression.foreach {
case n: Nondeterministic => n.setInitialValues()
case _ =>
}
(r: InternalRow) => expression.eval(r).asInstanceOf[Boolean]
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ran this test without any exception. Are you sure this test can reproduce this issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, this exception comes from #18918 Unit testing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test you add should trigger InterpretedPredicate as @cloud-fan mentioned. It should hit the code path you modified in this PR, not depending on another PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry,I update the description of PR. thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this test expose the bug?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm updating to the latest version for validation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@viirya, @cloud-fan
This trigger condition is associated with #18918. It will be more prone to this exception
The current spark master branch does not trigger this code path.
put this change on #18918 and close this PR.
thanks.
|
@dongjoon-hyun @cloud-fan Do you have any suggestions? |
|
Good catch! I think we do need |
|
@cloud-fan, Thank you for your suggest. I found 4 calls to InterpretedPredicate.create and modify it. Can you take a look again if you have time? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The partition index is not always 0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked the few places calling newPredicate. They are either already do initialize properly, or don't need to do it. So I think we don't need to do initialize here like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The predicate here can only be sources.GreaterThan. It can't include non-deterministic expressions.
|
Test build #81203 has finished for PR 18961 at commit
|
|
Test build #81210 has finished for PR 18961 at commit
|
|
Any reason you closed it? |
|
You do not need to open multiple PRs for the same issue. |
|
@gatorsmile , This should be a problem for code execution, semantics, and consistency. PartitionFilters is [] , instead of [(rand(10) <= 0.5)]. Thus, this trigger condition is associated with #18918. it will be more prone to this exception. PartitionFilters is [(rand(10) <= 0.5)] |
What changes were proposed in this pull request?
Currently, We do interpretedpredicate optimization, but not very well, because when our filter contained an nondeterminate expression.
in spark-shell. execute the following SQL statement:
Spark throws an exceptions java.lang.IllegalArgumentException:
This PR describes solving this problem by adding the initialize method in InterpretedPredicate.
How was this patch tested?
Should be covered existing test cases and add new test cases.