[SPARK-13871][SQL] Support for inferring filters from data constraints #11665
Test build #52963 has finished for PR 11665 at commit
Nit: join @ Join
Test build #53051 has finished for PR 11665 at commit
Test build #53101 has finished for PR 11665 at commit
test this please
Test build #53104 has finished for PR 11665 at commit
test this please
Test build #53113 has finished for PR 11665 at commit
This rule doesn't seem related to the comment above it (same for `NullFiltering`).
This comment seems like a clearer version of the first sentence in the object comment. I'd remove this here and replace the first sentence of the above comment.
e3d84ce to 336a18e
thanks, all comments addressed!
336a18e to 92d935f
Test build #53330 has finished for PR 11665 at commit
Test build #53331 has finished for PR 11665 at commit
LGTM
## What changes were proposed in this pull request?

This PR generalizes the `NullFiltering` optimizer rule in Catalyst to `InferFiltersFromConstraints`, which automatically infers all relevant filters from an operator's constraints while ensuring two things: (a) no redundant filters are generated, and (b) filters that do not contribute to any further optimization are not generated.

## How was this patch tested?

Extended all tests in `InferFiltersFromConstraintsSuite` (initially based on `NullFilteringSuite`) to test filter inference in `Filter` and `Join` operators. In particular, two tests (`single inner join with pre-existing filters: filter out values on either side` and `multiple inner joins: filter out values on all sides on equi-join keys`) attempt to highlight and test the real potential of this rule for join optimization.

Author: Sameer Agarwal <[email protected]>

Closes apache#11665 from sameeragarwal/infer-filters.
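
To make the idea of inferring filters from constraints concrete, here is a minimal toy sketch in plain Python. It is not Catalyst's implementation: the tuple encoding of predicates and the helper names `columns`, `infer_constraints`, and `infer_filters` are assumptions made purely for illustration. It models only two inference rules (comparisons are null-rejecting, and a bound on one side of an equality carries over to the other side) and only property (a) from the description, namely that no filter already present is generated again.

```python
from itertools import product

# Toy predicates (hypothetical encoding, NOT Catalyst expressions):
#   ("eq", "l.k", "r.k")   column = column
#   ("gt", "l.k", 5)       column > literal
#   ("notnull", "l.k")     column IS NOT NULL


def columns(pred):
    """Return the set of column names a toy predicate refers to."""
    if pred[0] == "eq":
        return {pred[1], pred[2]}
    return {pred[1]}


def infer_constraints(preds):
    """Close a predicate set under two simple inference rules."""
    inferred = set(preds)
    while True:
        new = set()
        # Rule 1: a comparison can only hold for non-null inputs.
        for p in inferred:
            if p[0] in ("eq", "gt"):
                new.update(("notnull", c) for c in columns(p))
        # Rule 2: a = b and a > v together imply b > v (and vice versa).
        for p, q in product(inferred, repeat=2):
            if p[0] == "eq" and q[0] == "gt":
                a, b = p[1], p[2]
                if q[1] == a:
                    new.add(("gt", b, q[2]))
                if q[1] == b:
                    new.add(("gt", a, q[2]))
        if new <= inferred:  # fixed point reached
            return inferred
        inferred |= new


def infer_filters(existing, side_columns):
    """New filters that can be evaluated using only `side_columns`.

    Predicates already in `existing` are excluded, so nothing
    redundant is produced.
    """
    existing = set(existing)
    candidates = infer_constraints(existing) - existing
    return {p for p in candidates if columns(p) <= set(side_columns)}


# An inner join on l.k = r.k with a pre-existing filter l.k > 5.
existing = {("eq", "l.k", "r.k"), ("gt", "l.k", 5)}
left_filters = infer_filters(existing, {"l.k"})
right_filters = infer_filters(existing, {"r.k"})
```

In this toy model the left side gains only an `IS NOT NULL` filter on `l.k`, while the right side gains both `r.k IS NOT NULL` and `r.k > 5`. Running the inference again on a predicate set that already contains those filters yields nothing new, which mirrors the "no redundant filters" goal. Property (b), skipping filters that enable no further optimization, is not modeled here.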