@sameeragarwal
Member

What changes were proposed in this pull request?

This PR generalizes the `NullFiltering` optimizer rule in Catalyst into `InferFiltersFromConstraints`, which automatically infers all relevant filters from an operator's constraints while ensuring two things:

(a) no redundant filters are generated, and
(b) filters that do not contribute to any further optimizations are not generated.
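
To make point (a) concrete, here is a minimal toy sketch in Python, not the Catalyst implementation. It assumes predicates can be modeled as plain strings and that an operator exposes a set of constraints; the function name `infer_new_filters` and its parameters are hypothetical and exist only for illustration. The idea is that a filter is worth adding only if it is implied by the operator's constraints but is not already enforced by the child or by the existing condition.

```python
from typing import Iterable, Set


def infer_new_filters(
    operator_constraints: Iterable[str],
    child_constraints: Iterable[str],
    existing_conditions: Iterable[str],
) -> Set[str]:
    """Toy model: return constraints that are not yet enforced anywhere.

    Predicates are plain strings here, which is an assumption made purely
    for illustration; a real optimizer compares expression trees.
    """
    already_enforced = set(child_constraints) | set(existing_conditions)
    return set(operator_constraints) - already_enforced


# A filter `a > 5` implies that `a` is not null, so the operator's
# constraint set holds both predicates.
constraints = {"a > 5", "isnotnull(a)"}

first_pass = infer_new_filters(constraints, set(), {"a > 5"})

# Once the inferred filter has been added, a second pass finds nothing new,
# so no redundant filter is generated.
second_pass = infer_new_filters(constraints, set(), {"a > 5", "isnotnull(a)"})
```

In this model the rule is idempotent: running it again after its own output has been applied yields an empty set.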

How was this patch tested?

Extended all tests in `InferFiltersFromConstraintsSuite` (which were initially based on `NullFilteringSuite`) to test filter inference in `Filter` and `Join` operators.

In particular, the two tests (`single inner join with pre-existing filters: filter out values on either side` and `multiple inner joins: filter out values on all sides on equi-join keys`) attempt to highlight and test the real potential of this rule for join optimization.
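
As a rough illustration of what the first test name suggests, the toy Python sketch below models an inner equi-join in which each side already has a filter on its join key. It is an assumption-laden reading of the test names, not the suite's code: the `Pred` type, the `infer_side_filters` function, and the column names `x.a` and `y.a` are all hypothetical. Because the keys must be equal and non-null for a row to survive an inner join, a predicate on one key can be mirrored onto the other.

```python
from typing import NamedTuple, Optional, Set, Tuple


class Pred(NamedTuple):
    """Hypothetical single-column predicate, e.g. Pred("x.a", ">", 5)."""
    column: str
    op: str
    value: Optional[int] = None


def infer_side_filters(
    left: Set[Pred], right: Set[Pred], left_key: str, right_key: str
) -> Tuple[Set[Pred], Set[Pred]]:
    """Toy model of filter inference for an inner join on left_key = right_key."""
    # Collect every (op, value) pair known to hold for either join key.
    key_preds = {(p.op, p.value) for p in left if p.column == left_key}
    key_preds |= {(p.op, p.value) for p in right if p.column == right_key}
    # A null key never matches in an inner equi-join.
    key_preds.add(("isnotnull", None))

    # Mirror the pairs onto each side, skipping filters that side already has.
    new_left = {Pred(left_key, op, v) for op, v in key_preds} - left
    new_right = {Pred(right_key, op, v) for op, v in key_preds} - right
    return new_left, new_right


left_filters = {Pred("x.a", ">", 5)}
right_filters = {Pred("y.a", "<", 10)}
new_left, new_right = infer_side_filters(left_filters, right_filters, "x.a", "y.a")
```

Under these assumptions, each side gains the other side's range predicate plus a not-null check on its own key, so rows can be discarded before the join runs.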

@sameeragarwal
Member Author

cc @nongli @yhuai @gatorsmile

@SparkQA

SparkQA commented Mar 12, 2016

Test build #52963 has finished for PR 11665 at commit 308e93c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

Nit: join @ Join

@SparkQA

SparkQA commented Mar 14, 2016

Test build #53051 has finished for PR 11665 at commit d7f8d34.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@sameeragarwal changed the title from "[SPARK-XXXX][SQL] Support for inferring filters from data constraints" to "[SPARK-13871][SQL] Support for inferring filters from data constraints" on Mar 14, 2016
@SparkQA

SparkQA commented Mar 14, 2016

Test build #53101 has finished for PR 11665 at commit 4b0cf5f.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@sameeragarwal
Member Author

test this please

@SparkQA

SparkQA commented Mar 14, 2016

Test build #53104 has finished for PR 11665 at commit 4b0cf5f.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@sameeragarwal
Member Author

test this please

@SparkQA

SparkQA commented Mar 15, 2016

Test build #53113 has finished for PR 11665 at commit 4b0cf5f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

This rule doesn't seem related to the comment above it (same for `NullFiltering`).

Contributor

This comment seems like a clearer version of the first sentence in the object comment. I'd remove this here and replace the first sentence of the above comment.

@sameeragarwal
Member Author

thanks, all comments addressed!

@SparkQA

SparkQA commented Mar 16, 2016

Test build #53330 has finished for PR 11665 at commit 336a18e.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 16, 2016

Test build #53331 has finished for PR 11665 at commit 92d935f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@nongli
Contributor

nongli commented Mar 16, 2016

LGTM

roygao94 pushed a commit to roygao94/spark that referenced this pull request Mar 22, 2016
Author: Sameer Agarwal <[email protected]>

Closes apache#11665 from sameeragarwal/infer-filters.

4 participants