Skip to content

Conversation

@antonoal
Copy link

What changes were proposed in this pull request?

A relatively simple transformation is missing from Catalyst's arsenal - generation of transitive predicates. For instance, if you have got the following query:
select * from table1 t1 join table2 t2 on t1.a = t2.b where t1.a = 42
then it is a fair assumption that t2.b also equals 42 hence an additional predicate could be generated. The additional predicate could in turn be pushed down through the join and improve performance of the whole query by filtering out the data before joining it.
Such a transformation exists in Oracle DB.
Please note, in this PR a transitive predicate would be created for the following operations:

  • a BinaryComparison (=, >=, etc.) to a foldable
  • in (1, 2, 3) where all the values in the sequence are foldable
  • Not of any of the above
  • Or of any of the above

How was this patch tested?

I've added a new TransitiveClosureSuite with a series of unit tests

@AmplabJenkins
Copy link

Can one of the admins verify this patch?

@rxin
Copy link
Contributor

rxin commented Mar 18, 2016

This is similar to #11618 isn't it?

@antonoal
Copy link
Author

Yes and no. This change handles a bit more cases, not just column == constant, and it works on an earlier phase - logical plan optimisation rather than physical. It is a question whether the second point is better or worse in my change.

@sameeragarwal
Copy link
Member

@antonoal Thanks a lot for looking into this! As @rxin pointed out, we currently infer these transitive predicates by looking at the data constraints for each operator in the logical plan (please also see #11665). Can you please replace your TransitiveClosure rule with InferFiltersFromConstraints rule in TransitiveClosureSuite.scala and see if it handles all your test cases?

@antonoal
Copy link
Author

It does cover all my tests and looks a lot neater, so feel free to decline this PR.
Also do you know of the top of your head if there is a jira for converting an outer join into an inner if there is a filter on a column from the right table following the join? That's another thing I was thinking to create a PR, but if someone doing it it would obviously be pointless

@srowen
Copy link
Member

srowen commented Mar 19, 2016

@antonoal antonoal closed this Mar 19, 2016
@sameeragarwal
Copy link
Member

yes, the OuterJoinElimination rule in catalyst generalizes that and converts outer joins to either inner, left-outer or right-outer based on the filter conditions (#10567).

@gatorsmile
Copy link
Member

Yeah, we have another PR: #10566. That PR is waiting for the related PR merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants