@wzhfy
Contributor

@wzhfy wzhfy commented Apr 18, 2017

What changes were proposed in this pull request?

If a plan has multi-level successive joins, e.g.:

         Join
         /   \
     Union   t5
      /   \
    Join  t4
    /   \
  Join  t3
  /  \
 t1   t2

Currently we fail to reorder the nested joins, i.e. the joins over t1, t2 and t3.

In join reorder, we use OrderedJoin to indicate that a join has been ordered, so that when transforming down the plan, these joins don't need to be reordered again.

But there's a problem in the definition of OrderedJoin:
The real join node is a parameter, but not a child. This breaks the transform procedure, because mapChildren applies the transform function only to parameters that are children, so the plan beneath the wrapped join is never visited.

In this patch, we change OrderedJoin to a class having the same structure as a join node.
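
To illustrate the problem, here is a minimal, self-contained sketch of the traversal behaviour described above. It is a toy model, not the Spark code: `Plan`, `Table`, `WrappedOrderedJoin`, `FlatOrderedJoin`, `markOrdered` and `countPlainJoins` are hypothetical names made up for this example, and `mapChildren` here only approximates the idea that constructor parameters are rewritten only when they are also children.

```scala
// Toy model of a tree traversal; these are NOT the real Spark classes.
sealed abstract class Plan extends Product {
  def children: Seq[Plan]
  // Rebuild this node from a new constructor-argument list.
  def makeCopy(args: List[Any]): Plan

  // Rewrites only those constructor arguments that also appear in `children`.
  def mapChildren(f: Plan => Plan): Plan = {
    val newArgs = productIterator.map {
      case p: Plan if children.contains(p) => f(p)
      case other => other
    }.toList
    makeCopy(newArgs)
  }

  // Apply the rule to this node first, then recurse into the result's children.
  def transformDown(rule: PartialFunction[Plan, Plan]): Plan = {
    val afterRule = if (rule.isDefinedAt(this)) rule(this) else this
    afterRule.mapChildren(_.transformDown(rule))
  }
}

case class Table(name: String) extends Plan {
  def children: Seq[Plan] = Nil
  def makeCopy(args: List[Any]): Plan = this
}

case class Join(left: Plan, right: Plan) extends Plan {
  def children: Seq[Plan] = Seq(left, right)
  def makeCopy(args: List[Any]): Plan =
    Join(args(0).asInstanceOf[Plan], args(1).asInstanceOf[Plan])
}

case class Union(left: Plan, right: Plan) extends Plan {
  def children: Seq[Plan] = Seq(left, right)
  def makeCopy(args: List[Any]): Plan =
    Union(args(0).asInstanceOf[Plan], args(1).asInstanceOf[Plan])
}

// Marker that holds the join as a parameter; the join itself is not a child.
case class WrappedOrderedJoin(join: Join) extends Plan {
  def children: Seq[Plan] = Seq(join.left, join.right)
  def makeCopy(args: List[Any]): Plan =
    WrappedOrderedJoin(args(0).asInstanceOf[Join])
}

// Marker with the same structure as a join node: its parameters are its children.
case class FlatOrderedJoin(left: Plan, right: Plan) extends Plan {
  def children: Seq[Plan] = Seq(left, right)
  def makeCopy(args: List[Any]): Plan =
    FlatOrderedJoin(args(0).asInstanceOf[Plan], args(1).asInstanceOf[Plan])
}

object ReorderSketch {
  // Stand-in for the reorder rule: it only marks each Join it sees as ordered.
  def markOrdered(wrap: Join => Plan): PartialFunction[Plan, Plan] = {
    case j: Join => wrap(j)
  }

  // Counts Join nodes that the rule never reached.
  def countPlainJoins(p: Plan): Int = p match {
    case WrappedOrderedJoin(j) => j.children.map(countPlainJoins).sum
    case j: Join               => 1 + j.children.map(countPlainJoins).sum
    case other                 => other.children.map(countPlainJoins).sum
  }

  // The plan shape from the description above.
  val plan: Plan =
    Join(
      Union(Join(Join(Table("t1"), Table("t2")), Table("t3")), Table("t4")),
      Table("t5"))

  def main(args: Array[String]): Unit = {
    val wrapped = plan.transformDown(markOrdered(j => WrappedOrderedJoin(j)))
    val flat    = plan.transformDown(markOrdered(j => FlatOrderedJoin(j.left, j.right)))
    println(countPlainJoins(wrapped)) // 2: the joins under the Union were skipped
    println(countPlainJoins(flat))    // 0: every join was visited
  }
}
```

In this sketch the wrapper variant stops at the top join, because the wrapped `Join` is a parameter that does not appear in `children`, while the variant shaped like a join node lets the traversal continue down through the `Union`.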

How was this patch tested?

Add a corresponding test case.

@wzhfy
Contributor Author

wzhfy commented Apr 18, 2017

cc @cloud-fan @hvanhovell

@SparkQA

SparkQA commented Apr 18, 2017

Test build #75891 has finished for PR 17668 at commit 522d2fa.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      • case class OrderedJoin(

@cloud-fan
Contributor

LGTM, merging to master!

@asfgit asfgit closed this in 321b4f0 Apr 18, 2017
peter-toth pushed a commit to peter-toth/spark that referenced this pull request Oct 6, 2018
…t reordered

Author: wangzhenhua <[email protected]>

Closes apache#17668 from wzhfy/recursiveReorder.