Contributor

@ptkool ptkool commented May 8, 2017

What changes were proposed in this pull request?

Add a new optimization rule to eliminate unnecessary shuffling by flipping adjacent Window expressions.

How was this patch tested?

Tested with unit tests, integration tests, and manual tests.

@gatorsmile
Member

ok to test


SparkQA commented May 8, 2017

Test build #76591 has finished for PR 17899 at commit f90c857.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

@gatorsmile gatorsmile May 8, 2017

This condition might not be enough. w1 might depend on the outputs of w2, right?

Contributor

You are also changing the order of the columns. You will need to add a projection on top to be sure.
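
To illustrate the column-order concern, here is a minimal plain-Scala sketch (a toy model, not Catalyst code). It relies on the fact that a Window node outputs its child's columns followed by its own window columns; `Win`, `outputOf`, and `project` are hypothetical names introduced only for this example.

```scala
object ColumnOrderSketch {
  // Toy model, not Spark's API: a window node appends its columns to its child's output.
  final case class Win(adds: Seq[String])

  // Output of a stack of windows, listed from innermost (child) to outermost (parent).
  def outputOf(base: Seq[String], innerToOuter: Seq[Win]): Seq[String] =
    base ++ innerToOuter.flatMap(_.adds)

  // A projection that selects the available columns in the requested order.
  def project(columns: Seq[String], wanted: Seq[String]): Seq[String] =
    wanted.filter(c => columns.contains(c))

  val base = Seq("a", "b")
  val w1 = Win(Seq("rank"))  // parent window in the original plan
  val w2 = Win(Seq("total")) // child window in the original plan

  val original = outputOf(base, Seq(w2, w1))    // child's column comes first
  val transposed = outputOf(base, Seq(w1, w2))  // order of the window columns is swapped
  val restored = project(transposed, original)  // projection puts the columns back
}
```

In this model the transposed plan yields the same columns in a different order, and the projection on top restores the order seen by the rest of the plan.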


SparkQA commented May 9, 2017

Test build #76678 has finished for PR 17899 at commit 300edf0.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


SparkQA commented May 9, 2017

Test build #76679 has finished for PR 17899 at commit eb2def2.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


SparkQA commented May 9, 2017

Test build #76680 has finished for PR 17899 at commit f6a4e47.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@ptkool ptkool force-pushed the adjacent_window_optimization branch from f6a4e47 to 1ab81ca Compare May 9, 2017 18:38

SparkQA commented May 9, 2017

Test build #76690 has finished for PR 17899 at commit 1ab81ca.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@hvanhovell
Contributor

retest this please

Contributor

This probably warrants a follow-up that tries to move projections that are wedged in between two window clauses.

Contributor

You need to check that the windows are independent, e.g.: w1.references.intersect(w2.windowOutputSet).isEmpty
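
A minimal plain-Scala sketch of that independence check, using a toy model rather than Catalyst's `Window` node. The case class `Win` and the helper `independent` are hypothetical; only the shape of the condition mirrors `w1.references.intersect(w2.windowOutputSet).isEmpty`.

```scala
object IndependenceSketch {
  // Toy model, not Spark's API: each window reads some columns and produces some columns.
  final case class Win(references: Set[String], windowOutput: Set[String])

  // The parent may be moved below the child only if it reads nothing the child produces.
  def independent(parent: Win, child: Win): Boolean =
    parent.references.intersect(child.windowOutput).isEmpty

  val child = Win(references = Set("a", "b"), windowOutput = Set("sum_b"))
  // Reads only base columns, so it could be evaluated before the child.
  val independentParent = Win(Set("a", "c"), Set("rank_c"))
  // Reads sum_b, which only exists after the child window has run.
  val dependentParent = Win(Set("a", "sum_b"), Set("rank_sum"))
}
```

Here `independent(independentParent, child)` holds, while `independent(dependentParent, child)` does not, so only the first pair would be a candidate for transposition.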

Contributor

We might be able to get a little more mileage out of the rule by using semanticEquals for comparing the partition expressions, e.g.:

def sliceSemanticEquals(ps1: Seq[Expression], ps2: Seq[Expression]): Boolean =
  // zip truncates to the shorter sequence, so compare the lengths first
  ps1.length == ps2.length && ps1.zip(ps2).forall {
    case (l, r) => l.semanticEquals(r)
  }
...
sliceSemanticEquals(ps1, ps2)

You could even get more leverage if you do not consider the order of the partition spec.
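
One way the order-insensitive comparison could look, as a plain-Scala sketch under stated assumptions: `sameIgnoringOrder` is a hypothetical helper, and the `same` parameter stands in for `Expression.semanticEquals` so the example runs without Spark. It is not necessarily the comparison the rule ended up using.

```scala
object PartitionSpecSketch {
  // Hypothetical helper: true when every element of each spec has a match in the other,
  // regardless of position. `same` stands in for Expression.semanticEquals.
  def sameIgnoringOrder[A](ps1: Seq[A], ps2: Seq[A])(same: (A, A) => Boolean): Boolean =
    ps1.forall(l => ps2.exists(r => same(l, r))) &&
      ps2.forall(r => ps1.exists(l => same(l, r)))

  // Stand-in for semantic equality: compare column names case-insensitively.
  val sameColumn: (String, String) => Boolean = (l, r) => l.equalsIgnoreCase(r)
}
```

With this stand-in, `sameIgnoringOrder(Seq("a", "b"), Seq("B", "A"))(sameColumn)` is true and `sameIgnoringOrder(Seq("a", "b"), Seq("a"))(sameColumn)` is false. The sketch treats each spec as a set, so repeated expressions are ignored; that is a design assumption of the example.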

Contributor Author

Agreed.

Contributor

Why this test? It does not really add anything new.

Contributor Author

@ptkool ptkool May 18, 2017

Agreed. I'll remove it.

Contributor

comparePlans(optimized, analyzed)?

Contributor Author

Yes.

Contributor

I am not entirely sure if we need this test.

Contributor Author

Agreed. I'll remove it.

Contributor

@hvanhovell hvanhovell left a comment

@ptkool this looks pretty good. One thing needs to be addressed though: we need to factor in that a parent window can depend on its child window.


SparkQA commented May 16, 2017

Test build #76970 has finished for PR 17899 at commit 1ab81ca.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@ptkool ptkool force-pushed the adjacent_window_optimization branch from 1ab81ca to f472bfe Compare May 20, 2017 12:37

SparkQA commented May 20, 2017

Test build #77126 has finished for PR 17899 at commit f472bfe.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor Author

ptkool commented Jun 3, 2017

@hvanhovell @gatorsmile Can you have another look at this?

@ptkool ptkool changed the title [SPARK-20636] Add new optimization rule to flip adjacent Window expressions. [SPARK-20636] Add new optimization rule to transpose adjacent Window expressions. Jun 7, 2017
Member

No test case covers the condition w1.references.intersect(w2.windowOutputSet).isEmpty

Contributor Author

I will add one.

Member

@gatorsmile gatorsmile Jun 30, 2017

The expressions in both w1.expressions and w2.expressions must be deterministic. If not, we should not flip.

Contributor Author

Why? This seems overly restrictive to me.

Member

Just to ensure the results are still the same with and without the rule.

@ptkool ptkool force-pushed the adjacent_window_optimization branch from f472bfe to e2f24c2 Compare June 30, 2017 09:31

SparkQA commented Jun 30, 2017

Test build #78973 has finished for PR 17899 at commit e2f24c2.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Member

@ptkool Can you address the conflicts? We will review it.


SparkQA commented Oct 29, 2017

Test build #83189 has finished for PR 17899 at commit 54d88fa.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


SparkQA commented Oct 29, 2017

Test build #83191 has finished for PR 17899 at commit 82d7390.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@ptkool ptkool force-pushed the adjacent_window_optimization branch from 82d7390 to f840c69 Compare October 29, 2017 12:43

SparkQA commented Oct 29, 2017

Test build #83193 has finished for PR 17899 at commit f840c69.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


SparkQA commented Oct 29, 2017

Test build #83194 has finished for PR 17899 at commit e9f6928.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


SparkQA commented Jan 19, 2018

Test build #86355 has finished for PR 17899 at commit 72a4b3a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


SparkQA commented Sep 6, 2018

Test build #95753 has finished for PR 17899 at commit 94e8115.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

object CollapseWindow extends Rule[LogicalPlan] {
  def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
    case w1 @ Window(we1, ps1, os1, w2 @ Window(we2, ps2, os2, grandChild))
        if ps1 == ps2 && os1 == os2 && w1.references.intersect(w2.windowOutputSet).isEmpty &&
Member

I wouldn't include style changes


SparkQA commented Sep 7, 2018

Test build #95802 has finished for PR 17899 at commit 4328831.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

@gatorsmile gatorsmile left a comment

LGTM

Thanks! Merged to master.

}

/**
 * Transpose Adjacent Window Expressions.
Contributor

why is this rule useful?

holdenk pushed a commit to holdenk/spark that referenced this pull request Jan 5, 2019
## What changes were proposed in this pull request?

This PR is a follow-up of the PR apache#17899. It adds the rule TransposeWindow to the optimizer batch.

## How was this patch tested?
The existing tests.

Closes apache#23222 from gatorsmile/followupSPARK-20636.

Authored-by: gatorsmile <[email protected]>
Signed-off-by: gatorsmile <[email protected]>
jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019
@ptkool ptkool deleted the adjacent_window_optimization branch January 18, 2020 12:14