Member

maropu commented Nov 20, 2016

What changes were proposed in this pull request?

This PR merges unnecessary partial aggregates when the inputs of the aggregates already satisfy the distribution requirements of those partial aggregates. It is a rework based on @cloud-fan's suggestion in #14909.
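
As a rough illustration of the idea, here is a minimal, self-contained sketch in plain Scala. It does not use Spark's classes: Plan, Scan, Exchange, PartialAgg, FinalAgg, MergedAgg, satisfies and mergePartialAggregates are hypothetical names invented for this example, and "distribution" is reduced to the set of columns the data is already clustered by. The rule in this PR works on SparkPlan and its real distribution requirements, so treat this only as a toy model.

```scala
object MergePartialAggSketch {
  // Toy physical plan nodes (hypothetical, not Spark's operators).
  sealed trait Plan {
    // Columns the output is clustered by; empty means "unknown".
    def clusteredBy: Set[String]
  }
  case class Scan(table: String, clusteredBy: Set[String] = Set.empty) extends Plan
  case class Exchange(keys: Set[String], child: Plan) extends Plan {
    def clusteredBy: Set[String] = keys
  }
  case class PartialAgg(keys: Set[String], child: Plan) extends Plan {
    def clusteredBy: Set[String] = child.clusteredBy
  }
  case class FinalAgg(keys: Set[String], child: Plan) extends Plan {
    def clusteredBy: Set[String] = child.clusteredBy
  }
  case class MergedAgg(keys: Set[String], child: Plan) extends Plan {
    def clusteredBy: Set[String] = child.clusteredBy
  }

  // True if rows with equal grouping keys are already co-located:
  // the clustering columns must be a non-empty subset of the grouping keys.
  def satisfies(child: Plan, keys: Set[String]): Boolean =
    child.clusteredBy.nonEmpty && child.clusteredBy.subsetOf(keys)

  // Collapse a final/partial pair into one aggregate when no shuffle is needed.
  def mergePartialAggregates(plan: Plan): Plan = plan match {
    case FinalAgg(keys, PartialAgg(pKeys, child))
        if keys == pKeys && satisfies(child, keys) =>
      MergedAgg(keys, mergePartialAggregates(child))
    case FinalAgg(keys, child)   => FinalAgg(keys, mergePartialAggregates(child))
    case PartialAgg(keys, child) => PartialAgg(keys, mergePartialAggregates(child))
    case MergedAgg(keys, child)  => MergedAgg(keys, mergePartialAggregates(child))
    case Exchange(keys, child)   => Exchange(keys, mergePartialAggregates(child))
    case s: Scan                 => s
  }
}
```

In this toy model, a final aggregate sitting directly on a partial aggregate over an input already clustered by the grouping key collapses into a single MergedAgg, while a plan that still needs an Exchange between the two stages is left unchanged.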

How was this patch tested?

Added tests in PlannerSuite that check whether these partial aggregates are removed by Catalyst.

SparkQA commented Nov 20, 2016

Test build #68901 has finished for PR 15945 at commit 2633ced.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class QueryExecution(val sparkSession: SparkSession, val logical: LogicalPlan)
    • abstract class AggregateExec extends UnaryExecNode

Member Author

maropu commented Nov 21, 2016

@hvanhovell @cloud-fan I think the target for this might be 2.2.0, so could you review it after 2.1 is cut? Thanks!

Member Author

maropu commented Jan 12, 2017

@hvanhovell @cloud-fan Could you check this? Thanks!

Contributor

How about we add a PhysicalOptimizer to do these things? Then we can simply write lazy val executedPlan: SparkPlan = physicalOptimizer.execute(sparkPlan) instead of lazy val executedPlan: SparkPlan = prepareForExecution(sparkPlan).
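
The shape of this suggestion can be sketched in plain Scala without Spark on the classpath. Everything below is a simplified stand-in: Plan, Rule, the two example rules and this PhysicalOptimizer class are hypothetical, and the executor applies each rule once in order, whereas Spark's RuleExecutor groups rules into batches with execution strategies.

```scala
object PhysicalOptimizerSketch {
  // Stand-in for SparkPlan: a plan is just a top-down list of operator names.
  type Plan = List[String]

  // Stand-in for a rule: a named plan-to-plan function.
  case class Rule(name: String, transform: Plan => Plan)

  // Simplified stand-in for a rule executor: applies each rule once, in order.
  class PhysicalOptimizer(rules: Seq[Rule]) {
    def execute(plan: Plan): Plan =
      rules.foldLeft(plan)((current, rule) => rule.transform(current))
  }

  // Two hypothetical preparation rules.
  val dropNoOps: Rule = Rule("DropNoOps", _.filterNot(_ == "NoOp"))
  val addExchange: Rule = Rule("AddExchange", _.flatMap {
    case "Aggregate" => List("Aggregate", "Exchange")
    case op          => List(op)
  })

  val physicalOptimizer = new PhysicalOptimizer(Seq(dropNoOps, addExchange))
  val sparkPlan: Plan = List("Aggregate", "NoOp", "Scan")

  // The single call that would replace prepareForExecution(sparkPlan).
  lazy val executedPlan: Plan = physicalOptimizer.execute(sparkPlan)
}
```

The point of the design is that the list of preparation rules lives in one object that can be swapped or extended, instead of being hard-wired into a method on QueryExecution.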

Member Author

Aha, good idea. I'll try that.

SparkQA commented Jan 12, 2017

Test build #71262 has finished for PR 15945 at commit da25be6.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class PhysicalOptimizer(sparkSession: SparkSession) extends RuleExecutor[SparkPlan]
    • class QueryExecution(val sparkSession: SparkSession, val logical: LogicalPlan)

SparkQA commented Jan 12, 2017

Test build #71264 has finished for PR 15945 at commit 6bd225f.

  • This patch fails Scala style tests.
  • This patch does not merge cleanly.
  • This patch adds the following public classes (experimental):
    • class PhysicalOptimizer(sparkSession: SparkSession) extends RuleExecutor[SparkPlan]
    • class QueryExecution(val sparkSession: SparkSession, val logical: LogicalPlan)

maropu force-pushed the SPARK-12978 branch 2 times, most recently from 0a12a4f to 30e7258 on January 12, 2017 13:55

SparkQA commented Jan 12, 2017

Test build #71265 has finished for PR 15945 at commit 30e7258.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class PhysicalOptimizer(sparkSession: SparkSession) extends RuleExecutor[SparkPlan]
    • class QueryExecution(val sparkSession: SparkSession, val logical: LogicalPlan)

SparkQA commented Jan 12, 2017

Test build #71267 has finished for PR 15945 at commit 96d0723.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class PhysicalOptimizer(sparkSession: SparkSession) extends RuleExecutor[SparkPlan]
    • class QueryExecution(val sparkSession: SparkSession, val logical: LogicalPlan)

Member Author

maropu commented Jan 13, 2017

@cloud-fan How about this fix?

Member Author

maropu commented Jan 15, 2017

@cloud-fan ping

Contributor

Let's think about a better name; it does more than just optimization.

Member Author

Oh, yea. So, how about PhysicalPlanRewriter?

Contributor

shall we put it in SessionState like analyzer and optimizer?

Member Author

Yeah, SGTM. I'll try to fix it.

SparkQA commented Jan 20, 2017

Test build #71724 has finished for PR 15945 at commit 636d022.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Jan 20, 2017

Test build #71727 has finished for PR 15945 at commit bea519f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Jan 24, 2017

Test build #71917 has started for PR 15945 at commit bea519f.

Member Author

maropu commented Jan 24, 2017

Jenkins, retest this please.

SparkQA commented Jan 24, 2017

Test build #71926 has finished for PR 15945 at commit bea519f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Feb 4, 2017

Test build #72351 has finished for PR 15945 at commit c886d26.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Feb 4, 2017

Test build #72355 has finished for PR 15945 at commit 8c6ab3e.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Feb 4, 2017

Test build #72357 has finished for PR 15945 at commit d333ca3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member Author

maropu commented Feb 4, 2017

@cloud-fan ping

Contributor

How about we return an anonymous RuleExecutor[SparkPlan] here? Then we don't need to bother with the name.

Member Author

okay, I'll try to fix.

Member Author

I think that if we use anonymous classes here, we cannot avoid duplicate rule entries in IncrementalExecution: 4f1240d#diff-13a3f1b22cd7c812e433f771d39eec97R103.
I'll keep looking for other approaches to avoid this, but I would appreciate any further suggestions.

Contributor

is it needed? UnaryExecNode already extends SparkPlan

Member Author

oh, you're right and this is meaningless. I'll remove this.

Member Author

removed

Contributor

why not outer.getClass == inner.getClass?
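
The diff line this comment refers to is not shown here, so the following is only a guess at the kind of check being suggested: comparing the runtime classes of two operators directly. AggExec, HashAgg, SortAgg and sameOperator are hypothetical names for this sketch, not the PR's code.

```scala
object SameOperatorClassSketch {
  // Hypothetical aggregate operators standing in for physical plan nodes.
  sealed trait AggExec
  case class HashAgg(keys: Seq[String]) extends AggExec
  case class SortAgg(keys: Seq[String]) extends AggExec

  // True when both operators are instances of exactly the same runtime class,
  // regardless of their field values.
  def sameOperator(outer: AggExec, inner: AggExec): Boolean =
    outer.getClass == inner.getClass
}
```

Unlike a pattern match that lists every pair of operator types, this comparison keeps working when a new operator class is added.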

SparkQA commented Feb 11, 2017

Test build #72738 has finished for PR 15945 at commit 149f277.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Feb 11, 2017

Test build #72739 has finished for PR 15945 at commit 4f1240d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

maropu force-pushed the SPARK-12978 branch 2 times, most recently from 8e5d522 to ea586cf on March 15, 2017 00:47

SparkQA commented Mar 15, 2017

Test build #74571 has finished for PR 15945 at commit 8e5d522.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Mar 15, 2017

Test build #74573 has finished for PR 15945 at commit ea586cf.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Mar 15, 2017

Test build #74577 has finished for PR 15945 at commit 870222e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Mar 15, 2017

Test build #74580 has finished for PR 15945 at commit 11d2757.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member Author

maropu commented Jul 18, 2018

I'll close this for now.
