[SPARK-7215] made coalesce and repartition a part of the query plan #5762

brkyvz · 2015-04-28T23:53:06Z

Coalesce and repartition now show up as part of the query plan, rather than resulting in a new DataFrame.

cc @rxin
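For readers who want to see the effect described above, here is a minimal sketch. It assumes a Spark 2.x or later build on the classpath and uses the `SparkSession` entry point, which postdates this PR. The object name `RepartitionPlanDemo`, the helper `demo`, the app name, and the toy `range` data are all illustrative assumptions, not part of the patch.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Illustrative sketch only: names and data here are assumptions, not PR code.
object RepartitionPlanDemo {

  // Prints the plans and returns the partition counts after each call.
  def demo(spark: SparkSession): (Int, Int) = {
    val df: DataFrame = spark.range(0, 100).toDF("id")

    val repartitioned = df.repartition(4)     // asks for 4 partitions via a shuffle
    val coalesced = repartitioned.coalesce(2) // asks for 2 partitions without a shuffle

    // extended = true prints the logical plans as well as the physical plan,
    // so the repartitioning step can be inspected as a plan node.
    repartitioned.explain(true)
    coalesced.explain(true)

    (repartitioned.rdd.getNumPartitions, coalesced.rdd.getNumPartitions)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("repartition-plan-demo")
      .getOrCreate()
    try println(demo(spark))
    finally spark.stop()
  }
}
```

The exact text printed by `explain` varies across Spark versions, so no expected output is shown here.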

rxin · 2015-04-28T23:54:51Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala

import order

rxin · 2015-04-28T23:57:20Z

LGTM otherwise

SparkQA · 2015-04-29T00:02:37Z

Test build #31191 has finished for PR 5762 at commit f2e6af1.

This patch fails to build.
This patch merges cleanly.
This patch adds the following public classes (experimental):
- case class Coalesce(numPartitions: Int, shuffle: Boolean, child: LogicalPlan) extends UnaryNode
- case class Coalesce(numPartitions: Int, shuffle: Boolean, child: SparkPlan) extends UnaryNode
This patch does not change any dependencies.

SparkQA · 2015-04-29T00:07:57Z

Test build #31192 has finished for PR 5762 at commit 2c349b5.

This patch fails to build.
This patch merges cleanly.
This patch adds the following public classes (experimental):
- case class Coalesce(numPartitions: Int, shuffle: Boolean, child: LogicalPlan) extends UnaryNode
- case class Coalesce(numPartitions: Int, shuffle: Boolean, child: SparkPlan) extends UnaryNode
This patch does not change any dependencies.

brkyvz · 2015-04-29T00:14:39Z

sql/core/src/main/scala/org/apache/spark/sql/execution/basicOperators.scala

@rxin Should I rename this to Repartition? There are a lot of conflicts coming from catalyst and sql. In fact, the Coalesce function in catalyst fits its usage, which is to combine (elements) in a mass or whole. Here, we are basically repartitioning the dataset. Coalesce with a higher number of partitions sounds weird. It might also be weird to have two different types of Coalesce. What do you think?

rxin · 2015-04-29

Yeah, repartition sounds better.

marmbrus · 2015-04-29

Oh, or repartition sounds even better than CoalescePartitions.
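The naming argument in this thread can be illustrated with a toy model in plain Scala. This is not Spark's source: `Relation`, `PlanSketch`, and `render` are hypothetical names, and the tree is a stripped-down stand-in for catalyst's `LogicalPlan`. The mapping of `coalesce` to `shuffle = false` and `repartition` to `shuffle = true` is an assumption, mirroring how `RDD.repartition` is defined in terms of `RDD.coalesce` with shuffling enabled. The point of the sketch is that one node with the `(numPartitions, shuffle, child)` shape listed in the test reports can carry both operations.

```scala
// Toy model, not Spark's source: a stripped-down logical plan tree.
sealed trait LogicalPlan
final case class Relation(name: String) extends LogicalPlan
final case class Repartition(numPartitions: Int, shuffle: Boolean, child: LogicalPlan)
  extends LogicalPlan

object PlanSketch {
  // Hypothetical helpers: both operations wrap the child plan in the same
  // node type and differ only in the shuffle flag (an assumption, see above).
  def repartition(plan: LogicalPlan, n: Int): LogicalPlan =
    Repartition(n, shuffle = true, plan)

  def coalesce(plan: LogicalPlan, n: Int): LogicalPlan =
    Repartition(n, shuffle = false, plan)

  // Renders the tree one node per line, indenting children by two spaces.
  def render(plan: LogicalPlan, indent: Int = 0): String = {
    val pad = "  " * indent
    plan match {
      case Relation(name) =>
        s"${pad}Relation $name"
      case Repartition(n, shuffle, child) =>
        s"${pad}Repartition $n, $shuffle\n" + render(child, indent + 1)
    }
  }
}
```

Because each call only wraps the existing plan in a new node, chaining `PlanSketch.coalesce(PlanSketch.repartition(Relation("t"), 8), 2)` yields a nested tree that can be inspected or rewritten later, which is the "part of the query plan" behaviour the PR title describes.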

SparkQA · 2015-04-29T04:17:26Z

Test build #31215 has finished for PR 5762 at commit fa4509f.

This patch passes all tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
- case class CoalescePartitions(numPartitions: Int, shuffle: Boolean, child: LogicalPlan)
- case class CoalescePartitions(numPartitions: Int, shuffle: Boolean, child: SparkPlan)
This patch does not change any dependencies.

SparkQA · 2015-04-29T05:41:58Z

Test build #31234 has finished for PR 5762 at commit b1e76dd.

This patch passes all tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
- case class Repartition(numPartitions: Int, shuffle: Boolean, child: LogicalPlan)
- case class RepartitionByExpression(partitionExpressions: Seq[Expression], child: LogicalPlan)
- case class Repartition(numPartitions: Int, shuffle: Boolean, child: SparkPlan)
This patch does not change any dependencies.

rxin · 2015-04-29T05:47:46Z

Thanks. I've merged this in master.

Coalesce and repartition now show up as part of the query plan, rather than resulting in a new `DataFrame`.

cc rxin

Author: Burak Yavuz <[email protected]>

Closes apache#5762 from brkyvz/df-repartition and squashes the following commits:

b1e76dd [Burak Yavuz] added documentation on repartitions
5807e35 [Burak Yavuz] renamed coalescepartitions
fa4509f [Burak Yavuz] rename coalesce
2c349b5 [Burak Yavuz] address comments
f2e6af1 [Burak Yavuz] add ticks
686c90b [Burak Yavuz] made coalesce and repartition a part of the query plan

brkyvz added 2 commits April 28, 2015 16:49

made coalesce and repartition a part of the query plan

686c90b

add ticks

f2e6af1

rxin reviewed Apr 28, 2015

address comments

2c349b5

brkyvz reviewed Apr 29, 2015

brkyvz added 3 commits April 28, 2015 19:10

rename coalesce

fa4509f

renamed coalescepartitions

5807e35

added documentation on repartitions

b1e76dd

asfgit closed this in 271c4c6 Apr 29, 2015

brkyvz deleted the df-repartition branch February 3, 2019 20:55
