Conversation

@brkyvz
Contributor

@brkyvz brkyvz commented Apr 28, 2015

Coalesce and repartition now show up as part of the query plan, rather than resulting in a new DataFrame.

cc @rxin

Contributor

import order

@rxin
Contributor

rxin commented Apr 28, 2015

LGTM otherwise

@SparkQA

SparkQA commented Apr 29, 2015

Test build #31191 has finished for PR 5762 at commit f2e6af1.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class Coalesce(numPartitions: Int, shuffle: Boolean, child: LogicalPlan) extends UnaryNode
    • case class Coalesce(numPartitions: Int, shuffle: Boolean, child: SparkPlan) extends UnaryNode
  • This patch does not change any dependencies.

@SparkQA

SparkQA commented Apr 29, 2015

Test build #31192 has finished for PR 5762 at commit 2c349b5.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class Coalesce(numPartitions: Int, shuffle: Boolean, child: LogicalPlan) extends UnaryNode
    • case class Coalesce(numPartitions: Int, shuffle: Boolean, child: SparkPlan) extends UnaryNode
  • This patch does not change any dependencies.

Contributor Author

@rxin Should I rename this to Repartition? There are a lot of naming conflicts between catalyst and sql. The Coalesce function in catalyst fits its name, which means to combine (elements) into a mass or whole. Here, we are basically repartitioning the dataset, and Coalesce with a higher number of partitions sounds weird. It might also be weird to have two different types of Coalesce. What do you think?
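
To make the naming clash concrete, here is a rough sketch of the two meanings of "coalesce" in play. It is a toy model and does not use Spark: `coalesceValues` mimics the SQL-style expression that returns the first non-null argument, and `coalescePartitions` mimics merging partitions without a shuffle. Both function names and the modulo-based grouping are assumptions made for illustration; real partition coalescing also considers data locality.

```scala
object CoalesceMeanings {
  // Expression-level meaning (as in SQL COALESCE): the first defined value wins.
  def coalesceValues[A](values: Option[A]*): Option[A] =
    values.collectFirst { case Some(v) => v }

  // Partition-level meaning: merge existing partitions without a shuffle.
  // Toy model: partitions are plain Vectors, grouped by index modulo n.
  // Without a shuffle the partition count can only stay the same or shrink.
  def coalescePartitions[A](parts: Vector[Vector[A]], n: Int): Vector[Vector[A]] = {
    require(n > 0, "numPartitions must be positive")
    if (n >= parts.length) parts
    else
      parts.zipWithIndex
        .groupBy { case (_, idx) => idx % n }
        .toVector
        .sortBy { case (bucket, _) => bucket }
        .map { case (_, group) => group.flatMap { case (part, _) => part } }
  }

  def main(args: Array[String]): Unit = {
    val parts = Vector(Vector(1, 2), Vector(3), Vector(4, 5), Vector(6))
    println(coalesceValues(None, Some("b"), Some("c")))
    println(coalescePartitions(parts, 2))
    // Asking for more partitions than exist leaves the layout unchanged.
    println(coalescePartitions(parts, 8).length)
  }
}
```

In this model, asking for eight partitions from four changes nothing, which is why "coalesce to a higher number" reads oddly as a name for a general repartitioning node.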

Contributor

Yea repartition sounds better.

Contributor

Oh, or repartition sounds even better than CoalescePartitions

@SparkQA

SparkQA commented Apr 29, 2015

Test build #31215 has finished for PR 5762 at commit fa4509f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class CoalescePartitions(numPartitions: Int, shuffle: Boolean, child: LogicalPlan)
    • case class CoalescePartitions(numPartitions: Int, shuffle: Boolean, child: SparkPlan)
  • This patch does not change any dependencies.

@SparkQA

SparkQA commented Apr 29, 2015

Test build #31234 has finished for PR 5762 at commit b1e76dd.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class Repartition(numPartitions: Int, shuffle: Boolean, child: LogicalPlan)
    • case class RepartitionByExpression(partitionExpressions: Seq[Expression], child: LogicalPlan)
    • case class Repartition(numPartitions: Int, shuffle: Boolean, child: SparkPlan)
  • This patch does not change any dependencies.
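
For readers unfamiliar with what "part of the query plan" means here, the following is a minimal sketch, not the Spark implementation. `LogicalPlan`, `Relation`, `Frame`, and `explain` are toy stand-ins invented for illustration; only the field layout of `Repartition` is taken from the class list in the build report. The mapping of `repartition` to `shuffle = true` and `coalesce` to `shuffle = false` is an assumption.

```scala
// Toy stand-ins for the plan classes; these are NOT the Spark definitions.
sealed trait LogicalPlan
case class Relation(name: String) extends LogicalPlan
// Same field layout as the Repartition node listed in the build report.
case class Repartition(numPartitions: Int, shuffle: Boolean, child: LogicalPlan)
  extends LogicalPlan

// Hypothetical DataFrame-like wrapper that holds nothing but a plan.
case class Frame(plan: LogicalPlan) {
  // Assumption: repartition always shuffles, coalesce never does.
  def repartition(n: Int): Frame = Frame(Repartition(n, shuffle = true, plan))
  def coalesce(n: Int): Frame = Frame(Repartition(n, shuffle = false, plan))
}

object PlanDemo {
  // Render the plan tree one node per line, indenting each child by two spaces.
  def explain(plan: LogicalPlan, depth: Int = 0): String = {
    val pad = "  " * depth
    plan match {
      case Relation(name) => s"${pad}Relation $name"
      case Repartition(n, shuffle, child) =>
        s"${pad}Repartition $n, $shuffle\n" + explain(child, depth + 1)
    }
  }

  def main(args: Array[String]): Unit = {
    val df = Frame(Relation("events")).repartition(8).coalesce(2)
    println(explain(df.plan))
  }
}
```

Because each call only wraps the existing plan in a new node, the repartitioning steps stay visible when the plan is printed and remain available to later planning stages.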

@rxin
Contributor

rxin commented Apr 29, 2015

Thanks. I've merged this in master.

@asfgit asfgit closed this in 271c4c6 Apr 29, 2015
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request May 28, 2015
Coalesce and repartition now show up as part of the query plan, rather than resulting in a new `DataFrame`.

cc rxin

Author: Burak Yavuz <[email protected]>

Closes apache#5762 from brkyvz/df-repartition and squashes the following commits:

b1e76dd [Burak Yavuz] added documentation on repartitions
5807e35 [Burak Yavuz] renamed coalescepartitions
fa4509f [Burak Yavuz] rename coalesce
2c349b5 [Burak Yavuz] address comments
f2e6af1 [Burak Yavuz] add ticks
686c90b [Burak Yavuz] made coalesce and repartition a part of the query plan
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request Jun 12, 2015
nemccarthy pushed a commit to nemccarthy/spark that referenced this pull request Jun 19, 2015
@brkyvz brkyvz deleted the df-repartition branch February 3, 2019 20:55
4 participants