SPARK-1839: PySpark RDD#take() shouldn't always read from driver #922
Closed
Conversation
This patch simply ports over the Scala implementation of RDD#take(), which reads the first partition at the driver, then decides how many more partitions it needs to read and will possibly start a real job if it's more than 1. (Note that SparkContext#runJob(allowLocal=true) only runs the job locally if there's 1 partition selected and no parent stages.)
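As a rough illustration of the scan-and-scale strategy described above, here is a minimal sketch in terms of public PySpark APIs (RDD.getNumPartitions, SparkContext.runJob). The names take_sketch and take_up_to are made up for this example, and the allowLocal/local-execution detail is omitted; this is not the actual patch.

```python
def take_sketch(rdd, num):
    # Collect up to `num` elements, scanning as few partitions as possible.
    items = []
    total_parts = rdd.getNumPartitions()
    parts_scanned = 0

    while len(items) < num and parts_scanned < total_parts:
        # First pass looks at a single partition; later passes scale up the
        # number of partitions to try based on how many items came back.
        num_parts_to_try = 1
        if parts_scanned > 0:
            if not items:
                num_parts_to_try = total_parts - parts_scanned
            else:
                num_parts_to_try = int(1.5 * num * parts_scanned / len(items))

        left = num - len(items)

        def take_up_to(iterator, n=left):
            # Pull at most n elements from this partition's iterator.
            out = []
            for x in iterator:
                out.append(x)
                if len(out) >= n:
                    break
            return out

        partitions = list(range(parts_scanned,
                                 min(parts_scanned + num_parts_to_try, total_parts)))
        items += rdd.context.runJob(rdd, take_up_to, partitions)
        parts_scanned += num_parts_to_try

    return items[:num]
```

For example, take_sketch(sc.parallelize(range(1000), 10), 5) would only ever scan the first partition, while a call asking for more elements than the early partitions hold would launch follow-up jobs over progressively larger partition ranges.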
Merged build triggered.

Merged build started.

Merged build finished. All automated tests passed.

All automated tests passed.
Contributor

Thanks. I merged this into master.
pdeyhim pushed a commit to pdeyhim/spark-1 that referenced this pull request on Jun 25, 2014

This patch simply ports over the Scala implementation of RDD#take(), which reads the first partition at the driver, then decides how many more partitions it needs to read and will possibly start a real job if it's more than 1. (Note that SparkContext#runJob(allowLocal=true) only runs the job locally if there's 1 partition selected and no parent stages.)

Author: Aaron Davidson <[email protected]>

Closes apache#922 from aarondav/take and squashes the following commits:

fa06df9 [Aaron Davidson] SPARK-1839: PySpark RDD#take() shouldn't always read from driver
xiliu82 pushed a commit to xiliu82/spark that referenced this pull request on Sep 4, 2014

This patch simply ports over the Scala implementation of RDD#take(), which reads the first partition at the driver, then decides how many more partitions it needs to read and will possibly start a real job if it's more than 1. (Note that SparkContext#runJob(allowLocal=true) only runs the job locally if there's 1 partition selected and no parent stages.)

Author: Aaron Davidson <[email protected]>

Closes apache#922 from aarondav/take and squashes the following commits:

fa06df9 [Aaron Davidson] SPARK-1839: PySpark RDD#take() shouldn't always read from driver
flyrain pushed a commit to flyrain/spark that referenced this pull request on Sep 21, 2021

* Add Iceberg as a dep
* rdar://70004937 Rewrite row-level operations for Iceberg (apache#922)
* rdar://72811621 Control distribution and ordering during table creation (apache#939)

This PR adds support for defining distribution and ordering during table creation. This change is needed for feature parity with ADT in Spark 2. This PR adds new optional table creation clauses but it should only affect Iceberg users. This PR comes with new tests. More tests are in Iceberg.
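The exact table-creation clauses added by that fork's commit are not shown in the message above. As a rough point of reference only, upstream Iceberg's Spark SQL extensions already let you control partitioning at creation time and set write ordering afterwards, assuming a SparkSession named spark configured with the Iceberg catalog and SQL extensions:

```python
# Hypothetical illustration using upstream Iceberg Spark SQL syntax, not the
# fork's new CREATE TABLE clauses (their syntax is not given in this commit message).
spark.sql("""
    CREATE TABLE db.events (id BIGINT, ts TIMESTAMP, payload STRING)
    USING iceberg
    PARTITIONED BY (days(ts))
""")

# Upstream Iceberg exposes write ordering as a post-creation table setting.
spark.sql("ALTER TABLE db.events WRITE ORDERED BY ts")
```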
wangyum pushed a commit that referenced this pull request on May 26, 2023