
Conversation

@aarondav
Contributor

This patch simply ports over the Scala implementation of RDD#take(), which reads the first partition at the driver, then decides how many more partitions it needs to read, and starts a real job only if more than one additional partition is required. (Note that SparkContext#runJob(allowLocal=true) runs the job locally only when a single partition is selected and there are no parent stages.)
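
For context, here is a minimal pure-Python sketch of the scale-up strategy the description outlines. Partitions are modeled as plain lists, run_job is a hypothetical stand-in for SparkContext#runJob, and the names and the 1.5x growth factor follow the general approach rather than the exact merged code.

```python
# Sketch of the take() scale-up strategy (a simplified model, not the
# merged PySpark code). Partitions are plain Python lists; run_job is a
# hypothetical stand-in for SparkContext#runJob.

def run_job(partitions, func, partition_ids):
    """Evaluate func over each requested partition and concatenate results."""
    results = []
    for i in partition_ids:
        results.extend(func(iter(partitions[i])))
    return results

def take(partitions, num):
    items = []
    total_parts = len(partitions)
    parts_scanned = 0
    while len(items) < num and parts_scanned < total_parts:
        # The first pass reads a single partition; later passes grow the
        # partition count based on how sparse the scanned data looked.
        num_parts_to_try = 1
        if parts_scanned > 0:
            if not items:  # everything scanned so far was empty: try the rest
                num_parts_to_try = total_parts - parts_scanned
            else:          # estimate partitions needed, with a 50% margin
                num_parts_to_try = int(1.5 * num * parts_scanned / len(items))
        left = num - len(items)

        def take_up_to_left(iterator, limit=left):
            for _, item in zip(range(limit), iterator):
                yield item

        ids = range(parts_scanned,
                    min(parts_scanned + num_parts_to_try, total_parts))
        items += run_job(partitions, take_up_to_left, ids)
        parts_scanned += num_parts_to_try
    return items[:num]

# Example: take 5 elements from unevenly filled partitions.
print(take([[1], [], [2, 3], [4, 5, 6, 7]], 5))  # -> [1, 2, 3, 4, 5]
```

The point of the loop is that a take(5) on a mostly full first partition never launches a cluster-wide job, while a take(5) on sparse data ramps up quickly instead of scanning one partition at a time.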
@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished. All automated tests passed.

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15307/

@rxin
Contributor

rxin commented May 31, 2014

Thanks. I merged this into master.

asfgit closed this in 9909efc on May 31, 2014
pdeyhim pushed a commit to pdeyhim/spark-1 that referenced this pull request Jun 25, 2014

Author: Aaron Davidson <[email protected]>

Closes apache#922 from aarondav/take and squashes the following commits:

fa06df9 [Aaron Davidson] SPARK-1839: PySpark RDD#take() shouldn't always read from driver
xiliu82 pushed a commit to xiliu82/spark that referenced this pull request Sep 4, 2014
flyrain pushed a commit to flyrain/spark that referenced this pull request Sep 21, 2021
* Add Iceberg as a dep

* rdar://70004937 Rewrite row-level operations for Iceberg (apache#922)

* rdar://72811621 Control distribution and ordering during table creation (apache#939)

This PR adds support for defining distribution and ordering during table creation.

This change is needed for feature parity with ADT in Spark 2.

This PR adds new, optional table-creation clauses, but they should only affect Iceberg users.

This PR comes with new tests. More tests are in Iceberg.
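
The fork's new CREATE TABLE clauses are not shown in this thread, so the snippet below is only a rough public analogue: Apache Iceberg's Spark SQL extensions can declare the same write distribution and ordering after a table is created. A minimal PySpark sketch, assuming an Iceberg-enabled session and a hypothetical db.events table:

```python
# Illustrative analogue only: uses Apache Iceberg's public SQL extensions
# rather than the fork's (unshown) table-creation clauses. Assumes a
# SparkSession already configured with an Iceberg catalog and the Iceberg
# SQL extensions; db.events is a hypothetical table name.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

spark.sql("""
    CREATE TABLE db.events (id BIGINT, ts TIMESTAMP)
    USING iceberg
    PARTITIONED BY (days(ts))
""")

# Cluster incoming writes by partition, then sort rows within each task.
spark.sql(
    "ALTER TABLE db.events WRITE DISTRIBUTED BY PARTITION LOCALLY ORDERED BY ts"
)
```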
wangyum pushed a commit that referenced this pull request May 26, 2023