
Conversation

@sryza (Contributor) commented Nov 5, 2014

No description provided.

@SparkQA commented Nov 5, 2014

Test build #22921 has started for PR 3107 at commit 14ca79b.

  • This patch merges cleanly.

@SparkQA commented Nov 5, 2014

Test build #22921 has finished for PR 3107 at commit 14ca79b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22921/

Contributor

Should this say "number of shuffle partitions"? It's slightly weird to me to say "output" when this refers to something that is totally internal to Spark - it's output on the map side but input on the read side. In other cases I think "output" tends to mean things like saving data to HDFS, etc.

Contributor Author

My thinking was that Spark's APIs have no mention of the concept of a "shuffle partition" (e.g. the term is referenced nowhere on https://spark.apache.org/docs/latest/programming-guide.html), but even novice Spark users are meant to understand that every transformation has input and output RDDs and that every RDD has a number of partitions.

Maybe "Default number of partitions for the RDDs produced by operations like ..."?

Contributor

Ah, I see - what about "Default number of partitions in RDDs returned by join, reduceByKey..."?
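
To make the wording discussion concrete, here is a minimal sketch (not part of this PR) of the behavior the doc describes: when no explicit numPartitions argument is passed, shuffle operations like reduceByKey and join return RDDs whose partition count comes from spark.default.parallelism. The app name, master URL, and object name below are placeholders; on Spark versions contemporary with this PR (pre-1.3) you would also need `import org.apache.spark.SparkContext._` for the pair-RDD implicits.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object DefaultParallelismSketch {
  def main(args: Array[String]): Unit = {
    // spark.default.parallelism set explicitly; in local mode it would
    // otherwise default to the number of cores.
    val conf = new SparkConf()
      .setAppName("default-parallelism-sketch")
      .setMaster("local[4]")
      .set("spark.default.parallelism", "8")
    val sc = new SparkContext(conf)

    val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3)))

    // No numPartitions argument, so the RDD returned by the shuffle
    // picks up spark.default.parallelism.
    val counts = pairs.reduceByKey(_ + _)
    println(counts.partitions.length) // prints 8

    sc.stop()
  }
}
```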

@pwendell (Contributor) commented Nov 9, 2014

Had some minor wording questions.

@SparkQA commented Nov 10, 2014

Test build #23129 has started for PR 3107 at commit 37a1d19.

  • This patch merges cleanly.

@SparkQA commented Nov 10, 2014

Test build #23129 has finished for PR 3107 at commit 37a1d19.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23129/

@sryza (Contributor Author) commented Nov 10, 2014

The test failure looks unrelated.

@pwendell (Contributor)

LG - pulling it in.

@asfgit closed this in c6f4e70 Nov 10, 2014
asfgit pushed a commit that referenced this pull request Nov 10, 2014
Author: Sandy Ryza <[email protected]>

Closes #3107 from sryza/sandy-spark-4230 and squashes the following commits:

37a1d19 [Sandy Ryza] Clear up a couple things
34d53de [Sandy Ryza] SPARK-4230. Doc for spark.default.parallelism is incorrect

(cherry picked from commit c6f4e70)
Signed-off-by: Patrick Wendell <[email protected]>
