Skip to content

Conversation

@kayousterhout
Copy link
Contributor

@mateiz was excluding the time to write this final file from the shuffle write time intentional?

@sryza
Copy link
Contributor

sryza commented Feb 12, 2015

The shuffle write time is meant to be measured in nanoseconds, right?

Edited to remove my second complaint which didn't make sense.

@kayousterhout
Copy link
Contributor Author

Ah thanks for noticing that -- fixed! And yeah the other case works fine because it goes through the DiskBlockObjectWriter (sounds like you figured this out).

@sryza
Copy link
Contributor

sryza commented Feb 12, 2015

My other concern is that we're also measuring the time spent reading the spill files from disk. Though I suppose there's no way around this because we use transferTo? If others are OK with that, then LGTM.

@SparkQA
Copy link

SparkQA commented Feb 12, 2015

Test build #27337 has finished for PR 4559 at commit 5a59906.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Feb 12, 2015

Test build #27341 has finished for PR 4559 at commit d773276.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mateiz
Copy link
Contributor

mateiz commented Feb 12, 2015

I think this was just an oversight, it's good to include that time. The change looks good to me.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we write this metric if there was an exception thrown above? if so, maybe we should add this in the finally block. If not, then this is ok.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point -- done.

@ksakellis
Copy link

LGTM, @sryza I think its okay to include the reading time since this whole operation you can argue is for writing out the file - even if there is some reading.

@kayousterhout
Copy link
Contributor Author

I had the same thought as @ksakellis re:reading time. Thanks for looking at this all!

@SparkQA
Copy link

SparkQA commented Feb 12, 2015

Test build #27366 has finished for PR 4559 at commit ace156c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Feb 12, 2015

Test build #27369 has finished for PR 4559 at commit 94e4237.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kayousterhout
Copy link
Contributor Author

Jenkins, retest this please

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could be foreach since we don't use the return value (not a big deal at all)

@SparkQA
Copy link

SparkQA commented Feb 12, 2015

Test build #27374 has finished for PR 4559 at commit 94e4237.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Feb 12, 2015

Test build #27375 has finished for PR 4559 at commit 5c6f3d9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@andrewor14
Copy link
Contributor

Ok merging into master 1.3 1.2 thanks @kayousterhout

asfgit pushed a commit that referenced this pull request Feb 12, 2015
mateiz was excluding the time to write this final file from the shuffle write time intentional?

Author: Kay Ousterhout <[email protected]>

Closes #4559 from kayousterhout/SPARK-5762 and squashes the following commits:

5c6f3d9 [Kay Ousterhout] Use foreach
94e4237 [Kay Ousterhout] Removed open time metrics added inadvertently
ace156c [Kay Ousterhout] Moved metrics to finally block
d773276 [Kay Ousterhout] Use nano time
5a59906 [Kay Ousterhout] [SPARK-5762] Fix shuffle write time for sort-based shuffle

(cherry picked from commit 47c73d4)
Signed-off-by: Andrew Or <[email protected]>
asfgit pushed a commit that referenced this pull request Feb 12, 2015
mateiz was excluding the time to write this final file from the shuffle write time intentional?

Author: Kay Ousterhout <[email protected]>

Closes #4559 from kayousterhout/SPARK-5762 and squashes the following commits:

5c6f3d9 [Kay Ousterhout] Use foreach
94e4237 [Kay Ousterhout] Removed open time metrics added inadvertently
ace156c [Kay Ousterhout] Moved metrics to finally block
d773276 [Kay Ousterhout] Use nano time
5a59906 [Kay Ousterhout] [SPARK-5762] Fix shuffle write time for sort-based shuffle

(cherry picked from commit 47c73d4)
Signed-off-by: Andrew Or <[email protected]>
@asfgit asfgit closed this in 47c73d4 Feb 12, 2015
markhamstra pushed a commit to markhamstra/spark that referenced this pull request Feb 24, 2015
mateiz was excluding the time to write this final file from the shuffle write time intentional?

Author: Kay Ousterhout <[email protected]>

Closes apache#4559 from kayousterhout/SPARK-5762 and squashes the following commits:

5c6f3d9 [Kay Ousterhout] Use foreach
94e4237 [Kay Ousterhout] Removed open time metrics added inadvertently
ace156c [Kay Ousterhout] Moved metrics to finally block
d773276 [Kay Ousterhout] Use nano time
5a59906 [Kay Ousterhout] [SPARK-5762] Fix shuffle write time for sort-based shuffle

(cherry picked from commit 47c73d4)
Signed-off-by: Andrew Or <[email protected]>
@kayousterhout kayousterhout deleted the SPARK-5762 branch April 12, 2017 00:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants