Skip to content

Conversation

@SereneAnt
Copy link
Contributor

What changes were proposed in this pull request?

Traversable.toMap changed to 'collections.breakOut', that eliminates intermediate tuple collection creation, see Stack Overflow article.

How was this patch tested?

Unit tests run.
No performance tests performed yet.

Please review http://spark.apache.org/contributing.html before opening a pull request.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah OK that's what it is. Why is this different? the article you cited isn't comparing to toMap, and toMap doesn't make an intermediate collection.

Copy link
Contributor Author

@SereneAnt SereneAnt Jul 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The principles are the same,
sources.zipWithIndex.map {...}' allocates a collection of tuples, .toMapthen iterates and converts them into the map (hehe). breakOut is the implementation of CanBuildFrom, the implicit parameter passed toTraversable.map` method. Once used, it allows populating newborn map directly, without intermediate collection of tuples.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optimization nerds have already used it in the spark code:

@SereneAnt SereneAnt force-pushed the performance_toMap-breakOut branch from d7b73b5 to 5abe060 Compare July 24, 2017 03:32
@SereneAnt SereneAnt force-pushed the performance_toMap-breakOut branch from 5abe060 to 0ae9cc5 Compare July 25, 2017 02:01
@SparkQA
Copy link

SparkQA commented Jul 25, 2017

Test build #3850 has finished for PR 18693 at commit 0ae9cc5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Copy link
Member

srowen commented Jul 25, 2017

Merged to master

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants