[SPARK-18223] [CORE] Optimise PartitionedAppendOnlyMap implementation #15735
What changes were proposed in this pull request?
This class and the PartitionedPairBuffer class are both core Spark data structures that allow us to spill data to disk.
From the comment in ExternalSorter before instantiating said data structures:
// Data structures to store in-memory objects before we spill. Depending on whether we have an
// Aggregator set, we either put objects into an AppendOnlyMap where we combine them, or we
// store them in an array buffer.
All of our data within RDDs has a partition ID, and the ordering operations order by partition ID before any other criteria. Both data structures share a partitionKeyComparator from WritablePartitionedPairCollection.
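The ordering described above can be sketched as follows. This is a minimal illustration, not Spark's source: the object name `PartitionOrderingSketch`, the helper `partitionThenKey`, and the sample data are all hypothetical, and the element type is assumed to be a `(partitionId, key)` pair.

```scala
import java.util.Comparator

object PartitionOrderingSketch {
  // Hypothetical helper (not Spark's API): compare (partitionId, key) pairs
  // by partition ID first, and consult the key comparator only on ties.
  def partitionThenKey[K](keyComparator: Comparator[K]): Comparator[(Int, K)] =
    new Comparator[(Int, K)] {
      override def compare(a: (Int, K), b: (Int, K)): Int = {
        val byPartition = Integer.compare(a._1, b._1)
        if (byPartition != 0) byPartition else keyComparator.compare(a._2, b._2)
      }
    }

  // Sort some made-up records with the combined comparator.
  def sortedSample(): Seq[(Int, String)] = {
    val data = Array((2, "b"), (0, "z"), (2, "a"), (1, "m"), (0, "c"))
    // scala.math.Ordering extends java.util.Comparator, so it can be passed directly.
    val cmp = partitionThenKey[String](implicitly[Ordering[String]])
    data.sortWith((a, b) => cmp.compare(a, b) < 0).toSeq
  }
}
```

Calling `PartitionOrderingSketch.sortedSample()` groups the records by partition ID and orders keys only within each partition.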
While this change adds more code, the iterator wrapping it removes is what hurts performance, and avoiding that wrapping helps the inliner. With the wrapping removed, we observed a 3% PageRank performance increase on HiBench large with both IBM's SDK for Java and OpenJDK 8, because the inliner is better able to work out what is going on. This result was observed in combination with an optimised PartitionedPairBuffer implementation that I will also contribute.
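To make the idea of iterator wrapping concrete, here is a rough sketch that contrasts the two styles. It is an assumption-laden illustration rather than the code in this patch: `IteratorWrappingSketch`, `wrapped`, and `direct` are hypothetical names, and a plain array stands in for the real backing storage.

```scala
object IteratorWrappingSketch {
  // Wrapped style: an outer iterator forwards every call to an inner one,
  // so each element passes through two layers of hasNext/next calls.
  def wrapped(data: Array[(Int, String)]): Iterator[(Int, String)] = {
    val inner: Iterator[(Int, String)] = data.iterator
    new Iterator[(Int, String)] {
      override def hasNext: Boolean = inner.hasNext
      override def next(): (Int, String) = inner.next()
    }
  }

  // Direct style: a single iterator indexes the backing array itself,
  // leaving one small call site per element.
  def direct(data: Array[(Int, String)]): Iterator[(Int, String)] =
    new Iterator[(Int, String)] {
      private var pos = 0
      override def hasNext: Boolean = pos < data.length
      override def next(): (Int, String) = {
        if (!hasNext) throw new NoSuchElementException("end of data")
        val elem = data(pos)
        pos += 1
        elem
      }
    }
}
```

Both methods yield the same elements in the same order; the only difference is how many layers of calls sit between the consumer and the data. The sketch makes no performance claim of its own.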
How was this patch tested?
Existing unit tests and HiBench large, specifically the PageRank benchmark.