[SPARK-35263] [TEST] Refactor ShuffleBlockFetcherIteratorSuite to reduce duplicated code #32389
Force-pushed from 322593f to c96a16e.
This is a great refactor. I left some minor comments. Overall, looks good to me.
Force-pushed from 59137e2 to 596c3dd.
Thanks @Ngone51 for the suggestions! I put up two more commits. The first pretty directly answers your comments and has some other small fixes I noticed. The second one is a bit larger and was inspired by your comment about moving the helper methods from the It does reduce the size by another ~40 lines:
@Ngone51 do you have any more comments here? Thanks a lot for your comments so far!
Maybe this could be extended to support the case of multiple blockmanagers, e.g.,
    val blocksByAddress = Seq[(BlockManagerId, Seq[(BlockId, Long, Int)])](
      (localBmId, localBlocks.keys.map(blockId => (blockId, 1L, 0)).toSeq),
      (remoteBmId, remoteBlocks.keys.map(blockId => (blockId, 1L, 1)).toSeq),
      (hostLocalBmId, hostLocalBlocks.keys.map(blockId => (blockId, 1L, 1)).toSeq)
    ).toIterator
We can pass in a Map[BlockManagerId, (blocks, size, mapIndex)] instead.
WDYT?
Good idea! I focused on the single-BM case because it was simpler, but you're right that there was still a lot of common logic to be reduced. Actually, your suggestion made me realize that we only ever use this method (and the new multi-BM method I created) to create an iterator which is then passed to getShuffleIteratorWithDefaults, so I just made getShuffleIteratorWithDefaults directly accept a Map. I think it's very clean now, thank you for the suggestion!
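For readers following the thread, here is a rough sketch of the Map-based shape discussed above. It is not the code from this PR: `BmId`, `BlkId`, and `toBlocksByAddress` are hypothetical stand-ins so the snippet compiles without Spark, and the value layout `(blocks, size, mapIndex)` is assumed from the review comment.

```scala
// Hypothetical, simplified stand-ins for Spark's BlockManagerId and BlockId,
// defined only so this sketch compiles without Spark on the classpath.
final case class BmId(executorId: String)
final case class BlkId(name: String)

object BlocksByAddressSketch {
  // One fetched block described as (blockId, size in bytes, mapIndex).
  type BlockInfo = (BlkId, Long, Int)

  // Expand a map of blockManager -> (blocks, size, mapIndex) into an iterator
  // of (blockManager, Seq[(blockId, size, mapIndex)]) pairs. Every block under
  // one block manager shares the same size and mapIndex in this sketch.
  def toBlocksByAddress(
      blocksByAddress: Map[BmId, (Iterable[BlkId], Long, Int)]
  ): Iterator[(BmId, Seq[BlockInfo])] =
    blocksByAddress.iterator.map { case (bmId, (blocks, size, mapIndex)) =>
      (bmId, blocks.map(blockId => (blockId, size, mapIndex)).toSeq)
    }
}
```

With a helper of this shape, a test would only write the map literal and leave the per-block tuple expansion to the shared code.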
Force-pushed from 596c3dd to 196cb06.
Great comments, @Ngone51! Pushed up a new set of commits addressing your comments.
nit: I was oscillating between this being a Map vs Seq ... currently, a Map is fine based on how ShuffleBlockFetcherIterator is used ... but might be something we revisit in future.
mridulm left a comment
Just some minor nits, really nice work @xkrogen !
Force-pushed from b2abb87 to 63bf0c4.
otterc left a comment
Looks good to me. Just a minor nit
…especially instantiating the ShuffleBlockFetcherIterator
…ito syntax and pull out common calls to when(transfer.fetchBlocks) and verify(transfer, ...).fetchBlocks
…BlocksCount to verifyFetchBlocksInvocationCount. Restore helpful comments. Remove one redundant parameter override. Fix up a few minor issues such as unnecessary specification of very long types.
…sing it around in method signatures.
…dress which accepts a Map
…tly accept a map of block info
Force-pushed from 4fcc6be to eea80f5.
Ngone51 left a comment
LGTM!
What changes were proposed in this pull request?
Introduce new shared methods to ShuffleBlockFetcherIteratorSuite to replace copy-pasted code. Use modern, Scala-like Mockito Answer syntax.
Why are the changes needed?
ShuffleBlockFetcherIteratorSuite has tons of duplicate code, like spark/core/src/test/scala/org/apache/spark/storage/ShuffleBlockFetcherIteratorSuite.scala, lines 172 to 185 in 0494dc9.
Similarly but not as bad, there are many calls like the following
These changes result in about 10% reduction in both lines and characters in the file:
It also helps readability, e.g.:
Now I can clearly tell that maxBytesInFlight is the main parameter we're interested in here.
Does this PR introduce any user-facing change?
No, test only. There aren't even any behavior changes, just refactoring.
How was this patch tested?
Unit tests pass.
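To illustrate the readability point about maxBytesInFlight in the description above, here is a minimal sketch of a helper with default arguments. It is an assumption-laden illustration rather than the suite's actual helper: `FetcherSettings`, `settingsWithDefaults`, the parameters other than maxBytesInFlight, and all default values are hypothetical, and the real suite constructs a ShuffleBlockFetcherIterator, which is not reproduced here.

```scala
// Hypothetical settings holder used only for this sketch.
final case class FetcherSettings(
    maxBytesInFlight: Long,
    maxReqsInFlight: Int,
    detectCorrupt: Boolean)

object DefaultsSketch {
  // Every parameter has a default (the values are illustrative assumptions),
  // so a test names only the setting it is actually exercising.
  def settingsWithDefaults(
      maxBytesInFlight: Long = Long.MaxValue,
      maxReqsInFlight: Int = Int.MaxValue,
      detectCorrupt: Boolean = true): FetcherSettings =
    FetcherSettings(maxBytesInFlight, maxReqsInFlight, detectCorrupt)
}
```

A call such as `DefaultsSketch.settingsWithDefaults(maxBytesInFlight = 1000L)` then shows at a glance which parameter the test cares about, while the rest fall back to the shared defaults.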