[SPARK-32417] Fix flakyness of BlockManagerDecommissionIntegrationSuite #29226
Status: Closed
agrawaldevesh wants to merge 8 commits into apache:master from agrawaldevesh:block-manager-decom-flaky
Commits (8):
- 4521083 Make the block manager decommissioning test be less flaky (dagrawal3409)
- b9fc636 Incorporate @attilapiros's excellent observation (dagrawal3409)
- 38269f9 Remove some un-needed semaphores (dagrawal3409)
- 1096932 Added 2 task start tests: One of which we expect to fail and one that… (dagrawal3409)
- c95aca8 Clean it up: Remove the negative test which we know will fail (dagrawal3409)
- 8e96830 Tweak sleep times depending on when to decom (dagrawal3409)
- 5809a23 closure cleaner fix (dagrawal3409)
- 1b469fb Accumulator didn't quite work out of the box but sort of works (dagrawal3409)
Conversations
---
This goes against the purpose of this test: making sure that an executor which receives a decom while a task is running has the block migrated. Migrating at some other point is not the same as migrating during decommissioning.
---
In which case, I don't follow how this is supposed to work in the first place. Consider the root cause: you decommission an executor while a (result) task is running. The block manager associated with the executor enters the decom state. The task tries to write the persisted RDD block. The block manager fails that write because it is decom'd, thereby failing the task. The task reruns somewhere else and the job succeeds. Now you go to check whether that executor has migrated the block: that check fails because the block was never written in the first place.
So I am relaxing the notion of "migrate during" to mean "migrate while the job is running". The question is which executor you want to decommission and force a migration off of. Picking a wrong executor, as above, can mean that no migrations happen at all because no real blocks were written (or they were written and then discarded).
If that's the intent of the test, then I think we need to change the block manager decommissioning (production) code to realize that intent.
Without this PR, this test is fundamentally racy and thus provides low signal. The race is between the task's end and the block manager's decommission: if the task ends successfully before the decom, the test passes; otherwise it fails.
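To make that race concrete, here is a minimal Spark-free toy (all names invented for illustration): a "task" thread and a "decommission" thread run with no ordering between them, so whether the block ever lands on the executor comes down purely to timing.

```scala
import java.util.concurrent.atomic.AtomicBoolean

// Toy model of the race (no Spark involved; all names are illustrative).
// The "task" writes its cached block at some point; the "decommission"
// flips the block manager state at some other point; nothing orders them.
object DecomRaceSketch extends App {
  val decommissioned = new AtomicBoolean(false)
  @volatile var blockWritten = false

  val task = new Thread(() => {
    Thread.sleep(scala.util.Random.nextInt(10)) // the task "computes"
    // A decommissioned block manager rejects the put, so no block is written.
    if (!decommissioned.get()) blockWritten = true
  })
  val decom = new Thread(() => {
    Thread.sleep(scala.util.Random.nextInt(10)) // decom arrives "sometime"
    decommissioned.set(true)
  })

  task.start(); decom.start()
  task.join(); decom.join()

  // The "was the block migrated?" assertion can only pass if the block was
  // written before the decommission won the race:
  println(s"block written before decom: $blockWritten")
}
```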
---
So I think another way we could make sure the test covers what we want is to run a job repeatedly until all 3 executors come up.
The block manager (in decom state) does indeed refuse puts, but RDD computation on the executor goes through getOrElseUpdate, which immediately calls doPutIterator if there is no cache hit, before the iterator starts being computed. Since the check for whether the block manager is decommissioning occurs before the start of the computation, not at the end, we want to ensure that the block can be put (and then later migrated).
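To illustrate that ordering, here is a condensed, self-contained sketch of the control flow being described. It is not the actual Spark source (the real methods carry storage levels, class tags, and block locking), and the exception name is taken from later in this thread:

```scala
import scala.collection.mutable

// Sketch only: a toy of the put path described above, not Spark's code.
class BlockSavedOnDecommissionedBlockManagerException(blockId: String)
  extends Exception(s"Block $blockId cannot be saved on decommissioned block manager")

class BlockManagerSketch {
  @volatile var decommissioning: Boolean = false
  private val cache = mutable.Map.empty[String, Vector[Any]]

  def getOrElseUpdate(blockId: String, makeIterator: () => Iterator[Any]): Iterator[Any] =
    cache.get(blockId) match {
      case Some(values) => values.iterator // cache hit: no put at all
      case None         => doPutIterator(blockId, makeIterator)
    }

  private def doPutIterator(blockId: String, makeIterator: () => Iterator[Any]): Iterator[Any] = {
    // The decom check runs here, at put registration time, *before* the
    // iterator is computed, so work already in progress is never discarded.
    if (decommissioning) {
      throw new BlockSavedOnDecommissionedBlockManagerException(blockId)
    }
    val values = makeIterator().toVector // only now does the task compute
    cache(blockId) = values
    values.iterator
  }
}
```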
---
Maybe we should add a comment where we do the isDecommissioning check to explain that it is intentionally done there, so that we don't reject blocks which have already started computation. Do you think that would help?
---
I don't see this. This is what I see (BM stands for BlockManager): BM.doPut throws BlockSavedOnDecommissionedBlockManagerException if isDecommissioning, and this fails the job. I am not sure what it should do instead, off the top of my head, since I am new to this codepath. But it is certainly not continuing on with the computation (as far as I can tell).
If the intent of the test is indeed to test decommissioning "before the task has ended", then I think we need another BlockManager PR to actually realize it :-)
---
I don't believe we need another block manager PR to realize it; I think this is just a flaky test, because we took out the original sleep and tried to use TestUtils, which doesn't do a good enough job of waiting for the executor to fully come up.
Since doPut is called before the task starts computation, we don't throw away any of the in-progress data.
I'll make an alternate PR to this one to illustrate my understanding and hopefully we can iron it out and make the code path clear for everyone :)
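One hedged sketch of that direction (and of the earlier "wait until all 3 executors come up" idea), assuming Spark's test-scope TestUtils.waitUntilExecutorsUp helper; the executor count and timeout below are illustrative, not the suite's actual values:

```scala
import org.apache.spark.{SparkContext, TestUtils}

// Sketch: make sure the cluster is fully up before the job (and hence any
// decommission) starts, so the decommission target is guaranteed to exist.
def withAllExecutorsUp(sc: SparkContext)(body: => Unit): Unit = {
  // Blocks until 3 executors have registered, or throws on timeout (60s).
  TestUtils.waitUntilExecutorsUp(sc, 3, 60000L)
  body
}
```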
---
Cool! Looking forward to your alternate PR to fix this.
I don't think the issue is that the executor does not come up. It does come up. The issue, I think, is that it tries to run the task and fails because it is decommissioned.
I agree that doPut is called before the task starts computation. I didn't follow what you mean by "in-progress data"? Per my understanding, the task is simply failing before the iterator is even created.
Please check this for yourself by doing a decommission in the onTaskStart listener callback, and give the task a sleep long enough that it is still running when the decommissioning happens.
Hurray :-)
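A sketch of that experiment; `decommission` here is a stand-in parameter for the suite's actual decommission hook, which isn't shown in this thread:

```scala
import java.util.concurrent.Semaphore
import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskStart}

// Sketch: fire the decommission from onTaskStart, so it is guaranteed to
// land while the task is still running rather than racing with its end.
class DecomOnTaskStart(decommission: String => Unit) extends SparkListener {
  val decomFired = new Semaphore(0)

  override def onTaskStart(taskStart: SparkListenerTaskStart): Unit = {
    decommission(taskStart.taskInfo.executorId) // decom the task's executor
    decomFired.release()                        // let the driver observe it
  }
}
// The task body then needs a sleep long enough that the decommission
// completes before the task attempts to persist its block.
```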
---
@holdenk, please take a look at the PR once again. I have added another test to specifically capture the intent of decommissioning after the task has started but before it has ended.