@ameks94 ameks94 commented Nov 22, 2016

The fix catches `InvalidProtocolBufferException`, logs a warning, and removes the application's folder that contains the invalid data, so that the RM restart does not fail.

Additionally, I've added a catch for other exceptions that can occur while recovering a specific application, so that the RM does not fail when only one application's state can't be loaded.
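
A minimal sketch of the behaviour described above, not the actual patch. Everything in it is an assumption made for illustration: `CorruptStateException` is a hypothetical stand-in for protobuf's `InvalidProtocolBufferException`, `parseState` is a toy parser that treats any content not starting with `OK` as corrupt, and the one-folder-per-application layout with a `state` file is invented for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class TolerantRecoverySketch {

    /** Hypothetical stand-in for protobuf's InvalidProtocolBufferException. */
    static class CorruptStateException extends IOException {
        CorruptStateException(String message) {
            super(message);
        }
    }

    /** Hypothetical parser: content that does not start with "OK" counts as corrupt. */
    static String parseState(Path stateFile) throws IOException {
        String content = new String(Files.readAllBytes(stateFile), StandardCharsets.UTF_8);
        if (!content.startsWith("OK")) {
            throw new CorruptStateException("Cannot parse " + stateFile);
        }
        return content;
    }

    /** Deletes a directory tree, children before parents. */
    static void deleteRecursively(Path dir) throws IOException {
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir)) {
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }

    /** Recovers every application under root; one bad application does not abort the rest. */
    static Map<String, String> recoverAll(Path root) throws IOException {
        List<Path> appDirs;
        try (Stream<Path> children = Files.list(root)) {
            appDirs = children.filter(Files::isDirectory).collect(Collectors.toList());
        }
        Map<String, String> recovered = new TreeMap<>();
        for (Path appDir : appDirs) {
            String appId = appDir.getFileName().toString();
            try {
                recovered.put(appId, parseState(appDir.resolve("state")));
            } catch (CorruptStateException e) {
                // Invalid data: warn and remove the application's folder.
                System.err.println("WARN: removing " + appDir + ": " + e.getMessage());
                deleteRecursively(appDir);
            } catch (Exception e) {
                // Any other per-application failure: warn and continue with the next one.
                System.err.println("WARN: skipping " + appId + ": " + e);
            }
        }
        return recovered;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("rm-state");
        Path good = Files.createDirectories(root.resolve("app_1"));
        Path bad = Files.createDirectories(root.resolve("app_2"));
        Files.write(good.resolve("state"), "OK data".getBytes(StandardCharsets.UTF_8));
        Files.write(bad.resolve("state"), "garbage".getBytes(StandardCharsets.UTF_8));
        System.out.println("Recovered: " + recoverAll(root).keySet());
    }
}
```

Note that the sketch deletes data silently apart from the warning, which is the trade-off this approach makes in exchange for a successful start.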


ameks94 commented Nov 28, 2016

Updated the PR to fix the checkstyle and whitespace test failures.


ameks94 commented May 15, 2017

I realized that the current solution (allowing the RM to launch even with broken application data) is not a good one.
First, it's better to crash the RM when the file holding an application's state is broken. In that case we can report in more detail which file is broken (and perhaps recommend removing the application's folder with the broken data so that the RM can be launched successfully).
Second, and most importantly, the fix should find out why the file gets corrupted and how to prevent that corruption.
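
A minimal sketch of the fail-fast alternative, under the same illustrative assumptions as before: `StateRecoveryException`, `loadOrFail`, and the "content must start with `OK`" parsing rule are hypothetical and are not taken from the RM code. The point is only that the failure names the application, the broken file, and the folder an operator could remove.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class FailFastRecoverySketch {

    /** Hypothetical exception whose message identifies the broken file and a possible remedy. */
    static class StateRecoveryException extends IOException {
        StateRecoveryException(String appId, Path stateFile, Throwable cause) {
            super("Cannot recover state of application " + appId
                    + ": file " + stateFile + " could not be read or parsed."
                    + " Removing the folder " + stateFile.getParent()
                    + " would let the RM start, at the cost of losing this application's state.",
                    cause);
        }
    }

    /** Loads one application's state, or fails with a message that points at the file. */
    static String loadOrFail(String appId, Path stateFile) throws StateRecoveryException {
        try {
            String content = new String(Files.readAllBytes(stateFile), StandardCharsets.UTF_8);
            // Hypothetical validity rule standing in for protobuf parsing.
            if (!content.startsWith("OK")) {
                throw new IOException("unparseable content");
            }
            return content;
        } catch (IOException e) {
            // Do not swallow the error: rethrow with the details an operator needs.
            throw new StateRecoveryException(appId, stateFile, e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("rm-app");
        Path stateFile = dir.resolve("state");
        Files.write(stateFile, "garbage".getBytes(StandardCharsets.UTF_8));
        try {
            loadOrFail("app_1", stateFile);
        } catch (StateRecoveryException e) {
            System.err.println("FATAL: " + e.getMessage());
        }
    }
}
```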

@ameks94 ameks94 closed this May 15, 2017
@ameks94 ameks94 deleted the YARN-5924 branch November 17, 2017 13:28
@ameks94 ameks94 restored the YARN-5924 branch November 17, 2017 13:28
This was referenced Aug 20, 2019
shanthoosh added a commit to shanthoosh/hadoop that referenced this pull request Oct 15, 2019
Author: Shanthoosh Venkataraman <[email protected]>

Reviewers: Navina Ramesh<[email protected]>

Closes apache#164 from shanthoosh/fix-test-jmx-server-1