-
Couldn't load subscription status.
- Fork 9.1k
HDFS-16043. Add markedDeleteBlockScrubberThread to delete blocks asynchronously #3063
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @zhuxiangyi for your works here. It is very interesting improvement. Leave some minor comment inline. PTAL.
cc @jojochuang @ayushtkn @goiri Would you mind give another checks?
...ct/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
Outdated
Show resolved
Hide resolved
...ct/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
Outdated
Show resolved
Hide resolved
...ct/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
Outdated
Show resolved
Hide resolved
hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
Outdated
Show resolved
Hide resolved
...oject/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFileTruncate.java
Outdated
Show resolved
Hide resolved
...ect/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeMXBean.java
Outdated
Show resolved
Hide resolved
|
@zhuxiangyi This PR has conflict to trunk now. Please rebase it first and check failed unit test if it is related to changes. Thanks. |
3bee0e3 to
a92b060
Compare
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
@Hexiaoqiao Thank you for your review, the above check failure seems to have nothing to do with this patch. |
|
Please fix the checkstyle reported by Yetus first. Other looks good to me. I would like to give my +1 when fix that. Before commit to trunk we should wait for other guys give another reviews. Thanks @zhuxiangyi . |
|
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the work and sharing the flame graph, which makes it easy to validate the change.
However, I am still not able to understand why the change improves delete performance. The delete op is done in two steps, step 1 acquire lock, collect blocks, release lock. step 2 acquire lock, delete blocks, release lock.
The change essentially moves the step2 to another thread. IMO, this approach reduces client perceived latency, which is good. But deleting the blocks still requires holding namespace lock. Why does it avoid NN unresponsiveness?
Is it because instead of releasing the lock after a specified number of blocks, it releases the lock after an absolute time. I can image the absolute time is a better metric because deleting a block does take a variable duration of time, not constant.
A few minor comments changes requested:
...ct/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
Outdated
Show resolved
Hide resolved
...ct/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
Outdated
Show resolved
Hide resolved
...s-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
Outdated
Show resolved
Hide resolved
...ct/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
Outdated
Show resolved
Hide resolved
@jojochuang Thanks for your comment and review, as you commented, the current modification is only to delete the block asynchronously. The QuotaCount calculation optimization described in jira can reduce the time to collect blocks. I plan to open a new problem to solve it. |
|
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'll review.
can you rebase the PR? it's got conflicts.
53f1dfb to
20633a5
Compare
@jojochuang already rebase. |
...ct/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
Outdated
Show resolved
Hide resolved
|
💔 -1 overall
This message was automatically generated. |
...ct/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
Outdated
Show resolved
Hide resolved
35ddddb to
e9999ac
Compare
|
💔 -1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
Can we update the subject of this PR? As mentioned in a previous comment this PR makes block removal asynchronous and does actually speed up delete. #3063 (comment) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. +1 from my side.
Please consider the following improvements at next loop.
A. Just notice that there are several other operations also invoke remove Block as Wei-Chiu mentioned above, I think we should process them asynchronously.
B. Pending deletion blocks number will be concerned by end user IMO, it is better to add it as another metric index.
Thanks @zhuxiangyi for your works!
|
Will commit to trunk wait for two days if no more other comments. |
91c8940 to
108e96c
Compare
|
Added pendingDeleteBlocksCount metrics please review @Hexiaoqiao |
|
💔 -1 overall
This message was automatically generated. |
108e96c to
0d1f846
Compare
|
🎊 +1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. +1.
|
Will commit to trunk once jenkins completed. |
|
🎊 +1 overall
This message was automatically generated. |
|
@zhuxiangyi Do you mind to fix the checkstyle refer to https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3063/19/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt. Thanks. |
|
@Hexiaoqiao resubmitted |
|
Commit to trunk based on build result (#3063 (comment)) and checkstyle fix only(f6f2793). Thanks @zhuxiangyi for your contribution. Thanks @jojochuang and @tomscut for you reviews. |
|
@zhuxiangyi It is conflict when cherry-pick to other active branches. Do you mind to create another PR to branch-3.3? |
|
ok, I will submit it later. |
|
💔 -1 overall
This message was automatically generated. |
…chronously (apache#3063). Contributed by Xiangyi Zhu. Reviewed-by: tomscut <[email protected]> Signed-off-by: Wei-Chiu Chuang <[email protected]> Signed-off-by: He Xiaoqiao <[email protected]>
HDFS-16043