@ZanderXu
Contributor

Description of PR

While investigating some abnormal block replication cases in our prod environment, I found that BlockManager does not output some logs in addStoredBlock even though logEveryBlock is true. The message is written at DEBUG level, so it is dropped whenever the logger runs at INFO.

I think we should change the log level from DEBUG to INFO when logEveryBlock is true, so that we can locate such abnormal cases more easily.

private Block addStoredBlock(final BlockInfo block,
                             final Block reportedBlock,
                             DatanodeStorageInfo storageInfo,
                             DatanodeDescriptor delNodeHint,
                             boolean logEveryBlock)
    throws IOException {
  ...
  if (logEveryBlock) {
    blockLog.debug("BLOCK* addStoredBlock: {} is added to {} (size={})",
        node, storedBlock, storedBlock.getNumBytes());
  }
  ...
}
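
To illustrate the behaviour described above, here is a minimal sketch. It is not BlockManager's code: it uses java.util.logging as a stand-in for the SLF4J-style blockLog (FINE plays the role of DEBUG), and the class name AddStoredBlockLogSketch and the helper emitted are hypothetical. With the logger at INFO, the guarded message is dropped when written at the debug-equivalent level and kept when written at INFO.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical stand-in for the guarded log statement in addStoredBlock.
class AddStoredBlockLogSketch {

  /** Collects messages so the effect of the log level is observable. */
  static final class CapturingHandler extends Handler {
    final List<String> messages = new ArrayList<>();

    @Override
    public void publish(LogRecord record) {
      if (isLoggable(record)) {
        messages.add(record.getMessage());
      }
    }

    @Override
    public void flush() { }

    @Override
    public void close() { }
  }

  /**
   * Writes one "block added" message at messageLevel when logEveryBlock is
   * true, on a logger configured at INFO, and returns what was emitted.
   */
  static List<String> emitted(Level messageLevel, boolean logEveryBlock) {
    Logger blockLog = Logger.getAnonymousLogger();
    blockLog.setUseParentHandlers(false); // keep output out of the console
    blockLog.setLevel(Level.INFO);        // typical production setting
    CapturingHandler handler = new CapturingHandler();
    handler.setLevel(Level.ALL);
    blockLog.addHandler(handler);

    if (logEveryBlock) {
      blockLog.log(messageLevel,
          "BLOCK* addStoredBlock: node-1 is added to blk_1 (size=1024)");
    }
    return handler.messages;
  }

  public static void main(String[] args) {
    System.out.println("debug-equivalent lines: "
        + emitted(Level.FINE, true).size());
    System.out.println("info lines: " + emitted(Level.INFO, true).size());
  }
}
```

The logEveryBlock flag decides whether the statement is reached at all; the logger level then decides whether the message survives, which is why the flag alone had no visible effect before the change.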

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 54s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+1 💚 mvninstall 40m 25s trunk passed
+1 💚 compile 1m 43s trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 compile 1m 34s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 checkstyle 1m 20s trunk passed
+1 💚 mvnsite 1m 39s trunk passed
+1 💚 javadoc 1m 20s trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javadoc 1m 42s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 3m 54s trunk passed
+1 💚 shadedclient 26m 17s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 25s the patch passed
+1 💚 compile 1m 36s the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javac 1m 36s the patch passed
+1 💚 compile 1m 29s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 javac 1m 29s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 1m 5s the patch passed
+1 💚 mvnsite 1m 31s the patch passed
+1 💚 javadoc 1m 1s the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javadoc 1m 32s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 3m 44s the patch passed
+1 💚 shadedclient 27m 8s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 368m 59s hadoop-hdfs in the patch passed.
+1 💚 asflicense 1m 0s The patch does not generate ASF License warnings.
488m 44s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4559/1/artifact/out/Dockerfile
GITHUB PR #4559
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux d9839405a066 4.15.0-166-generic #174-Ubuntu SMP Wed Dec 8 19:07:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / e1406fa
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4559/1/testReport/
Max. process+thread count 2036 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4559/1/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

      (node.isDecommissioned() || node.isDecommissionInProgress()) ? 0 : 1;
  if (logEveryBlock) {
-   blockLog.debug("BLOCK* addStoredBlock: {} is added to {} (size={})",
+   blockLog.info("BLOCK* addStoredBlock: {} is added to {} (size={})",
Contributor

+1. Please double-check whether it could flood the general log in some corner cases.

Contributor Author

Thanks @Hexiaoqiao for your review. We have released it in our prod environment and it works well.

logEveryBlock=true means that we should output some logs about this block. I have also checked the callers, and they pass it as expected.

Contributor

@Hexiaoqiao Hexiaoqiao left a comment

LGTM. +1 from my side.

@slfan1989
Contributor

@ZanderXu Thank you for your contribution. The PR looks good, but I would like to know whether this could be solved with the dynamic debug log instead.

I remember that Hadoop provides a switch that turns on the debug log directly. This command can temporarily modify the log level.

hadoop daemonlog -setlevel <host:port> <name> <level>

If you change the log level directly, won't the log become very large? After all, this is a block-level log.

@ZanderXu
Contributor Author

ZanderXu commented Jul 26, 2022

Thanks @slfan1989 for your review.

This command can temporarily modify the log level.

This command can change the log level of the logger, but it will then print all of that logger's debug logs, not only this one.
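
The difference can be sketched as follows. This is an illustration only: java.util.logging stands in for the real logger (FINE plays the role of DEBUG), and the class name LoggerWideLevelSketch and the two "other" messages are hypothetical. Lowering the logger's level enables every debug statement that shares the logger, not just the addStoredBlock one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical sketch: one logger shared by several debug statements.
class LoggerWideLevelSketch {

  /** Runs three debug-equivalent statements and returns those emitted. */
  static List<String> run(Level loggerLevel) {
    final List<String> out = new ArrayList<>();
    Logger blockLog = Logger.getAnonymousLogger();
    blockLog.setUseParentHandlers(false); // keep output out of the console
    blockLog.setLevel(loggerLevel);
    Handler capture = new Handler() {
      @Override
      public void publish(LogRecord record) {
        out.add(record.getMessage());
      }

      @Override
      public void flush() { }

      @Override
      public void close() { }
    };
    capture.setLevel(Level.ALL);
    blockLog.addHandler(capture);

    // Only the first message is the one we are actually interested in.
    blockLog.fine("BLOCK* addStoredBlock: node-1 is added to blk_1");
    blockLog.fine("BLOCK* some other debug message (hypothetical)");
    blockLog.fine("BLOCK* yet another debug message (hypothetical)");
    return out;
  }

  public static void main(String[] args) {
    System.out.println("at INFO: " + run(Level.INFO).size() + " lines");
    System.out.println("at debug-equivalent: " + run(Level.FINE).size()
        + " lines");
  }
}
```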

If you change the log level directly, won't the log become very large? After all, this is a block-level log.

The log size is controllable because logEveryBlock gates it. And although it is a block-level log, it is only printed when a replica of the block changes.
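
As a rough sketch of how a caller can bound the volume through logEveryBlock: the code below is hypothetical and is not taken from BlockManager. The names LogEveryBlockCapSketch, processReport and maxToLog are assumptions made for illustration, and the cap itself is one possible policy rather than something stated in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of a caller bounding per-block logging.
class LogEveryBlockCapSketch {

  /** Stand-in for addStoredBlock: records a line only when asked to. */
  static void addStoredBlock(String block, boolean logEveryBlock,
                             List<String> log) {
    if (logEveryBlock) {
      log.add("BLOCK* addStoredBlock: " + block);
    }
  }

  /** Processes every block but logs at most maxToLog of them (assumed cap). */
  static List<String> processReport(List<String> blocks, int maxToLog) {
    List<String> log = new ArrayList<>();
    int processed = 0;
    for (String block : blocks) {
      // The caller, not the callee, decides whether this block is logged.
      addStoredBlock(block, processed < maxToLog, log);
      processed++;
    }
    return log;
  }

  public static void main(String[] args) {
    List<String> blocks = Arrays.asList("blk_1", "blk_2", "blk_3", "blk_4");
    for (String line : processReport(blocks, 2)) {
      System.out.println(line);
    }
  }
}
```

Because the decision sits with the caller, the number of lines written stays bounded regardless of how many blocks are processed.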

Maybe we should think about how it is used. This log is very helpful for locating abnormal cases involving block replicas, such as complete failures, missing blocks, etc.

@slfan1989
Contributor

The log size is controllable because logEveryBlock gates it. And although it is a block-level log, it is only printed when a replica of the block changes.

Maybe we should think about how it is used. This log is very helpful for locating abnormal cases involving block replicas, such as complete failures, missing blocks, etc.

Thanks for your explanation; I understand your changes now.

@Hexiaoqiao Hexiaoqiao merged commit a5adc27 into apache:trunk Jul 28, 2022
@Hexiaoqiao
Contributor

Committed to trunk. Thanks @ZanderXu for your contributions. Thanks @slfan1989 for your comments.

@tasanuma
Member

Hi, @ZanderXu.
May I ask what percentage of the NameNode's total logs in your cluster are BLOCK* addStoredBlock: logs? Doesn't this change affect the performance of the NameNode?
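
One way to measure that share is sketched below, assuming the NameNode log is plain text with one entry per line. The class name AddStoredBlockShareSketch and the sample lines are hypothetical; only the "BLOCK* addStoredBlock:" marker comes from the log statement discussed in this PR.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for estimating the share of addStoredBlock lines.
class AddStoredBlockShareSketch {

  static final String MARKER = "BLOCK* addStoredBlock:";

  /** Returns the percentage (0 to 100) of lines containing MARKER. */
  static double sharePercent(List<String> lines) {
    if (lines.isEmpty()) {
      return 0.0;
    }
    long hits = lines.stream().filter(l -> l.contains(MARKER)).count();
    return 100.0 * hits / lines.size();
  }

  public static void main(String[] args) {
    // Made-up sample lines; a real run would read the NameNode log file.
    List<String> sample = Arrays.asList(
        "INFO BlockStateChange: BLOCK* addStoredBlock: node-1 is added to blk_1 (size=1024)",
        "INFO sample: unrelated line 1",
        "INFO sample: unrelated line 2",
        "INFO sample: unrelated line 3");
    System.out.println(sharePercent(sample) + "% of lines are addStoredBlock");
  }
}
```

For an actual log file, the lines could be loaded with java.nio.file.Files.readAllLines before calling sharePercent. This only measures log volume; it says nothing by itself about the NameNode's performance.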

HarshitGupta11 pushed a commit to HarshitGupta11/hadoop that referenced this pull request Nov 28, 2022
5 participants