Skip to content

Conversation

@DieterDP-ng
Copy link
Contributor

Adds javadoc mentioning the concurrent usage and thread-safety need of FileCleanerDelegate#getDeletableFiles.

Fixes a potential thread-safety issue in BackupHFileCleaner: this class tracks timestamps to block the deletion of recently loaded HFiles that might be needed for backup purposes. The timestamps were being registered from inside the concurrent method, which could result in recently added files getting deleted. Moved the timestamp registration to the postClean method, which is called only a single time per cleaner run, so recently loaded HFiles are in fact protected from deletion.

@DieterDP-ng
Copy link
Contributor Author

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

Adds javadoc mentioning the concurrent usage and thread-safety need of
FileCleanerDelegate#getDeletableFiles.

Fixes a potential thread-safety issue in BackupHFileCleaner: this class
tracks timestamps to block the deletion of recently loaded HFiles that
might be needed for backup purposes. The timestamps were being registered
from inside the concurrent method, which could result in recently added
files getting deleted. Moved the timestamp registration to the postClean
method, which is called only a single time per cleaner run, so recently
loaded HFiles are in fact protected from deletion.
@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@DieterDP-ng
Copy link
Contributor Author

Testing failures seem unrelated to the PR.

// timestamp of 2nd most recent read from backup system table
private long secondPrevReadFromBackupTbl = 0;
// timestamp of most recent completed cleaning run
private volatile long previousCleaningCompletionTimestamp = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the first run of the chore during the process lifecycle will always do nothing? Is there something else we can base it's initialization value on?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. That was also the (intended, but not actual) behavior of this cleaner before this PR I think.

I guess we can just initialize this to X time before the current time. Could pick X rudimentary, I think a minute is sufficient, or we could set X to the interval in which the cleaner runs, if I trace the code correctly, that would be the value of hbase.master.cleaner.interval.

@Apache-HBase
Copy link

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 37s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
_ master Compile Tests _
+0 🆗 mvndep 0m 19s Maven dependency ordering for branch
+1 💚 mvninstall 5m 19s master passed
+1 💚 compile 6m 24s master passed
+1 💚 checkstyle 1m 35s master passed
+1 💚 spotbugs 3m 23s master passed
+1 💚 spotless 1m 17s branch has no errors when running spotless:check.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 14s Maven dependency ordering for patch
+1 💚 mvninstall 4m 52s the patch passed
+1 💚 compile 5m 50s the patch passed
+1 💚 javac 5m 50s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 1m 20s the patch passed
+1 💚 spotbugs 4m 8s the patch passed
+1 💚 hadoopcheck 17m 35s Patch does not cause any errors with Hadoop 3.3.6 3.4.1.
+1 💚 spotless 1m 13s patch has no errors when running spotless:check.
_ Other Tests _
+1 💚 asflicense 0m 28s The patch does not generate ASF License warnings.
65m 21s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7360/3/artifact/yetus-general-check/output/Dockerfile
GITHUB PR #7360
Optional Tests dupname asflicense javac spotbugs checkstyle codespell detsecrets compile hadoopcheck hbaseanti spotless
uname Linux 63dca720c8f6 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 5d2e22d
Default Java Eclipse Adoptium-17.0.11+9
Max. process+thread count 88 (vs. ulimit of 30000)
modules C: hbase-server hbase-backup U: .
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7360/3/console
versions git=2.34.1 maven=3.9.8 spotbugs=4.7.3
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

Copy link

@krconv krconv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear to me why the cleaner bases the HFile "grace period" threshold on the last run of the cleaner; to me it seems like just using a threshold of 60 seconds before preClean() is called (an therefore 60 seconds before the files are traversed on disk and before the backup table is scanned) would be simpler. But agree the logic here was flawed and that this seems to fix it

private volatile long previousCleaningCompletionTimestamp = 0;

@Override
public void postClean() {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this definitely looks like an oversight from the original implementation, and that this change corrects it to work as it seems it was originally intended

@Apache-HBase
Copy link

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 49s Docker mode activated.
-0 ⚠️ yetus 0m 3s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+0 🆗 mvndep 0m 21s Maven dependency ordering for branch
+1 💚 mvninstall 5m 43s master passed
+1 💚 compile 2m 10s master passed
+1 💚 javadoc 1m 21s master passed
+1 💚 shadedjars 10m 17s branch has no errors when building our shaded downstream artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 28s Maven dependency ordering for patch
+1 💚 mvninstall 5m 18s the patch passed
+1 💚 compile 2m 15s the patch passed
+1 💚 javac 2m 15s the patch passed
+1 💚 javadoc 1m 23s the patch passed
+1 💚 shadedjars 9m 13s patch has no errors when building our shaded downstream artifacts.
_ Other Tests _
+1 💚 unit 248m 59s hbase-server in the patch passed.
+1 💚 unit 11m 19s hbase-backup in the patch passed.
304m 38s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7360/3/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile
GITHUB PR #7360
Optional Tests javac javadoc unit compile shadedjars
uname Linux dbdc8a3840cd 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 5d2e22d
Default Java Eclipse Adoptium-17.0.11+9
Test Results https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7360/3/testReport/
Max. process+thread count 4535 (vs. ulimit of 30000)
modules C: hbase-server hbase-backup U: .
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7360/3/console
versions git=2.34.1 maven=3.9.8
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

@ndimiduk ndimiduk merged commit d1bce57 into apache:master Oct 10, 2025
1 check passed
ndimiduk pushed a commit to ndimiduk/hbase that referenced this pull request Oct 10, 2025
)

Adds javadoc mentioning the concurrent usage and thread-safety need of
FileCleanerDelegate#getDeletableFiles.

Fixes a potential thread-safety issue in BackupHFileCleaner: this class
tracks timestamps to block the deletion of recently loaded HFiles that
might be needed for backup purposes. The timestamps were being registered
from inside the concurrent method, which could result in recently added
files getting deleted. Moved the timestamp registration to the postClean
method, which is called only a single time per cleaner run, so recently
loaded HFiles are in fact protected from deletion.

Signed-off-by: Nick Dimiduk <[email protected]>
@ndimiduk
Copy link
Member

Merged to master and backport to branch-3 is clean. Backport to branch-2 has minor conflict in the unit test. I think this means we've missed a backport along the way. It's a minor thing re: the callCleaner() helper method. @DieterDP-ng shall I naively resolve the conflict, or is there a commit we're missing -- do you mind taking a look?

@@@ -108,12 -108,12 +108,21 @@@ public class TestBackupHFileCleaner 
      Iterable<FileStatus> deletable;
  
      // The first call will not allow any deletions because of the timestamp mechanism.
++<<<<<<< HEAD
 +    deletable = cleaner.getDeletableFiles(Arrays.asList(file1, file1Archived, file2, file3));
 +    assertEquals(Collections.emptySet(), Sets.newHashSet(deletable));
 +
 +    // No bulk loads registered, so all files can be deleted.
 +    deletable = cleaner.getDeletableFiles(Arrays.asList(file1, file1Archived, file2, file3));
 +    assertEquals(Sets.newHashSet(file1, file1Archived, file2, file3), Sets.newHashSet(deletable));
++=======
+     deletable = callCleaner(cleaner, List.of(file1, file1Archived, file2, file3));
+     assertEquals(Set.of(), Sets.newHashSet(deletable));
+ 
+     // No bulk loads registered, so all files can be deleted.
+     deletable = callCleaner(cleaner, List.of(file1, file1Archived, file2, file3));
+     assertEquals(Set.of(file1, file1Archived, file2, file3), Sets.newHashSet(deletable));
++>>>>>>> 64436d408d (HBASE-29604 BackupHFileCleaner uses flawed time based check (#7360))
  
      // Register some bulk loads.

ndimiduk pushed a commit that referenced this pull request Oct 12, 2025
Adds javadoc mentioning the concurrent usage and thread-safety need of
FileCleanerDelegate#getDeletableFiles.

Fixes a potential thread-safety issue in BackupHFileCleaner: this class
tracks timestamps to block the deletion of recently loaded HFiles that
might be needed for backup purposes. The timestamps were being registered
from inside the concurrent method, which could result in recently added
files getting deleted. Moved the timestamp registration to the postClean
method, which is called only a single time per cleaner run, so recently
loaded HFiles are in fact protected from deletion.

Signed-off-by: Nick Dimiduk <[email protected]>
ndimiduk pushed a commit to ndimiduk/hbase that referenced this pull request Oct 13, 2025
)

Adds javadoc mentioning the concurrent usage and thread-safety need of
FileCleanerDelegate#getDeletableFiles.

Fixes a potential thread-safety issue in BackupHFileCleaner: this class
tracks timestamps to block the deletion of recently loaded HFiles that
might be needed for backup purposes. The timestamps were being registered
from inside the concurrent method, which could result in recently added
files getting deleted. Moved the timestamp registration to the postClean
method, which is called only a single time per cleaner run, so recently
loaded HFiles are in fact protected from deletion.

Signed-off-by: Nick Dimiduk <[email protected]>
ndimiduk pushed a commit to ndimiduk/hbase that referenced this pull request Oct 13, 2025
)

Adds javadoc mentioning the concurrent usage and thread-safety need of
FileCleanerDelegate#getDeletableFiles.

Fixes a potential thread-safety issue in BackupHFileCleaner: this class
tracks timestamps to block the deletion of recently loaded HFiles that
might be needed for backup purposes. The timestamps were being registered
from inside the concurrent method, which could result in recently added
files getting deleted. Moved the timestamp registration to the postClean
method, which is called only a single time per cleaner run, so recently
loaded HFiles are in fact protected from deletion.

Signed-off-by: Nick Dimiduk <[email protected]>
ndimiduk pushed a commit to ndimiduk/hbase that referenced this pull request Oct 13, 2025
)

Adds javadoc mentioning the concurrent usage and thread-safety need of
FileCleanerDelegate#getDeletableFiles.

Fixes a potential thread-safety issue in BackupHFileCleaner: this class
tracks timestamps to block the deletion of recently loaded HFiles that
might be needed for backup purposes. The timestamps were being registered
from inside the concurrent method, which could result in recently added
files getting deleted. Moved the timestamp registration to the postClean
method, which is called only a single time per cleaner run, so recently
loaded HFiles are in fact protected from deletion.

Signed-off-by: Nick Dimiduk <[email protected]>
ndimiduk pushed a commit that referenced this pull request Oct 14, 2025
Adds javadoc mentioning the concurrent usage and thread-safety need of
FileCleanerDelegate#getDeletableFiles.

Fixes a potential thread-safety issue in BackupHFileCleaner: this class
tracks timestamps to block the deletion of recently loaded HFiles that
might be needed for backup purposes. The timestamps were being registered
from inside the concurrent method, which could result in recently added
files getting deleted. Moved the timestamp registration to the postClean
method, which is called only a single time per cleaner run, so recently
loaded HFiles are in fact protected from deletion.

Signed-off-by: Nick Dimiduk <[email protected]>
ndimiduk pushed a commit that referenced this pull request Oct 14, 2025
Adds javadoc mentioning the concurrent usage and thread-safety need of
FileCleanerDelegate#getDeletableFiles.

Fixes a potential thread-safety issue in BackupHFileCleaner: this class
tracks timestamps to block the deletion of recently loaded HFiles that
might be needed for backup purposes. The timestamps were being registered
from inside the concurrent method, which could result in recently added
files getting deleted. Moved the timestamp registration to the postClean
method, which is called only a single time per cleaner run, so recently
loaded HFiles are in fact protected from deletion.

Signed-off-by: Nick Dimiduk <[email protected]>
ndimiduk pushed a commit to ndimiduk/hbase that referenced this pull request Oct 14, 2025
)

Adds javadoc mentioning the concurrent usage and thread-safety need of
FileCleanerDelegate#getDeletableFiles.

Fixes a potential thread-safety issue in BackupHFileCleaner: this class
tracks timestamps to block the deletion of recently loaded HFiles that
might be needed for backup purposes. The timestamps were being registered
from inside the concurrent method, which could result in recently added
files getting deleted. Moved the timestamp registration to the postClean
method, which is called only a single time per cleaner run, so recently
loaded HFiles are in fact protected from deletion.

Signed-off-by: Nick Dimiduk <[email protected]>
ndimiduk added a commit to HubSpot/hbase that referenced this pull request Oct 14, 2025
…ed check (apache#7360) (will be in 2.6.4) (#209)

Adds javadoc mentioning the concurrent usage and thread-safety need of
FileCleanerDelegate#getDeletableFiles.

Fixes a potential thread-safety issue in BackupHFileCleaner: this class
tracks timestamps to block the deletion of recently loaded HFiles that
might be needed for backup purposes. The timestamps were being registered
from inside the concurrent method, which could result in recently added
files getting deleted. Moved the timestamp registration to the postClean
method, which is called only a single time per cleaner run, so recently
loaded HFiles are in fact protected from deletion.

Signed-off-by: Nick Dimiduk <[email protected]>
Co-authored-by: DieterDP <[email protected]>
@DieterDP-ng
Copy link
Contributor Author

Merged to master and backport to branch-3 is clean. Backport to branch-2 has minor conflict in the unit test. I think this means we've missed a backport along the way. It's a minor thing re: the callCleaner() helper method. @DieterDP-ng shall I naively resolve the conflict, or is there a commit we're missing -- do you mind taking a look?

Sorry for the late response, but I guess it's already figured out? Didn't quite get what the conflict was though.

@ndimiduk
Copy link
Member

It was a simple difference in collection utilities used on those lines, new in JDK11 APIs ... I've been away from HBase too long.

charlesconnell pushed a commit to HubSpot/hbase that referenced this pull request Oct 22, 2025
…ed check (apache#7360) (will be in 2.6.4) (#209)

Adds javadoc mentioning the concurrent usage and thread-safety need of
FileCleanerDelegate#getDeletableFiles.

Fixes a potential thread-safety issue in BackupHFileCleaner: this class
tracks timestamps to block the deletion of recently loaded HFiles that
might be needed for backup purposes. The timestamps were being registered
from inside the concurrent method, which could result in recently added
files getting deleted. Moved the timestamp registration to the postClean
method, which is called only a single time per cleaner run, so recently
loaded HFiles are in fact protected from deletion.

Signed-off-by: Nick Dimiduk <[email protected]>
Co-authored-by: DieterDP <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants