@sharmaar12

…ot getting updated for read replica

Link to JIRA: https://issues.apache.org/jira/browse/HBASE-29611

Description:
Steps to reproduce (see the JIRA for detailed steps):

  • Create two clusters on the same storage location.
  • Create a table on the active cluster, then refresh meta on the read replica so the table metadata is updated.
  • Add some rows and flush on the active cluster, then run refresh_hfiles on the read replica and scan the table.
  • Add more rows to the table on the active cluster and run refresh_hfiles again: the newly added rows are not visible on the read replica.

Cause:
Refreshing store files is a two-step process:

  1. Load the existing store files from .filelist (choosing the file with the higher timestamp for loading).
  2. Refresh the store file internals (clean up old/compacted files, replace the file in .filelist).

In this scenario, the first refresh on the read replica loads the list of HFiles from the file in .filelist created by the active cluster, but then creates a new file with a greater timestamp. There are now two files in .filelist. On a subsequent flush on the active cluster, the file created by the active cluster is updated, but the file created by the read replica is not. Because refresh_hfiles loads the file with the higher timestamp, it picks up the file the read replica created during the first refresh, which does not contain the updated list of HFiles.
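
The sequence described above can be reproduced with a small in-memory model. This is only a sketch: FileListSketch, TrackerFile, and simulate are hypothetical names, the timestamps are made up, and none of this is HBase's actual StoreFileListFile code. It assumes, as the description states, that the active cluster keeps updating its own file and that loading picks the file with the higher timestamp.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Toy in-memory model of a .filelist directory (hypothetical, not HBase code). */
class FileListSketch {

  /** One tracker file: its timestamp and the HFiles it lists. */
  static final class TrackerFile {
    final long timestamp;
    final List<String> hfiles;

    TrackerFile(long timestamp, List<String> hfiles) {
      this.timestamp = timestamp;
      this.hfiles = new ArrayList<>(hfiles); // defensive copy
    }
  }

  private final List<TrackerFile> dir = new ArrayList<>();

  /** Step 1 of a refresh: load the tracker file with the higher timestamp. */
  TrackerFile load() {
    return dir.stream()
      .max(Comparator.comparingLong((TrackerFile f) -> f.timestamp))
      .orElseThrow(IllegalStateException::new);
  }

  /**
   * Returns the HFiles the read replica sees after its second refresh.
   * When replicaWritesTrackerFile is true, the replica writes its own tracker
   * file during the first refresh (the behavior described as the cause).
   */
  static List<String> simulate(boolean replicaWritesTrackerFile) {
    FileListSketch fileList = new FileListSketch();

    // Active cluster flushes hfile-1 and records it in its tracker file.
    TrackerFile active = new TrackerFile(100L, Arrays.asList("hfile-1"));
    fileList.dir.add(active);

    // First refresh on the replica: load, then optionally write a newer copy.
    TrackerFile loaded = fileList.load();
    if (replicaWritesTrackerFile) {
      fileList.dir.add(new TrackerFile(200L, loaded.hfiles));
    }

    // Second flush on the active cluster: only its own file is updated.
    active.hfiles.add("hfile-2");

    // Second refresh on the replica.
    return fileList.load().hfiles;
  }

  public static void main(String[] args) {
    System.out.println("replica writes tracker file: " + simulate(true));
    System.out.println("replica does not write:      " + simulate(false));
  }
}
```

In this model, simulate(true) returns only hfile-1 because the replica's own copy wins on timestamp, while simulate(false) returns both hfile-1 and hfile-2.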

Fix:
Since a refresh of the store files should always load the file written by the active cluster, the read replica must not create a new file in .filelist. This avoids the timestamp mismatch.

We also don't want to initialize the tracker file (StoreFileListFile.java:load()) from the read replica, since the replica never writes it, so we have added a check for the read-only property in StoreFileTrackerBase.java:load().
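
One way to picture the load-side check is a tracker whose load takes a read-only flag and only initializes the tracker file for writing when that flag is false. The sketch below is a simplified model under that assumption; LoadSketch, TrackerFileStub, and the initialized field are hypothetical names and do not mirror the real StoreFileListFile or StoreFileTrackerBase signatures.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model of a read-only-aware load path (hypothetical, not HBase code). */
class LoadSketch {

  /** Stand-in for the tracker file kept under .filelist. */
  static final class TrackerFileStub {
    final List<String> hfiles = new ArrayList<>();
    boolean initialized = false; // set when a loader prepares the file for writing

    /** Reads the list; prepares for later writes only when not read-only. */
    List<String> load(boolean readOnly) {
      if (!readOnly) {
        initialized = true;
      }
      return new ArrayList<>(hfiles);
    }
  }

  private final TrackerFileStub file;
  private final boolean isPrimaryReplica;
  private final boolean globalReadOnly;

  LoadSketch(TrackerFileStub file, boolean isPrimaryReplica, boolean globalReadOnly) {
    this.file = file;
    this.isPrimaryReplica = isPrimaryReplica;
    this.globalReadOnly = globalReadOnly;
  }

  /** Secondary replicas and read-only clusters both load without initializing. */
  List<String> load() {
    boolean readOnly = !isPrimaryReplica || globalReadOnly;
    return file.load(readOnly);
  }
}
```

With globalReadOnly set, load() still returns the HFile list written by the active cluster, but the stub is never marked as initialized.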

@sharmaar12 sharmaar12 changed the title With FILE based SFT, the list of HFiles we maintain in .filelist is n… HBASE-29611: With FILE based SFT, the list of HFiles we maintain in .filelist is n… Oct 14, 2025

@Apache-HBase

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 51s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
_ HBASE-29081 Compile Tests _
+1 💚 mvninstall 5m 32s HBASE-29081 passed
+1 💚 compile 5m 18s HBASE-29081 passed
-0 ⚠️ checkstyle 0m 23s /buildtool-branch-checkstyle-hbase-server.txt The patch fails to run checkstyle in hbase-server
+1 💚 spotbugs 2m 36s HBASE-29081 passed
+1 💚 spotless 1m 15s branch has no errors when running spotless:check.
_ Patch Compile Tests _
+1 💚 mvninstall 3m 43s the patch passed
+1 💚 compile 4m 4s the patch passed
+1 💚 javac 4m 4s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 18s /buildtool-patch-checkstyle-hbase-server.txt The patch fails to run checkstyle in hbase-server
+1 💚 spotbugs 2m 6s the patch passed
+1 💚 hadoopcheck 12m 52s Patch does not cause any errors with Hadoop 3.3.6 3.4.0.
+1 💚 spotless 0m 47s patch has no errors when running spotless:check.
_ Other Tests _
+1 💚 asflicense 0m 12s The patch does not generate ASF License warnings.
48m 7s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7361/4/artifact/yetus-general-check/output/Dockerfile
GITHUB PR #7361
JIRA Issue HBASE-29611
Optional Tests dupname asflicense javac spotbugs checkstyle codespell detsecrets compile hadoopcheck hbaseanti spotless
uname Linux 365fbfb48f0b 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision HBASE-29081 / de147a0
Default Java Eclipse Adoptium-17.0.11+9
Max. process+thread count 83 (vs. ulimit of 30000)
modules C: hbase-server U: hbase-server
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7361/4/console
versions git=2.34.1 maven=3.9.8 spotbugs=4.7.3
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 31s Docker mode activated.
-0 ⚠️ yetus 0m 3s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck
_ Prechecks _
_ HBASE-29081 Compile Tests _
+1 💚 mvninstall 3m 37s HBASE-29081 passed
+1 💚 compile 1m 3s HBASE-29081 passed
+1 💚 javadoc 0m 30s HBASE-29081 passed
+1 💚 shadedjars 6m 13s branch has no errors when building our shaded downstream artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 3m 10s the patch passed
+1 💚 compile 1m 0s the patch passed
+1 💚 javac 1m 0s the patch passed
+1 💚 javadoc 0m 29s the patch passed
+1 💚 shadedjars 6m 6s patch has no errors when building our shaded downstream artifacts.
_ Other Tests _
+1 💚 unit 235m 52s hbase-server in the patch passed.
263m 8s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7361/4/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile
GITHUB PR #7361
JIRA Issue HBASE-29611
Optional Tests javac javadoc unit compile shadedjars
uname Linux 998a791fdf9e 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision HBASE-29081 / de147a0
Default Java Eclipse Adoptium-17.0.11+9
Test Results https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7361/4/testReport/
Max. process+thread count 3404 (vs. ulimit of 30000)
modules C: hbase-server U: hbase-server
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7361/4/console
versions git=2.34.1 maven=3.9.8
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

Contributor

@anmolnar anmolnar left a comment

So, basically you want to act like secondary replicas in the read-only case, right?
Would you please add unit tests?

@sharmaar12
Author

@anmolnar

So, basically you want to act like secondary replicas in the read-only case, right?

Technically, in some cases, yes. But the main aim is to prevent the tracker file from being recreated with a higher timestamp on refresh.

Added unit tests; please review.

@Apache-HBase

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 14s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
_ HBASE-29081 Compile Tests _
+1 💚 mvninstall 5m 52s HBASE-29081 passed
+1 💚 compile 5m 14s HBASE-29081 passed
-0 ⚠️ checkstyle 0m 34s /buildtool-branch-checkstyle-hbase-server.txt The patch fails to run checkstyle in hbase-server
+1 💚 spotbugs 2m 24s HBASE-29081 passed
+1 💚 spotless 1m 11s branch has no errors when running spotless:check.
_ Patch Compile Tests _
+1 💚 mvninstall 5m 0s the patch passed
+1 💚 compile 5m 0s the patch passed
+1 💚 javac 5m 0s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 33s /buildtool-patch-checkstyle-hbase-server.txt The patch fails to run checkstyle in hbase-server
+1 💚 spotbugs 2m 40s the patch passed
+1 💚 hadoopcheck 15m 42s Patch does not cause any errors with Hadoop 3.3.6 3.4.0.
+1 💚 spotless 1m 5s patch has no errors when running spotless:check.
_ Other Tests _
+1 💚 asflicense 0m 17s The patch does not generate ASF License warnings.
55m 24s
Subsystem Report/Notes
Docker ClientAPI=1.48 ServerAPI=1.48 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7361/5/artifact/yetus-general-check/output/Dockerfile
GITHUB PR #7361
JIRA Issue HBASE-29611
Optional Tests dupname asflicense javac spotbugs checkstyle codespell detsecrets compile hadoopcheck hbaseanti spotless
uname Linux 6c0aeab6c4e0 6.8.0-1024-aws #26~22.04.1-Ubuntu SMP Wed Feb 19 06:54:57 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision HBASE-29081 / 737c33c
Default Java Eclipse Adoptium-17.0.11+9
Max. process+thread count 71 (vs. ulimit of 30000)
modules C: hbase-server U: hbase-server
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7361/5/console
versions git=2.34.1 maven=3.9.8 spotbugs=4.7.3
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 12s Docker mode activated.
-0 ⚠️ yetus 0m 4s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck
_ Prechecks _
_ HBASE-29081 Compile Tests _
+1 💚 mvninstall 5m 34s HBASE-29081 passed
+1 💚 compile 1m 44s HBASE-29081 passed
+1 💚 javadoc 0m 55s HBASE-29081 passed
+1 💚 shadedjars 9m 57s branch has no errors when building our shaded downstream artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 5m 3s the patch passed
+1 💚 compile 1m 34s the patch passed
+1 💚 javac 1m 34s the patch passed
+1 💚 javadoc 0m 45s the patch passed
+1 💚 shadedjars 9m 6s patch has no errors when building our shaded downstream artifacts.
_ Other Tests _
-1 ❌ unit 19m 44s /patch-unit-hbase-server.txt hbase-server in the patch failed.
56m 47s
Subsystem Report/Notes
Docker ClientAPI=1.48 ServerAPI=1.48 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7361/5/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile
GITHUB PR #7361
JIRA Issue HBASE-29611
Optional Tests javac javadoc unit compile shadedjars
uname Linux 1f20fdf0933b 6.8.0-1024-aws #26~22.04.1-Ubuntu SMP Wed Feb 19 06:54:57 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision HBASE-29081 / 737c33c
Default Java Eclipse Adoptium-17.0.11+9
Test Results https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7361/5/testReport/
Max. process+thread count 785 (vs. ulimit of 30000)
modules C: hbase-server U: hbase-server
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7361/5/console
versions git=2.34.1 maven=3.9.8
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

…filelist is not getting updated for read replica

Contributor

@wchevreuil wchevreuil left a comment

Overall, lgtm. Just need to fix the issue with the newly added UT and address a minor comment I have suggested.

Comment on lines 104 to 107
if (
  isPrimaryReplica && !conf.getBoolean(HConstants.HBASE_GLOBAL_READONLY_ENABLED_KEY,
    HConstants.HBASE_GLOBAL_READONLY_ENABLED_DEFAULT)
) {
Contributor

Can we add the same check to the add and set methods? It would give an extra safeguard in case these methods are inadvertently called on a read replica.

Author

@sharmaar12 sharmaar12 Oct 22, 2025

@wchevreuil We haven't added any mechanism that prevents calls to the add and set functions in read-only mode, so are you suggesting we add it to these methods?

Without these checks, the unit tests will always fail, because I am deliberately calling these methods in the test function.

Contributor

@anmolnar anmolnar Oct 22, 2025

If you want to clone the behavior of secondary replicas for the read-only mode, which makes perfect sense to me, I think you should follow it everywhere in this class. In addition to the add(), set() and replace() methods, you might want to add the same safeguard to the createWriter() method as well. It could also be useful to extract the check into a separate method to reduce code duplication.
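
The suggestion to extract the check could look roughly like the following. This is a sketch only: GuardedTrackerSketch, isWritable, and the trackerFileWrites counter are hypothetical, and whether a guarded call should be skipped silently (as here) or rejected is an assumption not settled in this thread.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

/** Sketch of one shared write guard for tracker mutations (hypothetical, not HBase code). */
class GuardedTrackerSketch {
  private final boolean isPrimaryReplica;
  private final boolean globalReadOnly;
  private final List<String> tracked = new ArrayList<>();
  private int trackerFileWrites = 0; // counts simulated writes to the tracker file

  GuardedTrackerSketch(boolean isPrimaryReplica, boolean globalReadOnly) {
    this.isPrimaryReplica = isPrimaryReplica;
    this.globalReadOnly = globalReadOnly;
  }

  /** The single shared check: only a writable primary may touch the tracker file. */
  private boolean isWritable() {
    return isPrimaryReplica && !globalReadOnly;
  }

  void add(Collection<String> newFiles) {
    if (!isWritable()) {
      return; // assumption: skip silently on a read replica
    }
    tracked.addAll(newFiles);
    trackerFileWrites++;
  }

  void set(List<String> files) {
    if (!isWritable()) {
      return;
    }
    tracked.clear();
    tracked.addAll(files);
    trackerFileWrites++;
  }

  void replace(Collection<String> compactedFiles, Collection<String> newFiles) {
    if (!isWritable()) {
      return;
    }
    tracked.removeAll(compactedFiles);
    tracked.addAll(newFiles);
    trackerFileWrites++;
  }

  int trackerFileWrites() {
    return trackerFileWrites;
  }

  List<String> tracked() {
    return new ArrayList<>(tracked);
  }
}
```

Keeping the condition in one method means a later change to what counts as read-only touches a single place rather than every mutating method.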

Author

@anmolnar @wchevreuil
Added code changes as suggested. Please review.

I see there are two more functions, createReference and removeStoreFiles. Do you think we should safeguard these functions too? I don't see these functions safeguarded for secondary replicas.

Contributor

I don't see these functions safeguarded for secondary replicas.

In which case we probably don't need to.

Contributor

@anmolnar anmolnar left a comment

lgtm. Please address @wchevreuil's comment.

Contributor

@Kota-SH Kota-SH left a comment

LGTM

Contributor

@kgeisz kgeisz left a comment

LGTM

@anmolnar anmolnar merged commit 365f16e into apache:HBASE-29081 Oct 27, 2025
1 check failed
6 participants