HBASE-26974 Introduce a LogRollProcedure #5408

frostruan · 2023-09-17T03:46:11Z

This PR tries to reimplement the log-roll procedure with proc-v2.

Modifies the following things

client side:

when request all rs to roll WAL writers, instead of calling admin.execProcedure(), now we call admin.execProcedureWithReturn and the returned value depends on the configuration in the server side. If master is configured to used proc-v2, the value would be the procedure id, otherwise nothing. Then we will keep asking master if the procedure has finished by calling admin.isProcedureFinished until it finished or failed or timeout. This was implemented in BackupUtils#rollWALWriters.

server side

enhanced LogRollMasterProcedureManager to support both proc-v1 and proc-v2
introduce 3 new procedures.

LogRollProcedure
The LogRollProcedure is used to roll WAL for all rs in the cluster. It does not acquire any lock and It has 3 states:
LOG_ROLL_PRE_CHECK_NAMESPACE : create backup namespace if not exists
LOG_ROLL_PRE_CHECK_TABLES : create backup system table and backup system bulkload table if not exists
LOG_ROLL_ROLL_LOG_ON_EACH_RS : roll all rs WAL writers

RSLogRollProcedure
The RSLogRollProcedure is used to schedule a RSLogRollRemoteProcedure for each regionserver. When the subprocedure returns, the RSLogRollProcedure will check the logrolling result in the backup system table. If failed, The RSLogRollProcedure will schedule a new RSLogRollRemoteProcedure to retry.

RSLogRollRemoteProcedure
The RSLogRollRemoteProcedure is used to send the log roll request to the remote server.

any suggestions and feedbacks are appreciated.

frostruan · 2023-09-18T02:54:29Z

The failed UT looks not related.

frostruan · 2023-09-24T14:04:42Z

Hi Duo, would you mind taking a look in your free time ? This is the last zk-based procedure, also the last sub-task of HBASE-21488 , I'd like to help promote this a bit @Apache9

Apache9 · 2023-09-24T14:07:19Z

Hi Duo, would you mind taking a look in your free time ? This is the last zk-based procedure, also the last sub-task of HBASE-21488 , I'd like to help promote this a bit @Apache9

The PR is big, I have already started to review it few days ago but haven't finished yet...

frostruan · 2023-09-24T14:23:55Z

Hi Duo, would you mind taking a look in your free time ? This is the last zk-based procedure, also the last sub-task of HBASE-21488 , I'd like to help promote this a bit @Apache9

The PR is big, I have already started to review it few days ago but haven't finished yet...

Thanks for the review !

I briefly wrote down the main changes in the begin of the PR, I hope that could help review :)

Apache9 · 2023-10-07T08:25:06Z

hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/BackupSystemTable.java

+   * @param backupRoot root directory path to backup
+   * @throws IOException exception
+   */
+  public Long getRegionServerLastLogRollResult(String server, String backupRoot)


Why not return long? Seems the return value can never be null?

Ok. I'll address it. Thanks Duo.

Apache9 · 2023-10-07T08:54:44Z

hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/master/LogRollProcedure.java

+          return Flow.HAS_MORE_STATE;
+        case LOG_ROLL_ROLL_LOG_ON_EACH_RS:
+          final List<ServerName> onlineServers =
+            env.getMasterServices().getServerManager().getOnlineServersList();


Is it possible that we have race here and miss some region servers?

Yes, we'd better access it under lock protection. I didn't add lock for two reasons:

a. it's acceptable to miss some newly registered servers. If a server is new, we are not likely to assign regions on it, so there is no data lost.

b. In our code base, the calls to this method elsewhere are also not locked.

We need to make sure there is no problem. Usually it is not fixed by locking, but something like fencing. For example, before rolling we have done some preparing, and when rolling, even if we miss some new region servers, it does not cause any problems.

Apache9 · 2023-10-07T08:56:15Z

hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/master/LogRollProcedure.java

+            table.readRegionServerLastLogRollResult(backupRoot);
+          final long now = EnvironmentEdgeManager.currentTime();
+          for (ServerName server : onlineServers) {
+            long lastLogRollResult =


The value is the time for last roll? Why name it lastRollResult?

Ok. I'll address it.

Apache9 · 2023-10-07T08:59:15Z

hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/master/LogRollProcedure.java

+
+  @Override
+  public TableName getTableName() {
+    return BackupSystemTable.getTableName(conf);


So here we just make this procedure as table procedure? Seems a bit strange...

Because we will try to do some BackupSystemTable-related operations, such as creating backup namespace and the BackupSystemTable.

Anyway, I think it is okay to declare it as a table procedure or a server procedure, because as I mentioned in the beginning of this PR, the LogRollProcedure itself does not need to acquire any lock.

IIRC we have talked this before, maybe we need to discuss how to change the ProcedureScheduler first...

Yes. We have talked about this in HBASE-27905, and I added a new commit to address your comments. It's still in the POC stage and needs to be polished and more test cases.

Apache9 · 2023-10-07T09:00:51Z

hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/util/BackupUtils.java

  }

+  public static void rollWALWriters(Admin admin, Map<String, String> props) throws IOException {
+    byte[] ret =


We do not want to introduce a new admin method for this?

a. it is not general enough for Admin. This call will not only make all rs roll WAL writers, but also do some backup-related operations, such as reading and writing BackupSystemTable.

b. this operation is a bit too lightweight if introduced in the BackupAdmin, since it's only a small subprocedure of the whole backup job.

So I think maybe a static utility method is enough ?

The execProcedure call is for zk based procedures, do we still have other procedures besides the log roll one?

No, this is the last one.

Apache9 · 2024-04-17T07:00:16Z

Any updates here?

Thanks.

frostruan · 2024-04-17T07:23:24Z

Will push the newest code as soon as possible.

Thanks Duo !

frostruan · 2024-04-18T13:43:14Z

A new commit has been added. Let's wait for the UT result.

frostruan · 2024-04-19T04:00:56Z

Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:3.1.0:test (secondPartTestsExecution) on project hbase-server: There was a timeout in the fork -> [Help 1]

The failed UT looks not related.

Apache9 · 2025-08-31T12:27:39Z

Any updates here?

This is last one we need to convert from proc-v1 to proc-v2.

@frostruan

Thanks.

frostruan · 2025-09-03T02:49:11Z

@Apache9 Hi duo, would you mind seeing if there are any other design or implementation problems blocking merging to master ?

Copilot

Pull Request Overview

This PR implements a new LogRollProcedure using HBase's procedure framework (proc-v2) for rolling WAL writers across all region servers. It provides both backward compatibility with the existing ZooKeeper-based approach and introduces the new procedure-based implementation.

Adds three new procedures: LogRollProcedure, RSLogRollProcedure, and RSLogRollRemoteProcedure for distributed WAL rolling
Updates the client-side BackupUtils to handle both coordination approaches (ZK vs proc-v2) with timeout management
Extends executor and event type support for the new LOG_ROLL operation

Reviewed Changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
HRegionServer.java	Adds new executor service for log roll operations
ExecutorType.java/EventType.java	Defines new LOG_ROLL executor and event types
ServerQueue.java/ServerProcedureInterface.java	Configures server procedure handling for log roll operations
BackupUtils.java	Implements unified client API supporting both ZK and proc-v2 coordination
LogRollMasterProcedureManager.java	Enhanced to support both coordination mechanisms
LogRollProcedure.java	Main procedure coordinating WAL rolling across all region servers
RSLogRollProcedure.java/RSLogRollRemoteProcedure.java	Server-specific log roll procedures with retry logic
TestLogRollProcedure.java	Test coverage for the new procedure implementation

_{Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.}

Copilot · 2025-09-03T03:41:14Z

hbase-server/src/main/java/org/apache/hadoop/hbase/executor/ExecutorType.java

+
  RS_FLUSH_OPERATIONS(37),
-  RS_RELOAD_QUOTAS_OPERATIONS(38);
+  RS_RELOAD_QUOTAS_OPERATIONS(38),
+  RS_LOG_ROLL(38);


Both RS_RELOAD_QUOTAS_OPERATIONS and RS_LOG_ROLL have the same value (38). RS_LOG_ROLL should have value 39 to avoid conflicts.

Copilot · 2025-09-03T03:41:14Z

hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java

    }

    // Setup the Quota Manager
    rsQuotaManager = new RegionServerRpcQuotaManager(this);


The removed line configurationManager.registerObserver(rsQuotaManager); appears to be accidentally deleted. This registration is necessary for quota manager configuration updates.

Suggested change

rsQuotaManager = new RegionServerRpcQuotaManager(this);

rsQuotaManager = new RegionServerRpcQuotaManager(this);

configurationManager.registerObserver(rsQuotaManager);

Apache9

I think we should implement a general LogRollProcedure, not only for the backup system. The backup system can call this procedure in its backup operation, just like what LogRollMasterProcedureManager do. And I do not think we still need to use LogRollMasterProcedureManager in the proc-v2 based procedure path?

Thanks.

frostruan · 2025-09-05T07:53:49Z

Thanks for reviewing Duo.

If we want to introduce a general log roll procedure, we should implement it in the hbase-server module. It is not hard. But for log rolling in backup job, there is a little difference. As a sub-step of backup, after completing the log roll, we need to record the highest wal filenum in the backup system table. However hbase-backup is a high-level module built on hbase-server. In theory, hbase-server should not include any backup-related operations, so I am a little confused about how to do log rolling for backup.

Here are some solutions.
First, the backup client could start a general log roll. After the log roll procedure completes, the backup client retrieves the latest WAL filenum from each region server and records it in the backup system table. For the hbase-server module, RSRpcService would need to provide a method for the client to query the latest WAL filenum.

The second approach is to introduce new procedures in the hbase-backup module, such as BackupLogRollProcedure extends LogRollProcedure and BackupLogRollCallable extends LogRollCallable, and implement backup-related logic in these derived subprocedures. However, we still lack an entry point for submitting such procedures. Should we add a new Backup Service to Backup.proto? Actually I don't like this approach. This implementation is too complicated for the purpose.

How about I try the first method first?

frostruan · 2025-09-05T10:40:52Z

Thanks for reviewing Duo.

If we want to introduce a general log roll procedure, we should implement it in the hbase-server module. It is not hard. But for log rolling in backup job, there is a little difference. As a sub-step of backup, after completing the log roll, we need to record the highest wal filenum in the backup system table. However hbase-backup is a high-level module built on hbase-server. In theory, hbase-server should not include any backup-related operations, so I am a little confused about how to do log rolling for backup.

Here are some solutions. First, the backup client could start a general log roll. After the log roll procedure completes, the backup client retrieves the latest WAL filenum from each region server and records it in the backup system table. For the hbase-server module, RSRpcService would need to provide a method for the client to query the latest WAL filenum.

The second approach is to introduce new procedures in the hbase-backup module, such as BackupLogRollProcedure extends LogRollProcedure and BackupLogRollCallable extends LogRollCallable, and implement backup-related logic in these derived subprocedures. However, we still lack an entry point for submitting such procedures. Should we add a new Backup Service to Backup.proto? Actually I don't like this approach. This implementation is too complicated for the purpose.

How about I try the first method first?

Hi duo , would you mind reviewing the new commit ? @Apache9

Apache9 · 2025-09-05T14:23:55Z

Thanks for reviewing Duo.

If we want to introduce a general log roll procedure, we should implement it in the hbase-server module. It is not hard. But for log rolling in backup job, there is a little difference. As a sub-step of backup, after completing the log roll, we need to record the highest wal filenum in the backup system table. However hbase-backup is a high-level module built on hbase-server. In theory, hbase-server should not include any backup-related operations, so I am a little confused about how to do log rolling for backup.

Here are some solutions. First, the backup client could start a general log roll. After the log roll procedure completes, the backup client retrieves the latest WAL filenum from each region server and records it in the backup system table. For the hbase-server module, RSRpcService would need to provide a method for the client to query the latest WAL filenum.

The second approach is to introduce new procedures in the hbase-backup module, such as BackupLogRollProcedure extends LogRollProcedure and BackupLogRollCallable extends LogRollCallable, and implement backup-related logic in these derived subprocedures. However, we still lack an entry point for submitting such procedures. Should we add a new Backup Service to Backup.proto? Actually I don't like this approach. This implementation is too complicated for the purpose.

How about I try the first method first?

What about make the new rollAllWALWriters method returns the last wal file number? Although not commonly used, but we do have a result field in the Procedure class.

frostruan · 2025-09-05T14:56:06Z

Oh, thanks for the reminder, this is indeed better. Let me address.

One more word, even we move collecting regionserver wal filenum from client to master, RSRpcService still needs to provide an interface for the master to query the latest filenum. When implementing SnapshotProcedure before, I had thought about changing the signature of RSProcedureCallable from public interface RSProcedureCallable extends Callable<Void> to public interface RSProcedureCallable<T> extends Callable<T>, where type T is the information that can be returned to the Master after RSProcedureCallable is completed. However, considering that the Void type is enough for most RSProcedureCallables, we can consider this change in a later version.

Apache9 · 2025-09-06T13:58:20Z

Oh, thanks for the reminder, this is indeed better. Let me address.

One more word, even we move collecting regionserver wal filenum from client to master, RSRpcService still needs to provide an interface for the master to query the latest filenum. When implementing SnapshotProcedure before, I had thought about changing the signature of RSProcedureCallable from public interface RSProcedureCallable extends Callable<Void> to public interface RSProcedureCallable<T> extends Callable<T>, where type T is the information that can be returned to the Master after RSProcedureCallable is completed. However, considering that the Void type is enough for most RSProcedureCallables, we can consider this change in a later version.

We could add a field in the RemoteProcedureResult message to report the data back?

frostruan · 2025-09-06T14:31:05Z

Oh, thanks for the reminder, this is indeed better. Let me address.
One more word, even we move collecting regionserver wal filenum from client to master, RSRpcService still needs to provide an interface for the master to query the latest filenum. When implementing SnapshotProcedure before, I had thought about changing the signature of RSProcedureCallable from public interface RSProcedureCallable extends Callable<Void> to public interface RSProcedureCallable<T> extends Callable<T>, where type T is the information that can be returned to the Master after RSProcedureCallable is completed. However, considering that the Void type is enough for most RSProcedureCallables, we can consider this change in a later version.

We could add a field in the RemoteProcedureResult message to report the data back?

Yes, I thought so too. if you think 3.0 needs to include this feature, I can file a new issue to address it.

frostruan · 2025-09-06T17:13:13Z

Add a new commit to address comments. Let's wait for the unit tests result.

frostruan · 2025-09-12T02:43:36Z

Oh, thanks for the reminder, this is indeed better. Let me address.
One more word, even we move collecting regionserver wal filenum from client to master, RSRpcService still needs to provide an interface for the master to query the latest filenum. When implementing SnapshotProcedure before, I had thought about changing the signature of RSProcedureCallable from public interface RSProcedureCallable extends Callable<Void> to public interface RSProcedureCallable<T> extends Callable<T>, where type T is the information that can be returned to the Master after RSProcedureCallable is completed. However, considering that the Void type is enough for most RSProcedureCallables, we can consider this change in a later version.

We could add a field in the RemoteProcedureResult message to report the data back?

Yes, I thought so too. if you think 3.0 needs to include this feature, I can file a new issue to address it.

Hi duo, this feature has been addressed in this PR. Would you mind taking a look ? @Apache9

Apache9

Overall LGTM.

Just a concern about the protobuf defination.

And better cleanup the javac and checkstyle warnings.

hbase-protocol-shaded/src/main/protobuf/HBase.proto

frostruan · 2025-09-12T08:21:47Z

Add a new commit to fix checkstyle problem.
The code style of the checkstyle plugin conflicts with spotless, the code formatting of spotless takes precedence.
Also change NewServerWALRoller from java 17 record to static inner class to run checkstyle plugin.

Apache-HBase · 2025-09-12T09:30:26Z

🎊 +1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	0m 32s		Docker mode activated.
			_ Prechecks _
+1 💚	dupname	0m 0s		No case conflicting files found.
+0 🆗	codespell	0m 0s		codespell was not available.
+0 🆗	detsecrets	0m 0s		detect-secrets was not available.
+0 🆗	buf	0m 0s		buf was not available.
+0 🆗	buf	0m 0s		buf was not available.
+1 💚	@author	0m 0s		The patch does not contain any @author tags.
+1 💚	hbaseanti	0m 0s		Patch does not have any anti-patterns.
			_ master Compile Tests _
+0 🆗	mvndep	0m 32s		Maven dependency ordering for branch
+1 💚	mvninstall	3m 27s		master passed
+1 💚	compile	7m 27s		master passed
+1 💚	checkstyle	2m 10s		master passed
+1 💚	spotbugs	7m 10s		master passed
+1 💚	spotless	0m 48s		branch has no errors when running spotless:check.
			_ Patch Compile Tests _
+0 🆗	mvndep	0m 12s		Maven dependency ordering for patch
+1 💚	mvninstall	3m 7s		the patch passed
+1 💚	compile	7m 23s		the patch passed
+1 💚	cc	7m 23s		the patch passed
+1 💚	javac	7m 23s		the patch passed
+1 💚	blanks	0m 0s		The patch has no blanks issues.
-0 ⚠️	checkstyle	0m 15s	/results-checkstyle-hbase-client.txt	hbase-client: The patch generated 1 new + 3 unchanged - 0 fixed = 4 total (was 3)
-0 ⚠️	rubocop	0m 17s	/results-rubocop.txt	The patch generated 3 new + 445 unchanged - 0 fixed = 448 total (was 445)
+1 💚	spotbugs	8m 3s		the patch passed
+1 💚	hadoopcheck	12m 3s		Patch does not cause any errors with Hadoop 3.3.6 3.4.0.
+1 💚	hbaseprotoc	2m 53s		the patch passed
+1 💚	spotless	0m 44s		patch has no errors when running spotless:check.
			_ Other Tests _
+1 💚	asflicense	1m 8s		The patch does not generate ASF License warnings.
		69m 1s

Subsystem	Report/Notes
Docker	ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5408/11/artifact/yetus-general-check/output/Dockerfile
GITHUB PR	#5408
Optional Tests	dupname asflicense javac spotbugs checkstyle codespell detsecrets compile hadoopcheck hbaseanti spotless cc buflint bufcompat hbaseprotoc rubocop
uname	Linux 3c9273066a24 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/hbase-personality.sh
git revision	master / `3b793ec`
Default Java	Eclipse Adoptium-17.0.11+9
Max. process+thread count	86 (vs. ulimit of 30000)
modules	C: hbase-protocol-shaded hbase-common hbase-client hbase-procedure hbase-server hbase-thrift hbase-shell hbase-backup U: .
Console output	https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5408/11/console
versions	git=2.34.1 maven=3.9.8 spotbugs=4.7.3 rubocop=1.37.1
Powered by	Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

Apache-HBase · 2025-09-12T13:03:06Z

🎊 +1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	0m 37s		Docker mode activated.
-0 ⚠️	yetus	0m 3s		Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck
			_ Prechecks _
			_ master Compile Tests _
+0 🆗	mvndep	0m 34s		Maven dependency ordering for branch
+1 💚	mvninstall	3m 31s		master passed
+1 💚	compile	3m 29s		master passed
+1 💚	javadoc	2m 16s		master passed
+1 💚	shadedjars	6m 20s		branch has no errors when building our shaded downstream artifacts.
			_ Patch Compile Tests _
+0 🆗	mvndep	0m 13s		Maven dependency ordering for patch
+1 💚	mvninstall	3m 10s		the patch passed
+1 💚	compile	3m 29s		the patch passed
+1 💚	javac	3m 29s		the patch passed
+1 💚	javadoc	2m 15s		the patch passed
+1 💚	shadedjars	6m 10s		patch has no errors when building our shaded downstream artifacts.
			_ Other Tests _
+1 💚	unit	0m 33s		hbase-protocol-shaded in the patch passed.
+1 💚	unit	2m 15s		hbase-common in the patch passed.
+1 💚	unit	1m 29s		hbase-client in the patch passed.
+1 💚	unit	1m 32s		hbase-procedure in the patch passed.
+1 💚	unit	212m 30s		hbase-server in the patch passed.
+1 💚	unit	6m 43s		hbase-thrift in the patch passed.
+1 💚	unit	6m 52s		hbase-shell in the patch passed.
+1 💚	unit	11m 16s		hbase-backup in the patch passed.
		281m 43s

Subsystem	Report/Notes
Docker	ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5408/11/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile
GITHUB PR	#5408
Optional Tests	javac javadoc unit compile shadedjars
uname	Linux 2b076519e837 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/hbase-personality.sh
git revision	master / `3b793ec`
Default Java	Eclipse Adoptium-17.0.11+9
Test Results	https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5408/11/testReport/
Max. process+thread count	4273 (vs. ulimit of 30000)
modules	C: hbase-protocol-shaded hbase-common hbase-client hbase-procedure hbase-server hbase-thrift hbase-shell hbase-backup U: .
Console output	https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5408/11/console
versions	git=2.34.1 maven=3.9.8
Powered by	Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

Apache9 · 2025-09-12T15:08:37Z

hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/LogRollProcedure.java

+          return Flow.NO_MORE_STATE;
+      }
+    } catch (Exception e) {
+      setFailure("log-roll", e);


Do we want to add retry here? Anyway, can be a separated issue, since the procedure does not need any cleanup or rollback after failure.

Log roll is a pretty light-weight operation, so I think maybe it would be better to fail fast. And In LogRollCallable, we will retry, and the retry time can be set by hbase.regionserver.logroll.retries.

Co-authored-by: huiruan <[email protected]> Signed-off-by: Duo Zhang <[email protected]> (cherry picked from commit ffed09d)

hgromer · 2025-09-18T17:10:49Z

hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/util/BackupUtils.java

+    }
+  }
+
+  private static void logRollV2(Connection conn, String backupRootDir) throws IOException {


I don't think that this method handles cleaning up any servers that no longer exist on the cluster, which means we'll hang onto oldWALs from those hosts indefinitely due to the BackupLogCleaner

Please correct me if I'm misunderstanding, but I think we also want to make sure that we're removing entries from the system tables for hosts that are no longer part of the cluster

Do we need something like

BackupSystemTable

public void deleteRegionServerLastLogRollResult(String server, String backupRoot) throws IOException { LOG.trace("delete region server last roll log result to backup system table"); try (Table table = connection.getTable(tableName)) { Delete delete = new Delete(rowkey(RS_LOG_TS_PREFIX, backupRoot, NULL, server)); table.delete(delete); } }

BackupUtils

private static void logRollV2(Connection conn, String backupRootDir) throws IOException { BackupSystemTable backupSystemTable = new BackupSystemTable(conn); HashMap<String, Long> lastLogRollResult = backupSystemTable.readRegionServerLastLogRollResult(backupRootDir); try (Admin admin = conn.getAdmin()) { Map<ServerName, Long> newLogRollResult = admin.rollAllWALWriters(); for (Map.Entry<ServerName, Long> entry : newLogRollResult.entrySet()) { ServerName serverName = entry.getKey(); long newHighestWALFilenum = entry.getValue(); String address = serverName.getAddress().toString(); Long lastHighestWALFilenum = lastLogRollResult.get(address); if (lastHighestWALFilenum != null && lastHighestWALFilenum > newHighestWALFilenum) { LOG.warn("Won't update last roll log result for server {}: current = {}, new = {}", serverName, lastHighestWALFilenum, newHighestWALFilenum); } else { backupSystemTable.writeRegionServerLastLogRollResult(address, newHighestWALFilenum, backupRootDir); if (LOG.isDebugEnabled()) { LOG.debug("updated last roll log result for {} from {} to {}", serverName, lastHighestWALFilenum, newHighestWALFilenum); } } } // New Code Here for (String server: lastLogRollResult.keySet()) { if (!newLogRollResult.containsKey(ServerName.parseServerName(server))) { backupSystemTable.deleteRegionServerLastLogRollResult(server, backupRootDir); } } } }

I don't think that this method handles cleaning up any servers that no longer exist on the cluster, which means we'll hang onto oldWALs from those hosts indefinitely due to the BackupLogCleaner

Please correct me if I'm misunderstanding, but I think we also want to make sure that we're removing entries from the system tables for hosts that are no longer part of the cluster

Thanks for reviewing.

I think the functionality of logRollV2 should be same as logRollV1, and I don't think any additional functionality should be introduced or existing functionality should be reduced.

As for the problem of cleaning up the dead server log roll result you mentioned, I have a few questions, would you mind explaining more to help me understand ?

Will keeping old WALs from dead servers indefinitely cause any problems?

If clean it up, is there any chance for potential data loss?

Does zk-based log roll (ie. the logRollV1) procedure need to clean it too ?

Thanks.

and I don't think any additional functionality should be introduced or existing functionality should be reduced.

I don't necessarily agree here, though I do agree v1 isn't doing this cleanup either. To clarify, this PR doesn't introduce any issues, but thanks to your changes I think we have a good way to solve the issue I've presented. This is a beta feature, and I feel that we can iterate on the behavior of the system, esepcially if it means we're improving the system's efficiency by reducing storage overhead.

Yes; the oldWALs can take up a non-trivial amount of space and should be cleaned up when they are no longer necessary. We're seeing cases where we are storing terabytes of unused, deletable data, which is expensive

I'd like to talk this out, and make sure my logic makes sense. There are two types of backups. For full backups, it makes sense that we don't lose any data. We roll the WAL files, and then take a snapshot of all the HFiles on the cluster, so those WAL files are backed up. For incremental backups, we roll the WAL files then backup all WAL files from [<old_start_code>, newTimestamps). For both cases, I do not think there's any possibility of a data loss. I think as long as we delete entries from the system table after the backup completes, we should be okay

It'd be nice, but I think it'd be a lot harder b/c logRollV1 never updates the backup system table with the newer timestamps as far as I can tell. Given this feature is still in beta, I'm happy to move forward with the v2 functionality and mark v1 as deprecated

Thanks for the kind reply @hgromer

After checking the code, I also feel that deleting the dead server log roll result is not likely to cause data loss, so considering that it can save a lot of unnecessary storage space, I support deleting too. If you don't mind, could you please file another issue and open a new PR to follow up on this issue? I can help review it.

Thanks.

Sounds good to me, thank you. I'll create a jira and put up a PR