HBASE-26974 Introduce a LogRollProcedure #5408
The failed UT looks unrelated.
Hi Duo, would you mind taking a look when you have time? This is the last zk-based procedure and also the last sub-task of HBASE-21488. I'd like to help move this forward. @Apache9
The PR is big. I started reviewing it a few days ago but haven't finished yet...
Thanks for the review! I briefly wrote down the main changes at the beginning of the PR; I hope that helps with the review :)
       * @param backupRoot root directory path to backup
       * @throws IOException exception
       */
      public Long getRegionServerLastLogRollResult(String server, String backupRoot)
Why not return long? Seems the return value can never be null?
Ok. I'll address it. Thanks Duo.
          return Flow.HAS_MORE_STATE;
        case LOG_ROLL_ROLL_LOG_ON_EACH_RS:
          final List<ServerName> onlineServers =
            env.getMasterServices().getServerManager().getOnlineServersList();
Is it possible that we have a race here and miss some region servers?
Yes, we'd better access it under lock protection. I didn't add a lock for two reasons:
a. It's acceptable to miss some newly registered servers. If a server is new, we are not likely to have assigned regions to it, so there is no data loss.
b. In our code base, the other calls to this method are not locked either.
We need to make sure there is no problem. Usually this is fixed not by locking but by something like fencing. For example, we do some preparation before rolling, so that when rolling, even if we miss some new region servers, it does not cause any problems.
            table.readRegionServerLastLogRollResult(backupRoot);
          final long now = EnvironmentEdgeManager.currentTime();
          for (ServerName server : onlineServers) {
            long lastLogRollResult =
The value is the time for last roll? Why name it lastRollResult?
Ok. I'll address it.
      @Override
      public TableName getTableName() {
        return BackupSystemTable.getTableName(conf);
So here we just make this procedure as table procedure? Seems a bit strange...
Because we will try to do some BackupSystemTable-related operations, such as creating backup namespace and the BackupSystemTable.
Anyway, I think it is okay to declare it as a table procedure or a server procedure, because as I mentioned in the beginning of this PR, the LogRollProcedure itself does not need to acquire any lock.
IIRC we have talked about this before; maybe we need to discuss how to change the ProcedureScheduler first...
Yes. We have talked about this in HBASE-27905, and I added a new commit to address your comments. It's still at the POC stage and needs polishing and more test cases.
      }

      public static void rollWALWriters(Admin admin, Map<String, String> props) throws IOException {
        byte[] ret =
We do not want to introduce a new admin method for this?
a. It is not general enough for Admin. This call not only makes all region servers roll their WAL writers, but also does some backup-related operations, such as reading and writing the BackupSystemTable.
b. This operation is a bit too lightweight to introduce in BackupAdmin, since it's only a small subprocedure of the whole backup job.
So I think a static utility method may be enough?
The execProcedure call is for zk based procedures, do we still have other procedures besides the log roll one?
No, this is the last one.
Any updates here? Thanks.
Will push the newest code as soon as possible. Thanks Duo!
A new commit has been added. Let's wait for the UT result.
The failed UT looks unrelated.
Any updates here? This is the last one we need to convert from proc-v1 to proc-v2. Thanks.
87ee48e to 9502fd3
@Apache9 Hi Duo, would you mind checking whether there are any other design or implementation problems blocking the merge to master?
Pull Request Overview
This PR implements a new LogRollProcedure using HBase's procedure framework (proc-v2) for rolling WAL writers across all region servers. It provides both backward compatibility with the existing ZooKeeper-based approach and introduces the new procedure-based implementation.
- Adds three new procedures: LogRollProcedure, RSLogRollProcedure, and RSLogRollRemoteProcedure for distributed WAL rolling
- Updates the client-side BackupUtils to handle both coordination approaches (ZK vs proc-v2) with timeout management
- Extends executor and event type support for the new LOG_ROLL operation
Reviewed Changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| HRegionServer.java | Adds new executor service for log roll operations |
| ExecutorType.java/EventType.java | Defines new LOG_ROLL executor and event types |
| ServerQueue.java/ServerProcedureInterface.java | Configures server procedure handling for log roll operations |
| BackupUtils.java | Implements unified client API supporting both ZK and proc-v2 coordination |
| LogRollMasterProcedureManager.java | Enhanced to support both coordination mechanisms |
| LogRollProcedure.java | Main procedure coordinating WAL rolling across all region servers |
| RSLogRollProcedure.java/RSLogRollRemoteProcedure.java | Server-specific log roll procedures with retry logic |
| TestLogRollProcedure.java | Test coverage for the new procedure implementation |
      RS_FLUSH_OPERATIONS(37),
    - RS_RELOAD_QUOTAS_OPERATIONS(38);
    + RS_RELOAD_QUOTAS_OPERATIONS(38),
    + RS_LOG_ROLL(38);
Copilot AI, Sep 3, 2025
Both RS_RELOAD_QUOTAS_OPERATIONS and RS_LOG_ROLL have the same value (38). RS_LOG_ROLL should have value 39 to avoid conflicts.
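A collision like this can be caught mechanically. The following is a minimal, self-contained sketch rather than HBase code: the enum `DemoExecutorType` and the helper `findDuplicateCodes` are hypothetical names, and the enum only mirrors the three constants visible in the diff above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class EnumCodeCheck {
  // Hypothetical stand-in for the real enum; it holds only the constants from the diff.
  enum DemoExecutorType {
    RS_FLUSH_OPERATIONS(37),
    RS_RELOAD_QUOTAS_OPERATIONS(38),
    RS_LOG_ROLL(38); // same code as the previous constant, as in the reviewed diff

    private final int code;

    DemoExecutorType(int code) {
      this.code = code;
    }

    int getCode() {
      return code;
    }
  }

  // Returns one message per constant whose code was already used by an earlier constant.
  static List<String> findDuplicateCodes() {
    Map<Integer, DemoExecutorType> seen = new HashMap<>();
    List<String> duplicates = new ArrayList<>();
    for (DemoExecutorType type : DemoExecutorType.values()) {
      DemoExecutorType previous = seen.putIfAbsent(type.getCode(), type);
      if (previous != null) {
        duplicates.add(previous + " and " + type + " share code " + type.getCode());
      }
    }
    return duplicates;
  }

  public static void main(String[] args) {
    findDuplicateCodes().forEach(System.out::println);
  }
}
```

A check of this shape could live in a unit test so that a reused code fails the build instead of waiting for review.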
      }

      // Setup the Quota Manager
      rsQuotaManager = new RegionServerRpcQuotaManager(this);
Copilot AI, Sep 3, 2025
The removed line configurationManager.registerObserver(rsQuotaManager); appears to be accidentally deleted. This registration is necessary for quota manager configuration updates.
    - rsQuotaManager = new RegionServerRpcQuotaManager(this);
    + rsQuotaManager = new RegionServerRpcQuotaManager(this);
    + configurationManager.registerObserver(rsQuotaManager);
I think we should implement a general LogRollProcedure, not only for the backup system. The backup system can call this procedure in its backup operation, just like what LogRollMasterProcedureManager does. And I do not think we still need to use LogRollMasterProcedureManager in the proc-v2 based procedure path?
Thanks.
Thanks for reviewing, Duo. If we want to introduce a general log roll procedure, we should implement it in the hbase-server module. That is not hard. But log rolling in a backup job is a little different. As a sub-step of backup, after completing the log roll, we need to record the highest WAL filenum in the backup system table. However, hbase-backup is a high-level module built on hbase-server. In theory, hbase-server should not include any backup-related operations, so I am a little unsure how to do log rolling for backup. Here are some solutions. The second approach is to introduce new procedures in the hbase-backup module, such as BackupLogRollProcedure extends LogRollProcedure and BackupLogRollCallable extends LogRollCallable, and implement the backup-related logic in these derived subprocedures. However, we still lack an entry point for submitting such procedures. Should we add a new Backup Service to Backup.proto? Actually I don't like this approach. This implementation is too complicated for the purpose. How about I try the first method first?
3bf6fa8 to b2c7eb5
Hi Duo, would you mind reviewing the new commit? @Apache9
What about making the new rollAllWALWriters method return the last WAL file number? Although it is not commonly used, we do have a result field in the Procedure class.
Oh, thanks for the reminder, this is indeed better. Let me address it. One more thing: even if we move collecting the regionserver WAL filenum from the client to the master, RSRpcService still needs to provide an interface for the master to query the latest filenum. When implementing SnapshotProcedure before, I had thought about changing the signature of RSProcedureCallable from
We could add a field in the RemoteProcedureResult message to report the data back?
Yes, I thought so too. If you think 3.0 needs to include this feature, I can file a new issue to address it.
Added a new commit to address the comments. Let's wait for the unit test results.
Hi Duo, this feature has been addressed in this PR. Would you mind taking a look? @Apache9
Overall LGTM.
Just a concern about the protobuf definition.
And better to clean up the javac and checkstyle warnings.
Added a new commit to fix the checkstyle problem.
🎊 +1 overall
This message was automatically generated.
          return Flow.NO_MORE_STATE;
        }
      } catch (Exception e) {
        setFailure("log-roll", e);
Do we want to add a retry here? Anyway, it can be a separate issue, since the procedure does not need any cleanup or rollback after failure.
Log roll is a pretty lightweight operation, so I think it would be better to fail fast. And in LogRollCallable we will retry; the number of retries can be set by hbase.regionserver.logroll.retries.
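The split described here (bounded retries inside the callable, fail fast in the procedure) can be sketched as below. This is an illustration under assumptions, not the PR's LogRollCallable: `BoundedRetry` and `callWithRetries` are hypothetical names, and `maxAttempts` only plays the role that a value read from hbase.regionserver.logroll.retries might play.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

public class BoundedRetry {
  // Runs the action up to maxAttempts times and rethrows the last failure wrapped in an
  // IOException, so the caller can fail fast without retry logic of its own.
  static <T> T callWithRetries(Callable<T> action, int maxAttempts) throws IOException {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    Exception last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return action.call();
      } catch (Exception e) {
        last = e; // remember the failure and try again if attempts remain
      }
    }
    throw new IOException("gave up after " + maxAttempts + " attempts", last);
  }

  public static void main(String[] args) throws IOException {
    int[] calls = {0};
    // Hypothetical action: fails twice, then returns a made-up WAL file number.
    Long fileNum = BoundedRetry.<Long>callWithRetries(() -> {
      if (++calls[0] < 3) {
        throw new IOException("transient failure");
      }
      return 42L;
    }, 5);
    System.out.println(fileNum + " after " + calls[0] + " calls");
  }
}
```

Keeping the retry budget in the leaf operation means the coordinating procedure sees a single success or failure per server.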
Co-authored-by: huiruan <[email protected]> Signed-off-by: Duo Zhang <[email protected]> (cherry picked from commit ffed09d)
        }
      }

      private static void logRollV2(Connection conn, String backupRootDir) throws IOException {
I don't think that this method handles cleaning up any servers that no longer exist on the cluster, which means we'll hang onto oldWALs from those hosts indefinitely due to the BackupLogCleaner
Please correct me if I'm misunderstanding, but I think we also want to make sure that we're removing entries from the system tables for hosts that are no longer part of the cluster
Do we need something like

BackupSystemTable

    public void deleteRegionServerLastLogRollResult(String server, String backupRoot) throws IOException {
      LOG.trace("delete region server last roll log result to backup system table");
      try (Table table = connection.getTable(tableName)) {
        Delete delete = new Delete(rowkey(RS_LOG_TS_PREFIX, backupRoot, NULL, server));
        table.delete(delete);
      }
    }

BackupUtils

    private static void logRollV2(Connection conn, String backupRootDir) throws IOException {
      BackupSystemTable backupSystemTable = new BackupSystemTable(conn);
      HashMap<String, Long> lastLogRollResult =
        backupSystemTable.readRegionServerLastLogRollResult(backupRootDir);
      try (Admin admin = conn.getAdmin()) {
        Map<ServerName, Long> newLogRollResult = admin.rollAllWALWriters();
        for (Map.Entry<ServerName, Long> entry : newLogRollResult.entrySet()) {
          ServerName serverName = entry.getKey();
          long newHighestWALFilenum = entry.getValue();
          String address = serverName.getAddress().toString();
          Long lastHighestWALFilenum = lastLogRollResult.get(address);
          if (lastHighestWALFilenum != null && lastHighestWALFilenum > newHighestWALFilenum) {
            LOG.warn("Won't update last roll log result for server {}: current = {}, new = {}",
              serverName, lastHighestWALFilenum, newHighestWALFilenum);
          } else {
            backupSystemTable.writeRegionServerLastLogRollResult(address, newHighestWALFilenum,
              backupRootDir);
            if (LOG.isDebugEnabled()) {
              LOG.debug("updated last roll log result for {} from {} to {}", serverName,
                lastHighestWALFilenum, newHighestWALFilenum);
            }
          }
        }
        // New Code Here
        for (String server: lastLogRollResult.keySet()) {
          if (!newLogRollResult.containsKey(ServerName.parseServerName(server))) {
            backupSystemTable.deleteRegionServerLastLogRollResult(server, backupRootDir);
          }
        }
      }
    }

> I don't think that this method handles cleaning up any servers that no longer exist on the cluster, which means we'll hang onto oldWALs from those hosts indefinitely due to the BackupLogCleaner
> Please correct me if I'm misunderstanding, but I think we also want to make sure that we're removing entries from the system tables for hosts that are no longer part of the cluster
Thanks for reviewing.
I think the functionality of logRollV2 should be same as logRollV1, and I don't think any additional functionality should be introduced or existing functionality should be reduced.
As for the problem of cleaning up the dead servers' log roll results that you mentioned, I have a few questions. Would you mind explaining more to help me understand?
- Will keeping old WALs from dead servers indefinitely cause any problems?
- If we clean them up, is there any chance of data loss?
- Does the zk-based log roll procedure (i.e. logRollV1) need to clean them up too?
Thanks.
> and I don't think any additional functionality should be introduced or existing functionality should be reduced.

I don't necessarily agree here, though I do agree v1 isn't doing this cleanup either. To clarify, this PR doesn't introduce any issues, but thanks to your changes I think we have a good way to solve the issue I've presented. This is a beta feature, and I feel that we can iterate on the behavior of the system, especially if it means we're improving the system's efficiency by reducing storage overhead.
- Yes; the oldWALs can take up a non-trivial amount of space and should be cleaned up when they are no longer necessary. We're seeing cases where we are storing terabytes of unused, deletable data, which is expensive
- I'd like to talk this out, and make sure my logic makes sense. There are two types of backups. For full backups, it makes sense that we don't lose any data. We roll the WAL files, and then take a snapshot of all the HFiles on the cluster, so those WAL files are backed up. For incremental backups, we roll the WAL files then backup all WAL files from [<old_start_code>, newTimestamps). For both cases, I do not think there's any possibility of a data loss. I think as long as we delete entries from the system table after the backup completes, we should be okay
- It'd be nice, but I think it'd be a lot harder b/c logRollV1 never updates the backup system table with the newer timestamps as far as I can tell. Given this feature is still in beta, I'm happy to move forward with the v2 functionality and mark v1 as deprecated
Thanks for the kind reply @hgromer
After checking the code, I also feel that deleting the dead server log roll result is not likely to cause data loss, so considering that it can save a lot of unnecessary storage space, I support deleting too. If you don't mind, could you please file another issue and open a new PR to follow up on this issue? I can help review it.
Thanks.
Sounds good to me, thank you. I'll create a jira and put up a PR
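The cleanup agreed on in this thread amounts to a set difference between the servers recorded in the backup system table and the servers that answered the latest roll. A minimal sketch in plain Java, assuming both sides are keyed by the same "host:port" address string as in the snippet above; `StaleRollEntries` and `staleServers` are hypothetical names and no HBase classes are used:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class StaleRollEntries {
  // Returns the recorded addresses that are absent from the latest roll result.
  // Both arguments must use the same key form (here: "host:port" strings).
  static Set<String> staleServers(Map<String, Long> recorded, Set<String> rolledAddresses) {
    Set<String> stale = new TreeSet<>(recorded.keySet());
    stale.removeAll(rolledAddresses);
    return stale;
  }

  public static void main(String[] args) {
    // Made-up data for illustration only.
    Map<String, Long> recorded = new HashMap<>();
    recorded.put("rs1.example.com:16020", 100L);
    recorded.put("rs2.example.com:16020", 200L);
    recorded.put("rs3.example.com:16020", 300L);

    Set<String> rolled =
      new HashSet<>(Arrays.asList("rs1.example.com:16020", "rs3.example.com:16020"));

    // rs2 no longer answers log rolls, so its entry is a candidate for deletion.
    System.out.println(staleServers(recorded, rolled));
  }
}
```

Comparing like with like matters here: the roll result in the snippet is keyed by ServerName while the stored results are keyed by address, so one side has to be converted before the comparison.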
This PR tries to reimplement the log-roll procedure with proc-v2.
It modifies the following:
client side:
- When requesting all region servers to roll WAL writers, instead of calling admin.execProcedure(), we now call admin.execProcedureWithReturn, and the returned value depends on the server-side configuration. If the master is configured to use proc-v2, the value is the procedure id; otherwise nothing is returned. We then keep asking the master whether the procedure has finished by calling admin.isProcedureFinished until it finishes, fails, or times out. This is implemented in BackupUtils#rollWALWriters.
server side:
- Enhanced LogRollMasterProcedureManager to support both proc-v1 and proc-v2.
- Introduced 3 new procedures.
LogRollProcedure
The LogRollProcedure is used to roll the WAL for all region servers in the cluster. It does not acquire any lock and has 3 states:
- LOG_ROLL_PRE_CHECK_NAMESPACE: create the backup namespace if it does not exist
- LOG_ROLL_PRE_CHECK_TABLES: create the backup system table and the backup system bulkload table if they do not exist
- LOG_ROLL_ROLL_LOG_ON_EACH_RS: roll the WAL writers on all region servers
RSLogRollProcedure
The RSLogRollProcedure is used to schedule an RSLogRollRemoteProcedure for each region server. When the subprocedure returns, the RSLogRollProcedure checks the log roll result in the backup system table. If it failed, the RSLogRollProcedure schedules a new RSLogRollRemoteProcedure to retry.
RSLogRollRemoteProcedure
The RSLogRollRemoteProcedure is used to send the log roll request to the remote server.
Any suggestions and feedback are appreciated.
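The client-side wait described above (poll until finished, failed, or timed out) can be sketched as follows. This is only an illustration under assumptions: `PollUntilDone` and `pollUntilFinished` are hypothetical names, and the `BooleanSupplier` stands in for a call such as admin.isProcedureFinished rather than invoking the real Admin API.

```java
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

public class PollUntilDone {
  // Polls isFinished until it returns true or the timeout elapses.
  // Returns the number of polls performed; throws TimeoutException on timeout.
  static int pollUntilFinished(BooleanSupplier isFinished, long timeoutMs, long intervalMs)
      throws InterruptedException, TimeoutException {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    int polls = 0;
    while (true) {
      polls++;
      if (isFinished.getAsBoolean()) {
        return polls;
      }
      if (System.nanoTime() - deadline >= 0) {
        throw new TimeoutException("procedure not finished after " + timeoutMs + " ms");
      }
      Thread.sleep(intervalMs); // back off before asking again
    }
  }

  public static void main(String[] args) throws Exception {
    int[] asked = {0};
    // Hypothetical check that reports completion on the third call.
    int polls = pollUntilFinished(() -> ++asked[0] >= 3, 5_000, 10);
    System.out.println("finished after " + polls + " polls");
  }
}
```

An exception thrown by the check itself propagates to the caller, which matches the "failed" exit in the description.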