Contributor

@hgromer hgromer commented Oct 3, 2024

Contributor Author

hgromer commented Oct 3, 2024

Addressing the test failures

Contributor

@rmdmattingly rmdmattingly left a comment

One small idea, but this looks good to me

import org.apache.hadoop.hbase.client.ColumnFamilyDescriptor;
import org.apache.yetus.audience.InterfaceAudience;

@InterfaceAudience.Private
Contributor

Should this be public and catchable? I could see applications wanting to run a full backup instead

Contributor Author

Good callout, updated

@rmdmattingly
Contributor

There's a checkstyle violation to fix here: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6340/5/artifact/yetus-general-check/output/results-checkstyle-hbase-backup.txt

Otherwise this is looking good. Tagging @DieterDP-ng as well in case he has thoughts

Contributor

@DieterDP-ng DieterDP-ng left a comment

The code looks good to me.

I do wonder whether this case could also be solved in the restore code: rather than forcing the user to create a full backup, the restore code would automatically create the column family. I'm less familiar with that code, so I have no idea how much effort it would require.

Another corner case I'm curious about is what happens if a CF is deleted and re-created with the same name. I'd assume deleting a CF deletes its data. Would that mean the scenario below restores too much data?

  • Have a table with CF cf1 and some data d1 in there.
  • Create a full backup.
  • Delete cf1, recreate it
  • Add some new data d2 in cf1
  • Create an incremental backup.
  • Restore the incremental backup: will it contain only d2 (expected), or also d1 (unexpected)?

try {
Map<TableName, String> tablesToFullBackupIds = getFullBackupIds();

try (BackupAdminImpl backupAdmin = new BackupAdminImpl(conn)) {
Contributor

In the interest of allowing programmatic recovery from this case, I'd suggest:

  • check all tables before throwing an exception
  • adapt ColumnFamilyMismatchException so it contains a field with the mismatched tables.

That way, users will be able to do:

try {
  createIncrementalBackup(...)
} catch (ColumnFamilyMismatchException e) {
  createFullBackup(e.mismatchedTables);
  createIncrementalBackup(...);
}
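
The two suggestions above (check every table before throwing, and carry the mismatched tables on the exception) can be sketched roughly as follows. This is only an illustration under stated assumptions: `BackupFallbackSketch`, `BackupClient`, `verifyColumnFamilies`, and `getMismatchedTables` are hypothetical stand-ins, plain `String` table names replace `TableName`, and none of this is the actual hbase-backup code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Sketch only: every type here is a hypothetical stand-in, not the HBase backup API. */
final class BackupFallbackSketch {

  /** Hypothetical checked exception that records which tables failed the column family check. */
  static final class ColumnFamilyMismatchException extends Exception {
    private final List<String> mismatchedTables;

    ColumnFamilyMismatchException(List<String> mismatchedTables) {
      super("Column families differ from the last full backup for tables: " + mismatchedTables);
      this.mismatchedTables = Collections.unmodifiableList(new ArrayList<>(mismatchedTables));
    }

    List<String> getMismatchedTables() {
      return mismatchedTables;
    }
  }

  /** Hypothetical stand-in for the backup client used by an application. */
  interface BackupClient {
    /**
     * @throws ColumnFamilyMismatchException if any table's column families differ from those
     *         recorded in its last full backup
     */
    void createIncrementalBackup(List<String> tables) throws ColumnFamilyMismatchException;

    void createFullBackup(List<String> tables);
  }

  /** Checks every table before throwing, so the exception lists all mismatches at once. */
  static void verifyColumnFamilies(Map<String, Set<String>> current,
      Map<String, Set<String>> atFullBackup) throws ColumnFamilyMismatchException {
    List<String> mismatched = new ArrayList<>();
    for (Map.Entry<String, Set<String>> entry : current.entrySet()) {
      // A table with no recorded full backup (null) also counts as a mismatch here.
      if (!entry.getValue().equals(atFullBackup.get(entry.getKey()))) {
        mismatched.add(entry.getKey());
      }
    }
    if (!mismatched.isEmpty()) {
      throw new ColumnFamilyMismatchException(mismatched);
    }
  }

  /** Tries an incremental backup and falls back to a full backup of the mismatched tables. */
  static void backupWithFallback(BackupClient client, List<String> tables)
      throws ColumnFamilyMismatchException {
    try {
      client.createIncrementalBackup(tables);
    } catch (ColumnFamilyMismatchException e) {
      client.createFullBackup(e.getMismatchedTables());
      // Retry once; a second mismatch propagates to the caller.
      client.createIncrementalBackup(tables);
    }
  }
}
```

Collecting all mismatches first means a caller needs only one full-backup pass, instead of discovering the affected tables one failed attempt at a time.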

Contributor Author

Sure thing. Do you think it makes sense to add the exception to the method signature?

Contributor

It does. I'm a big fan of fine-grained exception info at the method level (both in the javadoc and the throws statement).

return results;
}

private void verifyHtd(TableName tn, BackupInfo fullBackupInfo) throws IOException {
Contributor

Can you add javadoc to new methods where the goal is not clear from the name? (We know the existing code isn't a great example here.)

Contributor Author

hgromer commented Oct 7, 2024

> The code looks good to me.
>
> I do wonder whether this case could also be solved in the restore code: rather than forcing the user to create a full backup, the restore code would automatically create the column family. I'm less familiar with that code, so I have no idea how much effort it would require.

It's an interesting idea, but I do think that preventing mismatched CFs outright is a potentially easy way to avoid a bunch of edge cases down the line. It also keeps the implementation fairly simple. Speaking anecdotally, adding or removing CFs isn't an HBase operation we run very often at my company, so I think erring on the side of simplicity might be the correct path forward here.

> Another corner case I'm curious about is what happens if a CF is deleted and re-created with the same name. I'd assume deleting a CF deletes its data. Would that mean the scenario below restores too much data?
>
>   • Have a table with CF cf1 and some data d1 in there.
>   • Create a full backup.
>   • Delete cf1, recreate it
>   • Add some new data d2 in cf1
>   • Create an incremental backup.
>   • Restore the incremental backup: will it contain only d2 (expected), or also d1 (unexpected)?

At that point, I think you'd restore all the data from d1 and d2. I don't know if there's any way to check the history of a table's descriptor, so at the moment we'd lose any information about changes that happen between backups. That being said, I think the behavior makes sense if you consider backups a snapshot of a single point in time. When the full backup was taken, cf1 existed with d1. When the incremental backup was taken, cf1 existed with d2. So it makes sense to restore the full dataset.

Similarly, if you're restoring from a full backup after you've deleted a CF that exists in the full backup, you'd be restoring more data than intended.

@DieterDP-ng
Contributor

> > The code looks good to me.
> > I do wonder whether this case could also be solved in the restore code: rather than forcing the user to create a full backup, the restore code would automatically create the column family. I'm less familiar with that code, so I have no idea how much effort it would require.
>
> It's an interesting idea, but I do think that preventing mismatched CFs outright is a potentially easy way to avoid a bunch of edge cases down the line. It also keeps the implementation fairly simple. Speaking anecdotally, adding or removing CFs isn't an HBase operation we run very often at my company, so I think erring on the side of simplicity might be the correct path forward here.

Agreed.

> > Another corner case I'm curious about is what happens if a CF is deleted and re-created with the same name. I'd assume deleting a CF deletes its data. Would that mean the scenario below restores too much data?
> >
> >   • Have a table with CF cf1 and some data d1 in there.
> >   • Create a full backup.
> >   • Delete cf1, recreate it
> >   • Add some new data d2 in cf1
> >   • Create an incremental backup.
> >   • Restore the incremental backup: will it contain only d2 (expected), or also d1 (unexpected)?
>
> At that point, I think you'd restore all the data from d1 and d2. I don't know if there's any way to check the history of a table's descriptor, so at the moment we'd lose any information about changes that happen between backups. That being said, I think the behavior makes sense if you consider backups a snapshot of a single point in time. When the full backup was taken, cf1 existed with d1. When the incremental backup was taken, cf1 existed with d2. So it makes sense to restore the full dataset.
>
> Similarly, if you're restoring from a full backup after you've deleted a CF that exists in the full backup, you'd be restoring more data than intended.

I agree that a backup is a point-in-time snapshot. From that point of view, a backup shouldn't restore data that was not present at the time it was taken. That isn't the case for full backups, and it shouldn't be the case for incremental backups either. (The different backup mechanisms should just be implementation details that produce the same behavior.)

Possible solutions I see:

  • Adding a hook that automatically marks backups as "full-backup only" (in the backup system table I guess) when a CF gets deleted.
  • Include CF deletions in the WAL, so they are replayed correctly when restoring incremental backups.
  • Find some way to detect whether CFs have been re-created since the last backup. Perhaps some kind of generation-id exists, or could be introduced?

Anyway, none of the above needs to be solved in this ticket. This ticket is a nice improvement on its own. The above should be logged as a new ticket though (preferably after verifying that unintended data is being restored), so we're aware of it.

@Apache-HBase

🎊 +1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
| --- | --- | --- | --- | --- |
| +0 🆗 | reexec | 0m 27s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 0s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 💚 | hbaseanti | 0m 0s | | Patch does not have any anti-patterns. |
| | | | | _ master Compile Tests _ |
| +1 💚 | mvninstall | 3m 36s | | master passed |
| +1 💚 | compile | 0m 30s | | master passed |
| +1 💚 | checkstyle | 0m 10s | | master passed |
| +1 💚 | spotbugs | 0m 30s | | master passed |
| +1 💚 | spotless | 0m 43s | | branch has no errors when running spotless:check. |
| | | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 2m 54s | | the patch passed |
| +1 💚 | compile | 0m 28s | | the patch passed |
| +1 💚 | javac | 0m 28s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 💚 | checkstyle | 0m 9s | | the patch passed |
| +1 💚 | spotbugs | 0m 34s | | the patch passed |
| +1 💚 | hadoopcheck | 10m 55s | | Patch does not cause any errors with Hadoop 3.3.6 3.4.0. |
| +1 💚 | spotless | 0m 43s | | patch has no errors when running spotless:check. |
| | | | | _ Other Tests _ |
| +1 💚 | asflicense | 0m 10s | | The patch does not generate ASF License warnings. |
| | | 28m 36s | | |

| Subsystem | Report/Notes |
| --- | --- |
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6340/8/artifact/yetus-general-check/output/Dockerfile |
| GITHUB PR | #6340 |
| JIRA Issue | HBASE-28897 |
| Optional Tests | dupname asflicense javac spotbugs checkstyle codespell detsecrets compile hadoopcheck hbaseanti spotless |
| uname | Linux 4e3f7c84d4b5 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 1b9a68f |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Max. process+thread count | 84 (vs. ulimit of 30000) |
| modules | C: hbase-backup U: hbase-backup |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6340/8/console |
| versions | git=2.34.1 maven=3.9.8 spotbugs=4.7.3 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |

This message was automatically generated.

@Apache-HBase

🎊 +1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
| --- | --- | --- | --- | --- |
| +0 🆗 | reexec | 0m 45s | | Docker mode activated. |
| -0 ⚠️ | yetus | 0m 3s | | Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck |
| | | | | _ Prechecks _ |
| | | | | _ master Compile Tests _ |
| +1 💚 | mvninstall | 3m 7s | | master passed |
| +1 💚 | compile | 0m 21s | | master passed |
| +1 💚 | javadoc | 0m 17s | | master passed |
| +1 💚 | shadedjars | 5m 21s | | branch has no errors when building our shaded downstream artifacts. |
| | | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 2m 58s | | the patch passed |
| +1 💚 | compile | 0m 18s | | the patch passed |
| +1 💚 | javac | 0m 18s | | the patch passed |
| +1 💚 | javadoc | 0m 14s | | the patch passed |
| +1 💚 | shadedjars | 5m 22s | | patch has no errors when building our shaded downstream artifacts. |
| | | | | _ Other Tests _ |
| +1 💚 | unit | 11m 48s | | hbase-backup in the patch passed. |
| | | 31m 39s | | |

| Subsystem | Report/Notes |
| --- | --- |
| Docker | ClientAPI=1.47 ServerAPI=1.47 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6340/8/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile |
| GITHUB PR | #6340 |
| JIRA Issue | HBASE-28897 |
| Optional Tests | javac javadoc unit compile shadedjars |
| uname | Linux 4f7976b7807b 5.4.0-195-generic #215-Ubuntu SMP Fri Aug 2 18:28:05 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 1b9a68f |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Test Results | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6340/8/testReport/ |
| Max. process+thread count | 3138 (vs. ulimit of 30000) |
| modules | C: hbase-backup U: hbase-backup |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6340/8/console |
| versions | git=2.34.1 maven=3.9.8 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |

This message was automatically generated.

@rmdmattingly rmdmattingly merged commit 6cdc4b6 into apache:master Oct 8, 2024
1 check passed
rmdmattingly pushed a commit that referenced this pull request Oct 8, 2024
Co-authored-by: Hernan Gelaf-Romer
Signed-off-by: Ray Mattingly
rmdmattingly pushed a commit that referenced this pull request Oct 9, 2024
Co-authored-by: Hernan Gelaf-Romer
Signed-off-by: Ray Mattingly
rmdmattingly added a commit that referenced this pull request Oct 9, 2024
…6358)

Signed-off-by: Ray Mattingly
Co-authored-by: Hernan Romer
Co-authored-by: Hernan Gelaf-Romer
rmdmattingly added a commit that referenced this pull request Oct 9, 2024
…6357)

Signed-off-by: Ray Mattingly
Co-authored-by: Hernan Romer
Co-authored-by: Hernan Gelaf-Romer
rmdmattingly added a commit that referenced this pull request Oct 10, 2024
…6358) (#6359)

Signed-off-by: Ray Mattingly
Co-authored-by: Hernan Romer
Co-authored-by: Hernan Gelaf-Romer
Member

@ndimiduk ndimiduk left a comment

Sorry @hgromer, but we cannot ship this as is because of the API compatibility annotation.

import org.apache.hadoop.hbase.client.ColumnFamilyDescriptor;
import org.apache.yetus.audience.InterfaceAudience;

@InterfaceAudience.Public
Member

Heya folks. This class hierarchy mixes IA public and private -- no can do. We can't just make its parent class, BackupException, public, because it exposes a non-public member.

Contributor Author

Gotcha, I'm more than happy to make this extend Exception instead. I'm not sure this is really an IOException, per se.
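
The constraint discussed here can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the `Private` and `Public` annotations are local stand-ins for the real `InterfaceAudience` ones, `BackupException` is a simplified stand-in for the real parent class, and `exposesPrivateAncestor` is a hypothetical helper, not a check that exists in HBase or Yetus.

```java
import java.io.IOException;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

/** Sketch only: local stand-ins that mimic the audience annotations and the exception types. */
final class AudienceHierarchySketch {

  @Retention(RetentionPolicy.RUNTIME)
  @interface Private {}

  @Retention(RetentionPolicy.RUNTIME)
  @interface Public {}

  /** Stand-in for an internal parent type that is not part of the public API. */
  @Private
  static class BackupException extends IOException {
    BackupException(String msg) {
      super(msg);
    }
  }

  /** The problematic shape: a public type whose parent is private. */
  @Public
  static final class MixedMismatchException extends BackupException {
    MixedMismatchException(String msg) {
      super(msg);
    }
  }

  /** The shape proposed in the reply: extend Exception directly. */
  @Public
  static final class ColumnFamilyMismatchException extends Exception {
    ColumnFamilyMismatchException(String msg) {
      super(msg);
    }
  }

  /** Returns true if any superclass of the given type is marked Private. */
  static boolean exposesPrivateAncestor(Class<?> type) {
    for (Class<?> c = type.getSuperclass(); c != null; c = c.getSuperclass()) {
      if (c.isAnnotationPresent(Private.class)) {
        return true;
      }
    }
    return false;
  }
}
```

With these stand-ins, `exposesPrivateAncestor` reports the mixed hierarchy for `MixedMismatchException` but not for the version that extends `Exception` directly.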

try {
fs = FileSystem.get(new URI(fullBackupInfo.getBackupRootDir()), conf);
} catch (URISyntaxException e) {
throw new IOException("Unable to get fs", e);
Member

nit: "unable to get fs for backup foo."
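
As a rough illustration of that nit, the wrapped exception could name the backup it was working on. This is a sketch under assumptions: `BackupFsErrorSketch` and `parseBackupRoot` are hypothetical, the backup id is passed in as a plain string, and only the URI parsing step is shown, not the `FileSystem.get` call.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;

/** Sketch only: a hypothetical helper, not part of hbase-backup. */
final class BackupFsErrorSketch {

  /** Parses a backup root dir; on failure, the message names the backup being read. */
  static URI parseBackupRoot(String backupId, String backupRootDir) throws IOException {
    try {
      return new URI(backupRootDir);
    } catch (URISyntaxException e) {
      // Include the backup id so the failure can be traced to a specific backup.
      throw new IOException("Unable to get fs for backup " + backupId, e);
    }
  }
}
```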

Contributor Author

Will address this

hgromer added a commit to HubSpot/hbase that referenced this pull request Feb 3, 2025