- 
                Notifications
    You must be signed in to change notification settings 
- Fork 3.4k
HBASE-28897: Force CF compatibility during incremental backup #6340
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
      
        
              This comment has been minimized.
        
        
      
    
  This comment has been minimized.
      
        
              This comment has been minimized.
        
        
      
    
  This comment has been minimized.
| Addressing the test failures | 
      
        
              This comment has been minimized.
        
        
      
    
  This comment has been minimized.
      
        
              This comment has been minimized.
        
        
      
    
  This comment has been minimized.
      
        
              This comment has been minimized.
        
        
      
    
  This comment has been minimized.
      
        
              This comment has been minimized.
        
        
      
    
  This comment has been minimized.
      
        
              This comment has been minimized.
        
        
      
    
  This comment has been minimized.
      
        
              This comment has been minimized.
        
        
      
    
  This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One small idea, but this looks good to me
| import org.apache.hadoop.hbase.client.ColumnFamilyDescriptor; | ||
| import org.apache.yetus.audience.InterfaceAudience; | ||
|  | ||
| @InterfaceAudience.Private | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be public and catchable? I could see applications wanting to run a full backup instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good callout, updated
      
        
              This comment has been minimized.
        
        
      
    
  This comment has been minimized.
      
        
              This comment has been minimized.
        
        
      
    
  This comment has been minimized.
| There's a checkstyle violation to fix here: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6340/5/artifact/yetus-general-check/output/results-checkstyle-hbase-backup.txt Otherwise this is looking good. Tagging @DieterDP-ng as well in case he has thoughts | 
      
        
              This comment has been minimized.
        
        
      
    
  This comment has been minimized.
      
        
              This comment has been minimized.
        
        
      
    
  This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code looks good to me.
I do wonder if this case couldn't be solved in the restoration code as well. So, rather than forcing a user to create a full backup instead, that the restore code would automatically create the column family. I'm less familiar with that, so no idea regarding the effort required for that.
Another corner case I'm curious about is what would happen if a CF is deleted and re-created with the same name. I'd assume deleting a CF would delete the data. So would that mean that below scenario would result in too much data being restored?
- Have a table with CF cf1 and some data d1 in there.
- Create a full backup.
- Delete cf1, recreate it
- Add some new data d2 in cf1
- Create an incremental backup.
- Restore the incremental backup: will it contain only d2 (expected), or also d1 (unexpected)?
| try { | ||
| Map<TableName, String> tablesToFullBackupIds = getFullBackupIds(); | ||
|  | ||
| try (BackupAdminImpl backupAdmin = new BackupAdminImpl(conn)) { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the interest of allowing programmatic recovery form this case, I'd suggest to:
- check all tables before throwing an exception
- adapt ColumnFamilyMismatchExceptionso it contains a field with the mismatched tables.
That way, users will be able to do:
try{
  createIncrementalBackup(...)
} catch (ColumnFamilyMismatchException e) {
  createFullBackup(e.mismatchedTables);
  createIncrementalBackup(...);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure thing, do you think it makes sense to add the exception to the method signature?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does. I'm a big fan of fine-grained exception info at the method level (both in the javadoc and the throws statement).
| return results; | ||
| } | ||
|  | ||
| private void verifyHtd(TableName tn, BackupInfo fullBackupInfo) throws IOException { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add javadoc to new methods where the goal is not clear from the name? (We know the existing code isn't a great example here.)
| 
 It's an interesting idea, but I do think that preventing mismatch CFs outright is a potentially easy way to avoid a bunch of edge cases down the line. It also keeps the implementation fairly simple. Speaking anecdotally, adding or removing CFs isn't a very frequent HBase operation we execute at my company. In which case, I think erring on the side of simplicity might be the correct path forward here. 
 At that point, I think you'd restore all the data from d1 and d2. I don't know if there's anyway to check the history of a table's Table descriptor. So at the moment we'd lose any information that happens between backups. That being said, I think the behavior makes sense if you consider backups a snapshot of a single point in time. When the full backup was taken, cf1 existed with d1. When the incremental backup was taken, cf1 existed with d2. So it makes sense to restore the full dataset. Similarly, if you're restoring from a full backup after you've deleted a CF that exists in the full backup, you'd be restoring more data than intended. | 
      
        
              This comment has been minimized.
        
        
      
    
  This comment has been minimized.
      
        
              This comment has been minimized.
        
        
      
    
  This comment has been minimized.
| 
 Agreed. 
 I agree that a backup is a point-in-time snapshot. From that point of view, a backup shouldn't restore data that was not present at the time it was taken. It isn't the case for full backups, and shouldn't be the case for incremental backups either. (The different backup mechanisms should be just be implementation details to get the same behavior.) Possible solutions I see: 
 Anyway, none of the above needs to be solved in this ticket. This ticket is a nice improvement on its own. The above should be logged as a new ticket though (preferably after verifying that unintended data is being restored), so we're aware of it. | 
| 🎊 +1 overall 
 
 This message was automatically generated. | 
| 🎊 +1 overall 
 
 This message was automatically generated. | 
Co-authored-by: Hernan Gelaf-Romer <[email protected]> Signed-off-by: Ray Mattingly <[email protected]>
Co-authored-by: Hernan Gelaf-Romer <[email protected]> Signed-off-by: Ray Mattingly <[email protected]>
Co-authored-by: Hernan Gelaf-Romer <[email protected]> Signed-off-by: Ray Mattingly <[email protected]>
Co-authored-by: Hernan Gelaf-Romer <[email protected]> Signed-off-by: Ray Mattingly <[email protected]>
…6358) Signed-off-by: Ray Mattingly <[email protected]> Co-authored-by: Hernan Romer <[email protected]> Co-authored-by: Hernan Gelaf-Romer <[email protected]>
…6358) Signed-off-by: Ray Mattingly <[email protected]> Co-authored-by: Hernan Romer <[email protected]> Co-authored-by: Hernan Gelaf-Romer <[email protected]>
…6357) Signed-off-by: Ray Mattingly <[email protected]> Co-authored-by: Hernan Romer <[email protected]> Co-authored-by: Hernan Gelaf-Romer <[email protected]>
…6358) (#6359) Signed-off-by: Ray Mattingly <[email protected]> Co-authored-by: Hernan Romer <[email protected]> Co-authored-by: Hernan Gelaf-Romer <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry @hgromer but we cannot ship this as is due to the api compatibility annotation.
| import org.apache.hadoop.hbase.client.ColumnFamilyDescriptor; | ||
| import org.apache.yetus.audience.InterfaceAudience; | ||
|  | ||
| @InterfaceAudience.Public | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heya folks. This class hierarchy mixes IA public and private -- no can do. We can't just make its parent class, BackupException, public, because it exposes a non-public member.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha, I'm more than happy to make this extends Exception instead. I'm not necessarily sure that this is an IOException, per-se
| try { | ||
| fs = FileSystem.get(new URI(fullBackupInfo.getBackupRootDir()), conf); | ||
| } catch (URISyntaxException e) { | ||
| throw new IOException("Unable to get fs", e); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: "unable to get fs for backup foo."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will address this
…6340) Co-authored-by: Hernan Gelaf-Romer <[email protected]> Signed-off-by: Ray Mattingly <[email protected]>
…6340) Co-authored-by: Hernan Gelaf-Romer <[email protected]> Signed-off-by: Ray Mattingly <[email protected]>
…6340) Co-authored-by: Hernan Gelaf-Romer <[email protected]> Signed-off-by: Ray Mattingly <[email protected]>
cc @rmdmattingly