- 
                Notifications
    You must be signed in to change notification settings 
- Fork 9.1k
Hadoop-17015. ABFS: Handling Rename and Delete idempotency #2021
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| Test results from accountin East US 2. Test results from HNS enabled account: Test results from Non-HNS enabled account: | 
        
          
                hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/utils/DateTimeUtils.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
              
          
                hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/utils/DateTimeUtils.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
              
          
                ...hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemDelete.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
              
          
                ...tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/AzureBlobFileSystemStore.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
      | return op; | ||
| } | ||
|  | ||
| /** | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be better to move the idempotency related methods to a separate Utility/Helper class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes are not utility related and are insync with the handling of the ABFS response. The reason they were included as separate methods was to enable mock testing which was otherwise not possible. Retaining the change as such.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One option is restrict the method accessibility tp package level and use @VisibleForTesting annotation. Keeping a method public solely for testing doesn't look good practice.
Also the idea is, If you have methods 'assisting', chances are the class is actually doing too much. Moving these methods into separate classes with public interfaces keeps the class with the assisting methods responsible for one thing and one thing only (see Single Responsibility Principle). This move into separte classes automatically makes for a testable structure as your methods must be made public
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AbfsClient class handles triggering of requests to Store backend and returning the AbfsRestOperation back. For Rename and Delete, the response to return is not determined if request was re-tried by the idempotent logic. It will be not right to consider these methods as "assisting" or providing a utility service and are part of the actual flow.
        
          
                ...op-azure/src/main/java/org/apache/hadoop/fs/azurebfs/constants/FileSystemConfigurations.java
          
            Show resolved
            Hide resolved
        
              
          
                ...hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemDelete.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
              
          
                ...hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemDelete.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
              
          
                ...hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemRename.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
              
          
                ...hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemRename.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
      | 🎊 +1 overall 
 
 This message was automatically generated. | 
| testRenameTimeout(HTTP_NOT_FOUND, HTTP_NOT_FOUND, true); | ||
| } | ||
|  | ||
| private void testRenameTimeout( | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't find it reasonable to keep all the cases into the same method with if-else. I think separating the same could improve readability of the test cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Over the past PRs we have had several instances of reusable code being duplicated. These test cases share a lot of reusable code, hence the reusable code is put into a method. Have made some minor updates which might help improve readability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I see common code as (1. Creating FS instance, 2. Creating testClient instance, 3. Creating mock AbfsRestOperation  instance, 4. Common assertion). This is clubbed with the non common codes (1. Creating http400Op  instance for one particular test case, 2. Creating http404Op  instance for another particular test case)
 Shall we move the common instances that are required to a separate private methods and call those methods in each test case? That would help improve readability and solved the problem of code duplication.
- I think we should relook and address the issues with the old PRs if those are recent, where resusable code is duplicated. Could you please create a workitem for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Creating a method, passing over all indiividual mock instances created back to test method will actually render code more unreadable. Will leave this comment open if any other reviewer feels the code is unreadable too.
- For issues that have come across for my code changes, i have made modifications with the respective PRs. Should definitely keep a look out for any that come across.
        
          
                ...ols/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsRestOperation.java
          
            Show resolved
            Hide resolved
        
      There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check the comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change the heading as follows.
HADOOP-17015. ABFS: Handling Rename and Delete idempotency
| 
 The following doesn't look right. | 
| @bilaharith Comments have been addressed. Please take a look. For the latest 
 Was a typo for 'retried'. Fixed it. | 
| 
 I believe you were highlighting a space needed before ABFS. Have made that change. | 
| Test results from latest updates: Non-HNS | 
| 🎊 +1 overall 
 
 This message was automatically generated. | 
        
          
                hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClient.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
      There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see the comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check the comments
        
          
                ...hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemDelete.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
      There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check the comment
        
          
                hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/AbfsConfiguration.java
          
            Show resolved
            Hide resolved
        
      | 💔 -1 overall 
 
 This message was automatically generated. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check the comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please find responses above.
For yetus run, the failure is for shaded client.
[INFO] Apache Hadoop Client Aggregator .................... SUCCESS [  2.195 s]
[INFO] Apache Hadoop Client API ........................... SUCCESS [02:14 min]
[INFO] Apache Hadoop Client Runtime ....................... SUCCESS [02:18 min]
[INFO] Apache Hadoop Client Packaging Invariants .......... FAILURE [  1.539 s]
[INFO] Apache Hadoop Client Test Minicluster .............. SUCCESS [04:03 min]
[INFO] Apache Hadoop Client Packaging Invariants for Test . FAILURE [  0.200 s]
[INFO] Apache Hadoop Client Packaging Integration Tests ... SUCCESS [ 22.661 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  14:14 min
[INFO] Finished at: 2020-05-14T18:18:19Z
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.3.1:exec (check-jar-contents) on project hadoop-client-check-invariants: Command execution failed. Process exited with an error: 1 (Exit value: 1) -> [Help 1]
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-enforcer-plugin:3.0.0-M1:enforce (enforce-banned-dependencies) on project hadoop-client-check-test-invariants: Some Enforcer rules have failed. Look above for specific messages explaining why the rule failed. -> [Help 1]
Failure is for enforce-banned-dependecies on hadoop-client-check-invariants. There are no updates to dependencies in this PR and even if hadoop-azure is a dependent module for hadoop-client-invariants.
Please go ahead and review and I will sort this one out of one of the committers. (to see if any action is needed.)
| assertEquals(DelegatingSSLSocketFactory.SSLChannelMode.OpenSSL, localAbfsConfiguration.getPreferredSSLFactoryOption()); | ||
| } | ||
|  | ||
| public static AbfsConfiguration updateRetryConfigs(AbfsConfiguration abfsConfig, | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if this class is the right place for this method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the unit test class for AbfsConfiguration. I do not want to create separate utility class just to return a test instance of main class.
| .contains(DEFAULT_VALUE_UNKNOWN); | ||
| } | ||
|  | ||
| public static AbfsClient createTestClientFromCurrentContext( | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same doubt as the above comment.
TestAbfsClient should have test methods the AbfsClient and supporting methods.
This method looks more like a Utility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, dont want to create a separate test utility class for AbfsClient alone to return a test instance and hence have placed it in the unit test class for AbfsClient.
For future tests that will need to mock or create new instances, it will be easy to check respective unit test class for any method available.
        
          
                ...hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemDelete.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
      There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
| Latest test results: Non-HNS | 
| 💔 -1 overall 
 
 This message was automatically generated. | 
| 💔 -1 overall 
 
 This message was automatically generated. | 
| Looks good! @snvijaya please resolve the conflicts then I will help to commit. | 
| Thanks @DadanielZ . Have resolved conflicts and updated PR. | 
| 💔 -1 overall 
 
 This message was automatically generated. | 
| 🎊 +1 overall 
 
 This message was automatically generated. | 
| LGTM, +1. Thanks! | 
Requests failing due to server timeouts and network failures are retried. At times the request might have succeeded at the server end when client received a timeout. In such case, a retry done for Rename or Delete operation will end in an user error for HTTP 404 (Not Found). In case of Rename, user error is NOT_FOUND as the source file is already renamed and is no longer available.
When the user error is thrown back to the calling application it mostly leads to job failure and is hard to debug from client side on how rename or delete ended up with an user error of file/folder not being present.
Rest of the PUT/POST operations get executed again at server end and maintains the idempotency state and needs no handling in ABFS Driver.
In this PR:
Mock tests to mimic a retried rename/delete request are added.