Conversation

@steveloughran
Contributor

make rename/3 public. Rollup of #743 commits rebased to trunk

@steveloughran
Contributor Author

Full commit logs of the squashed commits:


PR with the previous patches.

This is a WiP but ready for some initial review; lacks tests & spec.

Also, because the base FileSystem.rename/3 does its own src/dest checks, it's less efficient against object stores. They need their own high-perf subclass.

Change-Id: I1586ef2290d7a3d2d33b1a32e2f0999b07c26143

HADOOP-11452 rename/3 has tests for local fs, raw local and hdfs.

  • updates docs
  • contains fix for HADOOP-16255: checksum FS doesn't do rename/3 properly

The tests are based on those of rename; there's still some behaviour around renaming into an empty directory which makes me suspect there's ambiguity there

Change-Id: Ic1ca6cbe3e9ca80ab8e1459167a7012678e856fc

HADOOP-11452 rename(path, path, options) to become public

  • exceptions match expectations of FSMainOperationsBaseTest
  • clean up FSMainOperationsBaseTest by moving to intercept, try with resources.
  • S3A to implement subclass of FSMainOperationsBaseTest, ITestS3AFSMainOperations
  • S3A to implement ITestS3AContractRenameEx
  • Add protected override point, executeInnerRename, for implementing rename/3; base class calls rename() and throws if that returns false, i.e. current behaviour
  • S3A overrides executeInnerRename to call its own innerRename() and so raise all failures as IOEs (see the sketch below)
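
For illustration only, a minimal sketch of how such an override point might hang together; the names executeInnerRename and rename/3 come from the commit log above, but this class, its signatures and its checks are assumptions rather than the actual patch.

import java.io.IOException;
import org.apache.hadoop.fs.Options;
import org.apache.hadoop.fs.Path;

public abstract class SketchFileSystem {

  /** Public rename/3: spec checks first, then the override point. */
  public void rename(Path src, Path dst, Options.Rename... options)
      throws IOException {
    // ... src/dest precondition checks mandated by the spec go here ...
    executeInnerRename(src, dst, options);
  }

  /**
   * Protected override point. Default behaviour: delegate to the classic
   * boolean rename() and convert a false return into an IOException,
   * i.e. exactly what callers see today.
   */
  protected void executeInnerRename(Path src, Path dst,
      Options.Rename... options) throws IOException {
    if (!rename(src, dst)) {
      throw new IOException("rename of " + src + " to " + dst + " failed");
    }
  }

  /** Classic rename API; stores already implement this. */
  public abstract boolean rename(Path src, Path dst) throws IOException;
}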

Issues: is the rename point executeInnerRename() a good name?

S3AFS should really implement a direct rename/3 so there's no duplication of checks for parent etc, maybe even passing
the values down. Or we make sure innerRename() is consistent with the spec, which primarily means logic about dest dir existing.

Change-Id: I0ac3695434d85072ab860854e5e88bc6d36e754a

HADOOP-11452 trying to move rename/3 logic into its own class

Change-Id: If2ab67152e08e4c2a225f6e89c24a5d1ff79ee59

HADOOP-15183: S3AFileSystem does rename/3

Factored the rename check logic out into a RenameHelper, which is then used in S3AFileSystem as the proof of concept of how a store can use the RenameHelper and then directly invoke the inner operations. Added some more @Retry attributes as appropriate.

Conclusion: it works, but for efficient IOPS then innerRename() needs to take optional source and dest status values and so omit repeating the checks.
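
As a purely illustrative sketch of that conclusion (the interface name, parameter names and the null-means-probe convention are assumptions, not what the patch does):

import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.Path;

/**
 * Hypothetical variant of innerRename() which accepts any status values
 * the caller has already fetched, so the implementation can skip
 * repeating those probes against the store.
 */
interface RenameWithStatusHints {
  void innerRename(Path src, Path dst,
      FileStatus srcStatus,   // may be null: implementation probes itself
      FileStatus dstStatus)   // may be null: implementation probes itself
      throws IOException;
}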

For more work on this

  • the tests; FileContext has some, so review those and add to AbstractContractRenameTest. Also, based on some feedback from Sean Mackrory: verify the renamed files actually contain the data.
  • move internal uses of rename (distcp, MRv2 FileOutputCommitter, etc.) over to this.

Of the 50+ places which call rename, they seem to split three ways into:

  1. subclasses and forwarding
  2. invocations which check the return value and throw an IOE
  3. invocations which are written on the assumption that renames raise exceptions on failure

2 & 3 are the ones to change.
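
For illustration, a hedged sketch of the category-2 pattern today and what switching those call sites to an exception-raising rename/3 could look like; it assumes the Options.Rename overload becomes public as this PR proposes.

import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Options;
import org.apache.hadoop.fs.Path;

final class RenameCallSites {

  // Category 2 today: check the boolean result and hand-roll an IOException.
  static void renameOrFail(FileSystem fs, Path src, Path dst)
      throws IOException {
    if (!fs.rename(src, dst)) {
      throw new IOException("Failed to rename " + src + " to " + dst);
    }
  }

  // After the change (sketch): rename/3 raises meaningful IOEs itself,
  // so both category 2 and category 3 callers can just invoke it.
  static void renameWithRename3(FileSystem fs, Path src, Path dst)
      throws IOException {
    fs.rename(src, dst, Options.Rename.NONE);
  }
}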

Change-Id: Id77ed7352b9d5ddb124f9191c5c5f1b8a76da7bb

HADOOP-11452. Rename

Review of RenameHelper based on current coding styles and
plans (IOStats, etc)

Change-Id: I3d39ee3ed04a10e7db2c2b2c79833b945b4d691b

HADOOP-11452 Rename/3

S3A high performance edition.

This avoids all surplus S3 calls and has meaningful exception
raising.

TODO:

  • pull the S3A code out into its own operation + extra callbacks
    (innerGetFileStatus is all that's needed)
  • see if the FileContext default logic can be pulled out too, using
    a custom set of callbacks. If it can't, the logic is broken.
  • do some testing

Change-Id: I408b2cfe93f266cf0c9084fa8f05bb84b65c2bad

HADOOP-11452 Rename/3

  • Add RawLocalFileSystem rename with useful errors
  • pull out all rename docs into their own filesystem.md doc
  • Add a callback impl which => FileContext too, at least for
    the nonatomic rename. FC doesn't do path name checking. Should we?

Proposed changes

  • move the new interfaces up to o.a.h.fs, so that .impl is never
    imported in FileSystem APIs.
  • remove the createRename callbacks method; just have stores which
    implement rename/3 (other than the base FS) override all of
    rename/3.

Change-Id: I1fab598553b8e9de4d659b80248bac440dbac018

@steveloughran steveloughran force-pushed the filesystem/HADOOP-11452-rename branch 2 times, most recently from 85015d4 to 744b0b8 Compare March 2, 2021 15:41
@steveloughran
Contributor Author

Quick review of this, especially the factoring out

Good:

  • unified logic in one place

Bad:

  • it's still complex
  • it's not easy for S3A to optimise further

Looking at s3a rename, I now want:

  • getFileStatus on source file to return the metadata so that there's no second call on the copy
  • probe for the parent dir where it expects a dir, e.g. a callback getDirStatus(dir).
  • if the source is a dir, to initiate a LIST and have that feed straight into the object scan. That's tricky, but you know...
  • option to turn off the check for the dir existing.
  • async fetch of all the probes

So: all the callbacks would be CompletableFutures; the default impl would be sequential, but for the stores we could go async (see the sketch below).

  • split into API (o.a.h.fs.api) and impl and have something in API to be the factory for the impl: downstream classes do not need to look into .impl.
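
Purely speculative sketch of the callbacks-as-futures idea; the interface name and method set are invented for illustration and are not part of this patch.

import java.util.concurrent.CompletableFuture;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.Path;

/**
 * Hypothetical callback surface for the rename precondition probes.
 * A plain filesystem would complete these synchronously; an object store
 * could issue its HEAD/LIST requests in parallel and join them later.
 */
interface AsyncRenameCallbacks {

  /** Status of the rename source, also reusable as copy metadata. */
  CompletableFuture<FileStatus> getSourceStatus(Path src);

  /** Status of the destination parent, which is expected to be a directory. */
  CompletableFuture<FileStatus> getDirStatus(Path dir);

  /** Probe for the destination itself; completes exceptionally with
   *  FileNotFoundException when nothing is there. */
  CompletableFuture<FileStatus> getDestStatus(Path dest);
}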

No timetable for this.

@steveloughran steveloughran force-pushed the filesystem/HADOOP-11452-rename branch from 744b0b8 to 3e36589 Compare April 25, 2021 19:24
@jteagles
Contributor

@steveloughran, is this still active? I'm interested in using this functionality and would be willing to help review.

@steveloughran
Contributor Author

steveloughran commented Aug 4, 2021

I have stopped working on this. Feel free to take it up

I originally thought "hey, we could just make this public and there'd be a good rename", but as usual the challenge becomes one of strictly implementing the preconditions. FileContext does that, though non-atomically; factoring out all that policy at least makes things consistent.

But, having dealt with other rename-related trouble recently, I'm now thinking I'd really want a new builder-based rename:

Future<RenameOutcome> foutcome = FS.renamePath(source, dest)
   .must("fs.opt.rename.atomic", true)
   .may("fs.opt.rename.etag", "afee338b")
   .build();

RenameOutcome outcome = foutcome.get();


class RenameOutcome implements IOStatisticsSource {

}

Why this?

  • allows stores which count the IO costs of renames to report them
  • you need some kind of return type for Java futures
  • builder options would let you say whether you MUST have atomic rename, in which case: no s3a, wasb or gcs rename for you
  • add the ability to pass in things like etags

Why async?

  • so slow stores can be obviously slow about it
  • let you pass in a Progressable. Distcp could do this to stop tasks failing during rename of big files on non-direct uploads to s3; same for FileOutputCommitter:

Future<RenameOutcome> foutcome = FS.renamePath(source, dest)
   .withProgress(reporter)
   .build();
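
For illustration only, a minimal sketch of the builder surface implied by the snippets above; RenamePathBuilder, RenameOutcome and renamePath() are all hypothetical names, and a CompletableFuture is used where the snippet shows a plain Future.

import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import org.apache.hadoop.fs.statistics.IOStatistics;
import org.apache.hadoop.fs.statistics.IOStatisticsSource;
import org.apache.hadoop.util.Progressable;

/** Hypothetical outcome type; lets stores report the IO cost of a rename. */
class RenameOutcome implements IOStatisticsSource {
  @Override
  public IOStatistics getIOStatistics() {
    return null; // placeholder in this sketch
  }
}

/** Hypothetical builder, modelled on FS.renamePath(source, dest) above. */
interface RenamePathBuilder {
  /** Mandatory option: the rename fails if the store cannot honour it. */
  RenamePathBuilder must(String key, boolean value);

  /** Optional hint, e.g. an etag; stores may ignore it. */
  RenamePathBuilder may(String key, String value);

  /** Progress callback, e.g. to keep distcp tasks alive during long copies. */
  RenamePathBuilder withProgress(Progressable progress);

  /** Kick off the (possibly asynchronous) rename. */
  CompletableFuture<RenameOutcome> build() throws IOException;
}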

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 35s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+0 🆗 markdownlint 0m 1s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 9 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 16m 2s Maven dependency ordering for branch
+1 💚 mvninstall 27m 28s trunk passed
+1 💚 compile 23m 1s trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04
+1 💚 compile 20m 23s trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08
+1 💚 checkstyle 3m 50s trunk passed
+1 💚 mvnsite 5m 33s trunk passed
-1 ❌ javadoc 1m 13s /branch-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt hadoop-common in trunk failed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.
+1 💚 javadoc 4m 16s trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08
+1 💚 spotbugs 10m 8s trunk passed
+1 💚 shadedclient 20m 54s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 30s Maven dependency ordering for patch
-1 ❌ mvninstall 0m 20s /patch-mvninstall-hadoop-common-project_hadoop-common.txt hadoop-common in the patch failed.
-1 ❌ mvninstall 0m 48s /patch-mvninstall-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch failed.
-1 ❌ mvninstall 0m 21s /patch-mvninstall-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
-1 ❌ compile 0m 42s /patch-compile-root-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt root in the patch failed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.
-1 ❌ javac 0m 42s /patch-compile-root-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt root in the patch failed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.
-1 ❌ compile 0m 41s /patch-compile-root-jdkPrivateBuild-1.8.0_352-8u352-ga-1~20.04-b08.txt root in the patch failed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08.
-1 ❌ javac 0m 41s /patch-compile-root-jdkPrivateBuild-1.8.0_352-8u352-ga-1~20.04-b08.txt root in the patch failed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08.
-1 ❌ blanks 0m 0s /blanks-eol.txt The patch has 7 line(s) that end in blanks. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
-0 ⚠️ checkstyle 3m 15s /results-checkstyle-root.txt root: The patch generated 76 new + 385 unchanged - 23 fixed = 461 total (was 408)
-1 ❌ mvnsite 0m 24s /patch-mvnsite-hadoop-common-project_hadoop-common.txt hadoop-common in the patch failed.
-1 ❌ mvnsite 0m 52s /patch-mvnsite-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch failed.
-1 ❌ mvnsite 0m 23s /patch-mvnsite-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
-1 ❌ javadoc 0m 22s /patch-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt hadoop-common in the patch failed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.
-1 ❌ javadoc 0m 22s /patch-javadoc-hadoop-common-project_hadoop-common-jdkPrivateBuild-1.8.0_352-8u352-ga-1~20.04-b08.txt hadoop-common in the patch failed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08.
-1 ❌ javadoc 0m 26s /patch-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_352-8u352-ga-1~20.04-b08.txt hadoop-aws in the patch failed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08.
-1 ❌ spotbugs 0m 22s /patch-spotbugs-hadoop-common-project_hadoop-common.txt hadoop-common in the patch failed.
-1 ❌ spotbugs 0m 51s /patch-spotbugs-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch failed.
-1 ❌ spotbugs 0m 22s /patch-spotbugs-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
-1 ❌ shadedclient 3m 34s patch has errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 0m 22s /patch-unit-hadoop-common-project_hadoop-common.txt hadoop-common in the patch failed.
+1 💚 unit 2m 22s hadoop-hdfs-client in the patch passed.
-1 ❌ unit 0m 51s /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch failed.
-1 ❌ unit 0m 22s /patch-unit-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
+1 💚 asflicense 0m 31s The patch does not generate ASF License warnings.
164m 10s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2735/2/artifact/out/Dockerfile
GITHUB PR #2735
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint
uname Linux 7e7b94ea712e 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 73ae1f6
Default Java Private Build-1.8.0_352-8u352-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_352-8u352-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2735/2/testReport/
Max. process+thread count 557 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2735/2/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@steveloughran steveloughran marked this pull request as draft December 16, 2022 14:24
@steveloughran steveloughran force-pushed the filesystem/HADOOP-11452-rename branch from 73ae1f6 to da3d476 Compare January 30, 2023 12:16
@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 45s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 9 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 15m 13s Maven dependency ordering for branch
+1 💚 mvninstall 33m 31s trunk passed
+1 💚 compile 25m 14s trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04
+1 💚 compile 22m 58s trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08
+1 💚 checkstyle 4m 2s trunk passed
+1 💚 mvnsite 6m 25s trunk passed
-1 ❌ javadoc 1m 10s /branch-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt hadoop-common in trunk failed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.
+1 💚 javadoc 4m 52s trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08
+1 💚 spotbugs 12m 19s trunk passed
+1 💚 shadedclient 26m 48s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 29s Maven dependency ordering for patch
-1 ❌ mvninstall 0m 20s /patch-mvninstall-hadoop-common-project_hadoop-common.txt hadoop-common in the patch failed.
-1 ❌ mvninstall 0m 51s /patch-mvninstall-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch failed.
-1 ❌ mvninstall 0m 18s /patch-mvninstall-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
-1 ❌ compile 0m 44s /patch-compile-root-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt root in the patch failed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.
-1 ❌ javac 0m 44s /patch-compile-root-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt root in the patch failed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.
-1 ❌ compile 0m 40s /patch-compile-root-jdkPrivateBuild-1.8.0_352-8u352-ga-1~20.04-b08.txt root in the patch failed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08.
-1 ❌ javac 0m 40s /patch-compile-root-jdkPrivateBuild-1.8.0_352-8u352-ga-1~20.04-b08.txt root in the patch failed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08.
-1 ❌ blanks 0m 0s /blanks-eol.txt The patch has 7 line(s) that end in blanks. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
-0 ⚠️ checkstyle 3m 35s /results-checkstyle-root.txt root: The patch generated 76 new + 385 unchanged - 23 fixed = 461 total (was 408)
-1 ❌ mvnsite 0m 24s /patch-mvnsite-hadoop-common-project_hadoop-common.txt hadoop-common in the patch failed.
-1 ❌ mvnsite 0m 54s /patch-mvnsite-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch failed.
-1 ❌ mvnsite 0m 21s /patch-mvnsite-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
-1 ❌ javadoc 0m 20s /patch-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt hadoop-common in the patch failed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.
-1 ❌ javadoc 0m 22s /patch-javadoc-hadoop-common-project_hadoop-common-jdkPrivateBuild-1.8.0_352-8u352-ga-1~20.04-b08.txt hadoop-common in the patch failed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08.
-1 ❌ javadoc 0m 25s /patch-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_352-8u352-ga-1~20.04-b08.txt hadoop-aws in the patch failed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08.
-1 ❌ spotbugs 0m 21s /patch-spotbugs-hadoop-common-project_hadoop-common.txt hadoop-common in the patch failed.
-1 ❌ spotbugs 0m 51s /patch-spotbugs-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch failed.
-1 ❌ spotbugs 0m 20s /patch-spotbugs-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
-1 ❌ shadedclient 2m 41s patch has errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 0m 22s /patch-unit-hadoop-common-project_hadoop-common.txt hadoop-common in the patch failed.
+1 💚 unit 2m 21s hadoop-hdfs-client in the patch passed.
-1 ❌ unit 0m 52s /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch failed.
+1 💚 unit 23m 23s hadoop-hdfs-rbf in the patch passed.
-1 ❌ unit 0m 23s /patch-unit-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
+1 💚 asflicense 0m 33s The patch does not generate ASF License warnings.
213m 13s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2735/3/artifact/out/Dockerfile
GITHUB PR #2735
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint
uname Linux 76efe0cb5cb6 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / da3d476
Default Java Private Build-1.8.0_352-8u352-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_352-8u352-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2735/3/testReport/
Max. process+thread count 3334 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs-rbf hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2735/3/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 12m 15s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+0 🆗 markdownlint 0m 1s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 9 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 14m 52s Maven dependency ordering for branch
+1 💚 mvninstall 33m 37s trunk passed
+1 💚 compile 17m 29s trunk passed with JDK Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04
+1 💚 compile 16m 27s trunk passed with JDK Private Build-1.8.0_422-8u422-b05-1~20.04-b05
+1 💚 checkstyle 4m 30s trunk passed
+1 💚 mvnsite 6m 34s trunk passed
+1 💚 javadoc 5m 30s trunk passed with JDK Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04
+1 💚 javadoc 5m 33s trunk passed with JDK Private Build-1.8.0_422-8u422-b05-1~20.04-b05
+1 💚 spotbugs 11m 34s trunk passed
+1 💚 shadedclient 36m 1s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 33s Maven dependency ordering for patch
-1 ❌ mvninstall 0m 23s /patch-mvninstall-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
-1 ❌ compile 16m 14s /patch-compile-root-jdkUbuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04.txt root in the patch failed with JDK Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04.
-1 ❌ javac 16m 14s /patch-compile-root-jdkUbuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04.txt root in the patch failed with JDK Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04.
-1 ❌ compile 15m 53s /patch-compile-root-jdkPrivateBuild-1.8.0_422-8u422-b05-1~20.04-b05.txt root in the patch failed with JDK Private Build-1.8.0_422-8u422-b05-1~20.04-b05.
-1 ❌ javac 15m 53s /patch-compile-root-jdkPrivateBuild-1.8.0_422-8u422-b05-1~20.04-b05.txt root in the patch failed with JDK Private Build-1.8.0_422-8u422-b05-1~20.04-b05.
-1 ❌ blanks 0m 0s /blanks-eol.txt The patch has 7 line(s) that end in blanks. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
-0 ⚠️ checkstyle 4m 29s /results-checkstyle-root.txt root: The patch generated 67 new + 392 unchanged - 23 fixed = 459 total (was 415)
-1 ❌ mvnsite 0m 48s /patch-mvnsite-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
-1 ❌ javadoc 1m 14s /results-javadoc-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04.txt hadoop-common-project_hadoop-common-jdkUbuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04 with JDK Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04 generated 21 new + 0 unchanged - 0 fixed = 21 total (was 0)
-1 ❌ javadoc 0m 52s /results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04.txt hadoop-tools_hadoop-aws-jdkUbuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04 with JDK Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04 generated 2 new + 2 unchanged - 0 fixed = 4 total (was 2)
-1 ❌ javadoc 0m 49s /results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_422-8u422-b05-1~20.04-b05.txt hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_422-8u422-b05-1~20.04-b05 with JDK Private Build-1.8.0_422-8u422-b05-1~20.04-b05 generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0)
-1 ❌ spotbugs 0m 44s /patch-spotbugs-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
+1 💚 shadedclient 35m 47s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 19m 38s /patch-unit-hadoop-common-project_hadoop-common.txt hadoop-common in the patch passed.
+1 💚 unit 2m 47s hadoop-hdfs-client in the patch passed.
-1 ❌ unit 220m 41s /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch passed.
-1 ❌ unit 30m 54s /patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt hadoop-hdfs-rbf in the patch passed.
-1 ❌ unit 0m 57s /patch-unit-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
+1 💚 asflicense 1m 14s The patch does not generate ASF License warnings.
551m 4s
Reason Tests
Failed junit tests hadoop.fs.viewfs.TestViewFileSystemWithAuthorityLocalFileSystem
hadoop.fs.viewfs.TestViewFsWithAuthorityLocalFs
hadoop.fs.viewfs.TestFcMainOperationsLocalFs
hadoop.fs.TestChecksumFs
hadoop.fs.TestSymlinkLocalFSFileContext
hadoop.fs.TestLocalFSFileContextCreateMkdir
hadoop.fs.viewfs.TestFcCreateMkdirLocalFs
hadoop.fs.contract.rawlocal.TestRawlocalContractRenameEx
hadoop.fs.TestTrash
hadoop.fs.viewfs.TestViewFileSystemLocalFileSystem
hadoop.fs.TestLocalFSFileContextMainOperations
hadoop.fs.viewfs.TestChRootedFs
hadoop.fs.viewfs.TestViewFsLocalFs
hadoop.fs.viewfs.TestFSMainOperationsLocalFileSystem
hadoop.fs.TestSymlinkLocalFSFileSystem
hadoop.fs.TestFSMainOperationsLocalFileSystem
hadoop.fs.viewfs.TestViewFsTrash
hadoop.fs.contract.localfs.TestLocalFSContractRenameEx
hadoop.fs.contract.hdfs.TestHDFSContractRename
hadoop.hdfs.server.namenode.TestCreateEditsLog
hadoop.hdfs.TestHDFSTrash
hadoop.fs.contract.router.TestRouterHDFSContractRename
hadoop.fs.contract.router.web.TestRouterWebHDFSContractRename
hadoop.fs.contract.router.TestRouterHDFSContractRenameSecure
Subsystem Report/Notes
Docker ClientAPI=1.47 ServerAPI=1.47 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2735/4/artifact/out/Dockerfile
GITHUB PR #2735
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint
uname Linux 7bcfce963ba5 5.15.0-117-generic #127-Ubuntu SMP Fri Jul 5 20:13:28 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 33c7641
Default Java Private Build-1.8.0_422-8u422-b05-1~20.04-b05
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_422-8u422-b05-1~20.04-b05
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2735/4/testReport/
Max. process+thread count 3858 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs-rbf hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2735/4/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@steveloughran
Contributor Author

closing issue, giving up on this
