@steveloughran
Contributor

@steveloughran steveloughran commented Jul 17, 2019

This patch is the preamble to any fix: working out what is wrong.
With a side benefit of running the tests much faster.

  • Replaces the subclasses of the various committers with a parameterization
    of a single test. This is done via subclasses which define the specific test
    parameters and validate the output.
  • Which allows the tests to share the same cluster setup, rather than doing
    it once for each test
  • Makes the creation of an HDFS FS optional. As well as massively
    speeding up test setup, this means the logs of the test runs get collected.

Oh, and did I say it's faster? Not by much; it's still ~3 minutes, but I've cut the file size down on a scale test run (leave that for other testers).

Now all the logs from test runs go into target/yarn-${timestamp}; once you find the relevant stdout files from the MR App Master you can see what went on. Added some details to the log messages to make them more meaningful.

Testing: S3 Ireland with/without s3guard (dynamo)

@steveloughran steveloughran force-pushed the s3/HADOOP-16207-testMR branch from 98fb648 to 5c10ce9 Compare July 18, 2019 12:44
@steveloughran
Contributor Author

Testing: S3 Ireland. Not retested since rebasing to trunk; will do when I have 1h to spare.

Currently this patch is not going to fix the testMR failures. What it will do is speed up test runs slightly, save the logs locally, and provide some more details on what it means when the MR job returns "false".

@steveloughran
Contributor Author

Latest iteration should be more informative on a failure; I think I'd like to go one step further and see if I could actually put the logs of the failing MR app into the test log, which is a matter of:

Given the app ID and path of the YARN cluster logs, identify the stdout file of the first failing AM execution; load this and then log it such that it appears in the -output.txt file of the test. This will ensure that even a Jenkins test failure where you can't see the failures will still include the job failure details.
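A minimal sketch of the lookup step described above, not the code in this patch. It assumes a log layout of `<root>/.../<appId>/<containerId>/stdout` and that the AM of each attempt runs in the container whose ID ends in `_000001`; both are assumptions about the mini cluster's output, and `AmLogFinder` is a hypothetical helper name. It returns the AM stdout files in attempt order so the caller can load and log the first one of interest.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper: locates AM stdout files under a local YARN log directory. */
final class AmLogFinder {

  private AmLogFinder() {
  }

  /**
   * Walk the log tree and return the stdout files of the AM containers
   * of the given application, ordered by container directory name
   * (so attempt 01 sorts before attempt 02).
   */
  static List<Path> findAmStdout(Path yarnLogRoot, String appId) {
    try (Stream<Path> files = Files.walk(yarnLogRoot)) {
      return files
          .filter(p -> Files.isRegularFile(p))
          .filter(p -> "stdout".equals(p.getFileName().toString()))
          .filter(p -> isAmContainerOf(p.getParent(), appId))
          .sorted(Comparator.comparing(
              (Path p) -> p.getParent().getFileName().toString()))
          .collect(Collectors.toList());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Assumed convention: the AM is container number 000001 of each attempt. */
  private static boolean isAmContainerOf(Path containerDir, String appId) {
    if (containerDir == null || containerDir.getFileName() == null) {
      return false;
    }
    Path appDir = containerDir.getParent();
    if (appDir == null || appDir.getFileName() == null) {
      return false;
    }
    String container = containerDir.getFileName().toString();
    return appId.equals(appDir.getFileName().toString())
        && container.startsWith("container_")
        && container.endsWith("_000001");
  }
}
```

Sorting by container directory name rather than by full path keeps the attempt order stable even when different attempts ran on different node managers.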

@steveloughran steveloughran requested a review from bgaborg July 19, 2019 13:54
@steveloughran steveloughran force-pushed the s3/HADOOP-16207-testMR branch from af32ecd to 27faf9a Compare July 22, 2019 12:54
@steveloughran steveloughran added the fs/s3 changes related to hadoop-aws; submitter must declare test endpoint label Jul 22, 2019
@steveloughran
Contributor Author

Tested: S3 Ireland. No failures on my test run.

I do hope this modified test run will pick up on any failures which have been happening on other PRs, e.g. #1123, so we can then track down what the failure was.

@steveloughran
Contributor Author

Note that this test deletes four committer tests but only parameterizes three: directory, partitioned and magic. We don't do an explicit Staging committer, just its two subclasses. That's because those are the actual committers people are instructed to use, and we save one test run by cutting it.

@apache apache deleted a comment from hadoop-yetus Sep 3, 2019
This patch is the preamble to any fix: working out what is wrong.
With a side benefit of running the tests much faster.

-Replaces the subclasses of the various committers with a parameterization
 of a single test. This is done via subclasses which define the specific test
 parameters and validate the output.
-Which allows the tests to share the same cluster setup, rather than doing
 it once for each test
-Makes the creation of an HDFS FS optional. As well as massively
 speeding up test setup, this means the logs of the test runs get collected.

Oh, and did I say it's faster?

Change-Id: I6c89424071e08cdb58ea5bf3e33418ada0e27b01
Change-Id: I30afdd3f1114ceded55ca43e47c701cff9ccdc54
Change-Id: Idd74dcef11da04c884f5d73347501c82a6380068
TODO: include the appId in the errors, as now there is >1 app in the same cluster, you need that.
Those about test names can be ignored, as we are using the numbering for executing tests in order

Change-Id: I998077eb6147737fd6deba4d5a0c5c802f4a23dd
+ fix IDE hints

Change-Id: I28ac4c744677bb6e5eae0c95dda0bafc7fbda208
…between this PR and trunk

Change-Id: I9a10e1852e79a427f8e635da224d496cdf1b5d4e
Tracked down bug where staging dir was in a temp path under the NM's dir tree, so even though the fs was shared, only tasks running on the same NM had any output to commit.

fix uses java tmp dir in junit process to set the staging dir property; also adds assertions for size of success file count so the problem is picked up on the specific job, rather than have the follow on job fail with a -1.

Change-Id: Iab0ebdbf8c87518c4274eb2d9bf8468670b957d7
but: runs are now creating a tmp/ dir under hadoop-aws for staging files. This is wrong -we should be picking up the specific one for that test fork
@apache apache deleted a comment from hadoop-yetus Sep 30, 2019
* turn off all checks of the disk free space. Otherwise, as soon as you get
down to 40GB free, all minicluster tests fail.
Way back we fixed this for mini HDFS, I believe -but not here.
* create the test cluster on demand in the first setup() call which doesn't already have a cluster.
* use an explicit absolute staging directory for consistent access across all nodes in the mini cluster.

Change-Id: I41687100f68dd15625677d1ff7d1203854dc9061
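
The staging directory fix in the commit notes above comes down to resolving the path once, in the JUnit process, to an absolute location before handing it to the job. A minimal sketch of that idea, assuming a local `file:` path is acceptable because all mini cluster nodes run on the same host; `StagingDirs` and the `staging/<suite>` layout are hypothetical, and the configuration property the patch actually sets is not shown here.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper: builds an absolute staging location for a test suite. */
final class StagingDirs {

  private StagingDirs() {
  }

  /**
   * Resolve a per-suite staging directory under the JVM's temp dir and
   * return it as an absolute file: URI, so every process which reads the
   * value sees the same location regardless of its working directory.
   */
  static String absoluteStagingUri(String suiteName) {
    Path tmp = Paths.get(System.getProperty("java.io.tmpdir"));
    return tmp.resolve("staging")
        .resolve(suiteName)
        .toAbsolutePath()
        .normalize()
        .toUri()
        .toString();
  }
}
```

A relative value such as `tmp/staging` would instead be resolved against each process's own working directory, which is the kind of mismatch that leaves only some tasks with output to commit.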
@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Comment
0 reexec 2104 Docker mode activated.
_ Prechecks _
+1 dupname 0 No case conflicting files found.
+1 @author 0 The patch does not contain any @author tags.
+1 test4tests 0 The patch appears to include 13 new or modified test files.
_ trunk Compile Tests _
+1 mvninstall 1117 trunk passed
+1 compile 35 trunk passed
+1 checkstyle 25 trunk passed
+1 mvnsite 39 trunk passed
+1 shadedclient 752 branch has no errors when building and testing our client artifacts.
+1 javadoc 28 trunk passed
0 spotbugs 60 Used deprecated FindBugs config; considering switching to SpotBugs.
+1 findbugs 58 trunk passed
_ Patch Compile Tests _
+1 mvninstall 33 the patch passed
+1 compile 30 the patch passed
+1 javac 30 the patch passed
-0 checkstyle 21 hadoop-tools/hadoop-aws: The patch generated 7 new + 13 unchanged - 1 fixed = 20 total (was 14)
+1 mvnsite 33 the patch passed
+1 whitespace 0 The patch has no whitespace issues.
+1 xml 1 The patch has no ill-formed XML file.
+1 shadedclient 758 patch has no errors when building and testing our client artifacts.
+1 javadoc 25 the patch passed
+1 findbugs 65 the patch passed
_ Other Tests _
+1 unit 84 hadoop-aws in the patch passed.
+1 asflicense 31 The patch does not generate ASF License warnings.
5322
Subsystem Report/Notes
Docker Client=19.03.2 Server=19.03.2 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1115/17/artifact/out/Dockerfile
GITHUB PR #1115
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml findbugs checkstyle
uname Linux 4f7a2f76c87f 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 98ca07e
Default Java 1.8.0_222
checkstyle https://builds.apache.org/job/hadoop-multibranch/job/PR-1115/17/artifact/out/diff-checkstyle-hadoop-tools_hadoop-aws.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-1115/17/testReport/
Max. process+thread count 411 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-1115/17/console
versions git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.10.0 http://yetus.apache.org

This message was automatically generated.

…on to the terasort tests.

This is mainly done by changes to the shared superclass and making AbstractCommitTerasortIT
non abstract and parameterized by committer name.

Change-Id: Id48b5bf3a16ba693ad7bc15836e30325855bd0dc
@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Comment
0 reexec 79 Docker mode activated.
_ Prechecks _
+1 dupname 1 No case conflicting files found.
+1 @author 0 The patch does not contain any @author tags.
+1 test4tests 0 The patch appears to include 13 new or modified test files.
_ trunk Compile Tests _
+1 mvninstall 1248 trunk passed
+1 compile 38 trunk passed
+1 checkstyle 27 trunk passed
+1 mvnsite 43 trunk passed
+1 shadedclient 949 branch has no errors when building and testing our client artifacts.
+1 javadoc 30 trunk passed
0 spotbugs 66 Used deprecated FindBugs config; considering switching to SpotBugs.
+1 findbugs 64 trunk passed
-0 patch 88 Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.
_ Patch Compile Tests _
+1 mvninstall 33 the patch passed
+1 compile 34 the patch passed
-1 javac 34 hadoop-tools_hadoop-aws generated 1 new + 15 unchanged - 1 fixed = 16 total (was 16)
-0 checkstyle 19 hadoop-tools/hadoop-aws: The patch generated 8 new + 10 unchanged - 4 fixed = 18 total (was 14)
+1 mvnsite 30 the patch passed
+1 whitespace 0 The patch has no whitespace issues.
+1 xml 2 The patch has no ill-formed XML file.
+1 shadedclient 933 patch has no errors when building and testing our client artifacts.
+1 javadoc 33 the patch passed
+1 findbugs 71 the patch passed
_ Other Tests _
+1 unit 78 hadoop-aws in the patch passed.
+1 asflicense 36 The patch does not generate ASF License warnings.
3843
Subsystem Report/Notes
Docker Client=19.03.0 Server=19.03.0 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1115/18/artifact/out/Dockerfile
GITHUB PR #1115
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml findbugs checkstyle
uname Linux 2ce16fddaa25 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 4d3c580
Default Java 1.8.0_222
javac https://builds.apache.org/job/hadoop-multibranch/job/PR-1115/18/artifact/out/diff-compile-javac-hadoop-tools_hadoop-aws.txt
checkstyle https://builds.apache.org/job/hadoop-multibranch/job/PR-1115/18/artifact/out/diff-checkstyle-hadoop-tools_hadoop-aws.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-1115/18/testReport/
Max. process+thread count 330 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-1115/18/console
versions git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.10.0 http://yetus.apache.org

This message was automatically generated.

@apache apache deleted a comment from hadoop-yetus Oct 1, 2019
pattern in the POM

Change-Id: I999240c2139f0f12ca53e9425faf9c9b97ba2ca1
@steveloughran
Contributor Author

Full scale test:

[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  24:31 min (Wall Clock)
[INFO] Finished at: 2019-10-01T14:10:14+01:00
[INFO] ------------------------------------------------------------------------

@steveloughran
Contributor Author

@sidseth @bgaborg This patch is ready to look at; shaves time off scale runs (24 min for me with -Dparallel-tests -DtestsThreadCount=12 -Ds3guard -Ddynamo -Dauth -Dscale).

@apache apache deleted a comment from hadoop-yetus Oct 1, 2019
@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Comment
0 reexec 2069 Docker mode activated.
_ Prechecks _
+1 dupname 0 No case conflicting files found.
+1 @author 0 The patch does not contain any @author tags.
+1 test4tests 0 The patch appears to include 13 new or modified test files.
_ trunk Compile Tests _
+1 mvninstall 1214 trunk passed
+1 compile 33 trunk passed
+1 checkstyle 24 trunk passed
+1 mvnsite 37 trunk passed
+1 shadedclient 869 branch has no errors when building and testing our client artifacts.
+1 javadoc 25 trunk passed
0 spotbugs 58 Used deprecated FindBugs config; considering switching to SpotBugs.
+1 findbugs 56 trunk passed
-0 patch 78 Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.
_ Patch Compile Tests _
+1 mvninstall 34 the patch passed
+1 compile 26 the patch passed
-1 javac 26 hadoop-tools_hadoop-aws generated 1 new + 15 unchanged - 1 fixed = 16 total (was 16)
-0 checkstyle 18 hadoop-tools/hadoop-aws: The patch generated 7 new + 10 unchanged - 4 fixed = 17 total (was 14)
+1 mvnsite 30 the patch passed
+1 whitespace 0 The patch has no whitespace issues.
+1 xml 1 The patch has no ill-formed XML file.
+1 shadedclient 864 patch has no errors when building and testing our client artifacts.
+1 javadoc 23 the patch passed
+1 findbugs 62 the patch passed
_ Other Tests _
+1 unit 72 hadoop-aws in the patch passed.
+1 asflicense 30 The patch does not generate ASF License warnings.
5565
Subsystem Report/Notes
Docker Client=19.03.2 Server=19.03.2 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1115/19/artifact/out/Dockerfile
GITHUB PR #1115
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml findbugs checkstyle
uname Linux 0aeb7fc0b186 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 425a6c8
Default Java 1.8.0_222
javac https://builds.apache.org/job/hadoop-multibranch/job/PR-1115/19/artifact/out/diff-compile-javac-hadoop-tools_hadoop-aws.txt
checkstyle https://builds.apache.org/job/hadoop-multibranch/job/PR-1115/19/artifact/out/diff-checkstyle-hadoop-tools_hadoop-aws.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-1115/19/testReport/
Max. process+thread count 340 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-1115/19/console
versions git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.10.0 http://yetus.apache.org

This message was automatically generated.

This isn't to directly debug problems in this commit, but it lines up the
test runs to collect more data needed to troubleshoot intermittent test failures.

Change-Id: I8f4c7258b62a45a7a1c376e0315b429dde08117c
@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Comment
0 reexec 76 Docker mode activated.
_ Prechecks _
+1 dupname 0 No case conflicting files found.
+1 @author 0 The patch does not contain any @author tags.
+1 test4tests 0 The patch appears to include 14 new or modified test files.
_ trunk Compile Tests _
+1 mvninstall 1225 trunk passed
+1 compile 31 trunk passed
+1 checkstyle 23 trunk passed
+1 mvnsite 37 trunk passed
+1 shadedclient 862 branch has no errors when building and testing our client artifacts.
+1 javadoc 25 trunk passed
0 spotbugs 57 Used deprecated FindBugs config; considering switching to SpotBugs.
+1 findbugs 54 trunk passed
-0 patch 77 Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.
_ Patch Compile Tests _
+1 mvninstall 32 the patch passed
+1 compile 27 the patch passed
-1 javac 27 hadoop-tools_hadoop-aws generated 1 new + 15 unchanged - 1 fixed = 16 total (was 16)
-0 checkstyle 17 hadoop-tools/hadoop-aws: The patch generated 7 new + 10 unchanged - 4 fixed = 17 total (was 14)
+1 mvnsite 31 the patch passed
+1 whitespace 0 The patch has no whitespace issues.
+1 xml 1 The patch has no ill-formed XML file.
+1 shadedclient 861 patch has no errors when building and testing our client artifacts.
+1 javadoc 23 the patch passed
+1 findbugs 61 the patch passed
_ Other Tests _
+1 unit 71 hadoop-aws in the patch passed.
+1 asflicense 29 The patch does not generate ASF License warnings.
3582
Subsystem Report/Notes
Docker Client=19.03.0 Server=19.03.0 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1115/20/artifact/out/Dockerfile
GITHUB PR #1115
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml findbugs checkstyle
uname Linux 3038f48da64b 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 99cd757
Default Java 1.8.0_222
javac https://builds.apache.org/job/hadoop-multibranch/job/PR-1115/20/artifact/out/diff-compile-javac-hadoop-tools_hadoop-aws.txt
checkstyle https://builds.apache.org/job/hadoop-multibranch/job/PR-1115/20/artifact/out/diff-checkstyle-hadoop-tools_hadoop-aws.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-1115/20/testReport/
Max. process+thread count 319 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-1115/20/console
versions git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.10.0 http://yetus.apache.org

This message was automatically generated.

Contributor

@sidseth sidseth left a comment

The title/commit message changes to 'Speed up MiniCluster jobs'?

#log4j.logger.org.apache.hadoop.fs.s3a.s3guard=DEBUG
# if set to debug, this will log the PUT/DELETE operations on a store
#log4j.logger.org.apache.hadoop.fs.s3a.s3guard.Operations=DEBUG
log4j.logger.org.apache.hadoop.fs.s3a.s3guard.Operations=DEBUG
Contributor

Intentional? or left over from debugging.

public void setup() throws Exception {
super.setup();
requireScaleTestsEnabled();
prepareToTerasort();
Contributor

In prepareToTerasort - the config is modified via getYarn().getConfig.set() ...
This is going to change the config for every test that makes use of this cluster. Likely better to create a new Configuration instance for the Job being submitted, and setting these config parameters at the job level.
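
A minimal sketch of the suggestion above, using a plain `Map` as a stand-in for the Hadoop configuration so that it runs without Hadoop on the classpath. `JobConfigs` and the keys used with it are hypothetical; in real test code the equivalent step would presumably be the `Configuration` copy constructor, `new Configuration(clusterConf)`, followed by `set()` calls on the copy.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: derives a per-job config without touching the shared one. */
final class JobConfigs {

  private JobConfigs() {
  }

  /**
   * Copy the shared cluster settings, then apply job-specific overrides
   * to the copy only. The cluster-wide map is never modified, so other
   * tests using the same cluster do not see these options.
   */
  static Map<String, String> forJob(Map<String, String> clusterConfig,
      Map<String, String> jobOverrides) {
    Map<String, String> job = new HashMap<>(clusterConfig);
    job.putAll(jobOverrides);
    return job;
  }
}
```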

Contributor Author

good point -I'll clone

* suite.
*/
@SuppressWarnings("StaticNonFinalField")
private static ClusterBinding clusterBinding;
Contributor

I'm not sure how well this is going to work, given there's at least 2 tests which are inheriting from this class.

Either could end up creating the cluster, and then terminating the cluster in '@AfterClass' without knowing what the other test is doing - and I think this will lead to failures depending on the timing of executing ITestTerasortOnS3A and ITestS3ACommitterMRJob.

These tests likely need their own clusters, and teardown methods. Or maybe the parameter can be made non-static, so that individual test classes don't mess with each other. (They get their own clusters as a result). Afaik, the individual test classes will still be able to share clusters.

Contributor Author

@steveloughran steveloughran Oct 2, 2019

I think I will need to look a bit closer at the JUnit runner and when the after-class methods are called. Before I did this cleanup I did actually have the bindings in each subclass, but at least in the test run this week it did actually work with everything in the superclass.

If I'm mistaken in my assumptions, I can just put the cleanup @AfterClass into each of the subclasses, and call out in the javadocs that you have to do this.

Contributor Author

JUnit @AfterClass javadocs:

 The @AfterClass methods declared in superclasses will be run after those of the current
 class, unless they are shadowed in the current class.

Nothing to worry about then; good to remember that ordering too. BeforeClass and Before are the opposite: superclasses run before the subclasses. What isn't documented is the ordering of before/after methods within the same class...I wouldn't rely on anything there.

Contributor

I think I understand the concern I raised a little better now.

Will the two derived tests ever run in parallel on the same VM?

  • If so, there's a problem with the static cluster being in the base class - since the cluster will be shared and shut down when either of the tests complete.

If parallel tests will only use separate VMs - we should be OK.

If a single VM is used to run the tests serially - the "clusterBinding=null" is an important change, which I think you've already made.
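
The serial, single-VM case can be sketched as follows: a static binding created on demand and reset to null on teardown, so the next suite in the same JVM builds a fresh cluster instead of reusing a stopped one. This is an illustration under assumptions, not the patch's `ClusterBinding`; `SharedClusterHolder` and `FakeCluster` are hypothetical stand-ins.

```java
/** Hypothetical holder for a cluster shared by the suites running in one JVM. */
final class SharedClusterHolder {

  /** Stand-in for the real mini cluster. */
  static final class FakeCluster {
    private boolean running = true;

    void stop() {
      running = false;
    }

    boolean isRunning() {
      return running;
    }
  }

  private static FakeCluster cluster;

  private SharedClusterHolder() {
  }

  /** Create the cluster on first use; later callers share the same instance. */
  static synchronized FakeCluster getOrCreate() {
    if (cluster == null) {
      cluster = new FakeCluster();
    }
    return cluster;
  }

  /**
   * Stop the cluster and clear the reference. Without the reset, a later
   * suite in the same JVM would be handed a cluster which is already stopped.
   */
  static synchronized void teardown() {
    if (cluster != null) {
      cluster.stop();
      cluster = null;
    }
  }
}
```

This does nothing for suites running in parallel inside one JVM: either one could still call `teardown()` while the other is mid-run, which is the case raised above.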

Contributor Author

Now that we have parameterized tests there will never be >1 CommitMRJob test running in parallel. That was a problem we had before, where you could have multiple ITest*CommitMRJob test runs spawning so many processes that your dev box comes to a halt if you do a many-threaded run. With these settings the delegation token test runs can overlap with the CommitMRJob sequence; I plan to follow up this patch with parameterization there too.

As all the tests have opted not to use a mini HDFS cluster we avoid the extra overhead of those processes too

* JUnit runs these test suites one parameterized binding at a time.
* </li>
* <li>
* The test suites are declared to be executed in ascending order, so
Contributor

A little confused by the intent of this flow - i.e. test_000, test_100, test_200, test_500 in order.

  1. What happens if test_000 fails - will the rest of the tests still be run for the specific parameter?
  2. The names - test_000, test_100 etc don't say a lot about what is actually being done in the methods.

It may be easier to have a single test method - which takes care of setup, validation, etc - and allow override points within that for the individual committers. That would lead to a single test failing and not attempting to run the others after a previous dependent failure.

Also, mentioned making the cluster non-static in another comment. Given parameterized tests instantiate a new instance of the class each time - maybe just the '@AfterClass' teardown and cluster setup needs to move to this class (still as static).

Contributor Author

test suite is tagged as @FixMethodOrder(MethodSorters.NAME_ASCENDING); they run in sequence. naming convention is as used elsewhere.

As to why the gaps between numbers: lets us put new values in without
renaming things. You never coded in Basic or APL?

Having separate tests

  • isolates test method timeouts
  • lets you debug individual operations.
  • makes it clear which operation failed.

Yes, the state of one is dependent on the others succeeding, which is why they try to have checks for the state of the store before they run.

Contributor Author

thinking: we could set markers on tests as they complete and so skip the successors. I'll do that

Contributor

Not concerned about the gap in numbering at all.

I understand the ordering enforced by the annotation. The terasort test has this as well - and it makes sense to me there, at least to some extent, since each test is doing something independently, which can be verified (e.g. run teragen, then terasort, then teravalidate).

In this case - some of the methods are just performing setup operations for the actual test.

This could be structured as.

@Test
public void testMRJob() throws Exception {
   committerTestBinding.test_000();
   committerTestBinding.test_100();
   // .. .. ..
}

That would make sure exactly one test runs, and fails the moment there's an error from any of the per committer bindings. Is also easier to debug I think - for someone new, if they see a failure on test_000, test_100, test_500 etc - they would not try to figure out each of the failures independently. Will also have context on the failures via the methodName + parameter.

IAC, that's my take on this. Have a preference for a single test approach, but not tied to it. Your plan to skip any successors solves the main concern as well.

Contributor Author

I am satisfied with what we have here -it is consistent with all the other large sequential test runs

Contributor

Sounds good to me. The change that you mentioned about not running tests if a previous test fails - if that's already in, I'm +1 for the patch.

* track stages executed and use an assume() to skip subsequent tests on a failure.
  Note: this makes it impossible to execute a single stage in the debugger.
* Terasort options are set in `applyCustomConfigOptions` for the jobconf
* use unique paths for the different parameterised runs (I'd missed that)
* javadocs
* terasort size options (partitions, rows, ...) set in constants.

Change-Id: I42a4ce46562641e8d942a81460a67eb01acf87aa
@steveloughran
Contributor Author

Here is a new revision. Tests in progress.

  • track stages executed and use an assume() to skip subsequent tests on a failure. Note: this makes it impossible to execute a single stage in the debugger.
  • Terasort options are set in applyCustomConfigOptions for the jobconf
  • use unique paths for the different parameterised runs (I'd missed that)
  • javadocs
  • terasort size options (partitions, rows, ...) set in constants.
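
The stage tracking in the first bullet can be sketched roughly as below. This is a plain-Java illustration under assumptions, not the patch itself: `StageTracker` and `StageSkippedException` are hypothetical, and in a JUnit 4 suite the skip would presumably be raised with `Assume.assumeTrue(message, condition)` rather than a custom exception. The tracker would need to be held in a static field, since JUnit creates a new instance of the test class for each test method.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical tracker of which ordered test stages have completed. */
final class StageTracker {

  /** Stand-in for the exception a failed JUnit assumption would raise. */
  static final class StageSkippedException extends RuntimeException {
    StageSkippedException(String message) {
      super(message);
    }
  }

  private final Set<String> completed = new LinkedHashSet<>();

  /** Called at the start of a stage: skip unless the predecessor finished. */
  void requireCompleted(String stage) {
    if (!completed.contains(stage)) {
      throw new StageSkippedException("Skipping: stage " + stage
          + " did not complete");
    }
  }

  /** Called as the last statement of a stage, so a failure leaves no marker. */
  void markCompleted(String stage) {
    completed.add(stage);
  }

  boolean isCompleted(String stage) {
    return completed.contains(stage);
  }
}
```

Because each stage demands its predecessor's marker, a stage launched on its own is skipped as well, which matches the note above about not being able to execute a single stage in the debugger.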

@steveloughran
Contributor Author

Also removed all changes which went near the committer code itself, to avoid conflict with #1442. This is a test code only patch now

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Comment
0 reexec 77 Docker mode activated.
_ Prechecks _
+1 dupname 1 No case conflicting files found.
+1 @author 0 The patch does not contain any @author tags.
+1 test4tests 0 The patch appears to include 14 new or modified test files.
_ trunk Compile Tests _
+1 mvninstall 1212 trunk passed
+1 compile 32 trunk passed
+1 checkstyle 23 trunk passed
+1 mvnsite 34 trunk passed
+1 shadedclient 909 branch has no errors when building and testing our client artifacts.
+1 javadoc 30 trunk passed
0 spotbugs 72 Used deprecated FindBugs config; considering switching to SpotBugs.
+1 findbugs 67 trunk passed
-0 patch 95 Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.
_ Patch Compile Tests _
+1 mvninstall 37 the patch passed
+1 compile 32 the patch passed
-1 javac 32 hadoop-tools_hadoop-aws generated 1 new + 15 unchanged - 1 fixed = 16 total (was 16)
-0 checkstyle 23 hadoop-tools/hadoop-aws: The patch generated 7 new + 9 unchanged - 5 fixed = 16 total (was 14)
+1 mvnsite 37 the patch passed
+1 whitespace 0 The patch has no whitespace issues.
+1 xml 2 The patch has no ill-formed XML file.
+1 shadedclient 934 patch has no errors when building and testing our client artifacts.
+1 javadoc 26 the patch passed
+1 findbugs 72 the patch passed
_ Other Tests _
+1 unit 98 hadoop-aws in the patch passed.
+1 asflicense 34 The patch does not generate ASF License warnings.
3801
Subsystem Report/Notes
Docker Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1115/21/artifact/out/Dockerfile
GITHUB PR #1115
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml findbugs checkstyle
uname Linux e7816060857d 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 10:55:24 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 61a8436
Default Java 1.8.0_222
javac https://builds.apache.org/job/hadoop-multibranch/job/PR-1115/21/artifact/out/diff-compile-javac-hadoop-tools_hadoop-aws.txt
checkstyle https://builds.apache.org/job/hadoop-multibranch/job/PR-1115/21/artifact/out/diff-checkstyle-hadoop-tools_hadoop-aws.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-1115/21/testReport/
Max. process+thread count 306 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-1115/21/console
versions git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.10.0 http://yetus.apache.org

This message was automatically generated.

@sidseth
Contributor

sidseth commented Oct 4, 2019

The change that you mentioned about not running tests if a previous test fails - if that's already in, I'm +1 for the patch.

@steveloughran
Contributor Author

thanks -merged in!

@steveloughran steveloughran deleted the s3/HADOOP-16207-testMR branch October 15, 2021 19:47