HDDS-1649. On installSnapshot notification from OM leader, download checkpoint and reload OM state #948

hanishakoneru · 2019-06-11T21:21:51Z

Unit tests pending.

arp7 · 2019-06-13T22:16:30Z

...e/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerStateMachine.java

We should not execute this in the default ForkJoinPool. That can suffer from thread exhaustion/deadlock issues since there are very few threads in the default pool.

Instead use the overload of supplyAsync that accepts an Executor.
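
A minimal sketch of that suggestion, assuming a dedicated single-thread executor owned by the state machine. The class name `InstallSnapshotSketch`, the field `installSnapshotExecutor`, and the helper `installCheckpoint` are hypothetical and are not taken from the patch; only `CompletableFuture.supplyAsync(Supplier, Executor)` is the real JDK overload being referred to.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch; class and method names are illustrative only.
class InstallSnapshotSketch {

  // Dedicated single-thread executor, so the long-running install does not
  // occupy a worker of ForkJoinPool.commonPool().
  private final ExecutorService installSnapshotExecutor =
      Executors.newSingleThreadExecutor(runnable -> {
        Thread t = new Thread(runnable, "install-snapshot");
        t.setDaemon(true);
        return t;
      });

  // Stand-in for "download the checkpoint and reload state".
  private String installCheckpoint(long snapshotIndex) {
    return Thread.currentThread().getName() + ":" + snapshotIndex;
  }

  CompletableFuture<String> notifyInstallSnapshot(long snapshotIndex) {
    // The two-argument overload runs the supplier on the given executor
    // instead of the common pool.
    return CompletableFuture.supplyAsync(
        () -> installCheckpoint(snapshotIndex), installSnapshotExecutor);
  }

  // Release the executor's thread when the owner stops.
  void stop() {
    installSnapshotExecutor.shutdown();
  }
}
```

With this shape the executor has to be shut down when its owner stops, otherwise its thread outlives the component.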

arp7 · 2019-06-13T22:22:23Z

hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java

One question: currently we are passing leaderId = null. So will this call to getOzoneManagerDBSnapshot fail?

This was temporary till RATIS-564. Fixed now.

arp7 · 2019-06-13T22:55:37Z

hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java

One risk is that this code path for stop may not be well tested.

Yes, also the correct way would be to pause the Ratis server and not stop it. Need a Ratis patch to support pause.

arp7 · 2019-06-13T22:56:37Z

hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java

omDoubleBuffer.stop interrupts the thread but does not call join. So the doubleBuffer thread may still be running when the stop call returns.

We should probably fix stop to call join.
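
A minimal sketch of the interrupt-then-join pattern, using a hypothetical `FlushWorkerSketch` class as a stand-in for the double buffer thread. It is not the actual `omDoubleBuffer` code; the sleep only simulates flush work.

```java
// Hypothetical stand-in for a background flush thread; not the real OM double buffer.
class FlushWorkerSketch {
  private final Thread worker = new Thread(this::runLoop, "flush-worker");

  private void runLoop() {
    while (!Thread.currentThread().isInterrupted()) {
      try {
        Thread.sleep(50); // stand-in for waiting on and flushing a batch
      } catch (InterruptedException e) {
        return; // interrupted by stop(): exit the loop
      }
    }
  }

  void start() {
    worker.setDaemon(true);
    worker.start();
  }

  // Interrupt the worker, then wait until it has actually exited.
  void stop() {
    worker.interrupt();
    try {
      worker.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // keep the caller's interrupt status
    }
  }

  boolean isWorkerAlive() {
    return worker.isAlive();
  }
}
```

Because `stop()` joins, a caller that replaces the underlying DB right after `stop()` returns cannot race with a flush that is still in progress.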

arp7 · 2019-06-13T22:57:30Z

hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java

This is going to be a directory, correct? I assume a DB consists of multiple files.

Yes. Updated variable names.

arp7 · 2019-06-13T22:58:23Z

hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java

I didn't understand this TODO. Could you clarify a bit more?

The InstallSnapshot notification response requires a TermIndex, but we only have the install snapshot index. The term is irrelevant here, so we return a dummy term of 0.

arp7 · 2019-06-13T23:00:10Z

hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java

Can some of this code be shared with startup initialization by moving to a common function?

hadoop-yetus · 2019-07-17T23:21:34Z

💔 -1 overall

Vote	Subsystem	Runtime	Comment
0	reexec	37	Docker mode activated.
		_ Prechecks _
+1	dupname	0	No case conflicting files found.
+1	@author	0	The patch does not contain any @author tags.
+1	test4tests	0	The patch appears to include 4 new or modified test files.
		_ trunk Compile Tests _
0	mvndep	21	Maven dependency ordering for branch
+1	mvninstall	467	trunk passed
+1	compile	259	trunk passed
+1	checkstyle	73	trunk passed
+1	mvnsite	0	trunk passed
+1	shadedclient	866	branch has no errors when building and testing our client artifacts.
+1	javadoc	162	trunk passed
0	spotbugs	316	Used deprecated FindBugs config; considering switching to SpotBugs.
+1	findbugs	505	trunk passed
-0	patch	365	Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.
		_ Patch Compile Tests _
0	mvndep	35	Maven dependency ordering for patch
-1	mvninstall	138	hadoop-ozone in the patch failed.
-1	compile	56	hadoop-ozone in the patch failed.
-1	javac	56	hadoop-ozone in the patch failed.
-0	checkstyle	41	hadoop-ozone: The patch generated 5 new + 0 unchanged - 0 fixed = 5 total (was 0)
+1	mvnsite	0	the patch passed
+1	whitespace	0	The patch has no whitespace issues.
+1	xml	2	The patch has no ill-formed XML file.
+1	shadedclient	680	patch has no errors when building and testing our client artifacts.
+1	javadoc	162	the patch passed
-1	findbugs	104	hadoop-ozone in the patch failed.
		_ Other Tests _
-1	unit	279	hadoop-hdds in the patch failed.
-1	unit	55	hadoop-ozone in the patch failed.
+1	asflicense	43	The patch does not generate ASF License warnings.
		4599

Reason	Tests
Failed junit tests	hadoop.hdds.scm.container.placement.algorithms.TestContainerPlacementFactory
	hadoop.ozone.om.exceptions.TestResultCodes

Subsystem	Report/Notes
Docker	Client=18.09.8 Server=18.09.8 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-948/2/artifact/out/Dockerfile
GITHUB PR	#948
Optional Tests	dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml
uname	Linux cee87f92baa9 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	personality/hadoop.sh
git revision	trunk / `5e6cc6f`
Default Java	1.8.0_212
mvninstall	https://builds.apache.org/job/hadoop-multibranch/job/PR-948/2/artifact/out/patch-mvninstall-hadoop-ozone.txt
compile	https://builds.apache.org/job/hadoop-multibranch/job/PR-948/2/artifact/out/patch-compile-hadoop-ozone.txt
javac	https://builds.apache.org/job/hadoop-multibranch/job/PR-948/2/artifact/out/patch-compile-hadoop-ozone.txt
checkstyle	https://builds.apache.org/job/hadoop-multibranch/job/PR-948/2/artifact/out/diff-checkstyle-hadoop-ozone.txt
findbugs	https://builds.apache.org/job/hadoop-multibranch/job/PR-948/2/artifact/out/patch-findbugs-hadoop-ozone.txt
unit	https://builds.apache.org/job/hadoop-multibranch/job/PR-948/2/artifact/out/patch-unit-hadoop-hdds.txt
unit	https://builds.apache.org/job/hadoop-multibranch/job/PR-948/2/artifact/out/patch-unit-hadoop-ozone.txt
Test Results	https://builds.apache.org/job/hadoop-multibranch/job/PR-948/2/testReport/
Max. process+thread count	535 (vs. ulimit of 5500)
modules	C: hadoop-hdds/common hadoop-ozone/common hadoop-ozone/integration-test hadoop-ozone/ozone-manager U: .
Console output	https://builds.apache.org/job/hadoop-multibranch/job/PR-948/2/console
versions	git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by	Apache Yetus 0.10.0 http://yetus.apache.org

This message was automatically generated.

arp7 · 2019-07-18T16:48:12Z

Reviewing this patch, @hanishakoneru can you resolve the merge conflict meanwhile?

arp7 · 2019-07-18T16:51:24Z

hadoop-hdds/common/src/main/resources/ozone-default.xml

Does this mean we will snapshot every 1024 transactions?

No, when a snapshot is being taken, if the gap between log purges is more than 1024, then it will purge the logs. Snapshot frequency is not dependent on this.

Let's set this to a higher value. We don't need to be too aggressive about purging Ratis logs.

1024 transactions is 100ms worth of edits in a busy cluster. We could set this as high as 1M maybe to keep more history. :)

Agree. Will update it.

arp7 · 2019-07-18T16:54:34Z

hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OMMetrics.java

Nice catch!

arp7 · 2019-07-18T17:09:57Z

hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java

When would they be null?

If ratis is not enabled. This is for the non-HA code path.

arp7 · 2019-07-18T17:12:25Z

hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java

Should we restore the original files?

arp7 · 2019-07-18T17:13:16Z

hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java

@bharatviswa504 can you take a look at this part to see if we need to instantiate anything else?

if (isAclEnabled) {
  accessAuthorizer = getACLAuthorizerInstance(conf);
  if (accessAuthorizer instanceof OzoneNativeAuthorizer) {
    OzoneNativeAuthorizer authorizer =
        (OzoneNativeAuthorizer) accessAuthorizer;
    authorizer.setVolumeManager(volumeManager);
    authorizer.setBucketManager(bucketManager);
    authorizer.setKeyManager(keyManager);
    authorizer.setPrefixManager(prefixManager);
  }
} else {
  accessAuthorizer = null;
}

We should do this part also during reloadOMState.

arp7 · 2019-07-18T17:15:13Z

...e/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerStateMachine.java

This TODO looks a little worrying. Something we need to address now?

Yes thanks for catching this. On startup, we should read the saved ratis snapshot index from disk. I will update the patch.

Correction: On reloading state, we should not read the saved snapshot index. Instead, we should update the snapshot index on disk.
During normal startup, we already read the saved snapshot index.

bharatviswa504 · 2019-07-18T17:15:34Z

Question:
ShouldInstallSnapshot calls getLatestSnapshot() from stateMachineStorage. Since we have our own snapshot implementation in the state machine, do we need to override that method to provide the correct snapshotInfo? Or could you explain how this works?

bharatviswa504 · 2019-07-18T00:02:00Z

...op-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestOMRatisSnapshots.java

Minor: MiniOzoneCluster

bharatviswa504 · 2019-07-18T16:55:27Z

...e/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerStateMachine.java

We set the LifeCycle state here, but I don't see how this pauses the state machine, since this state is not used anywhere else except during initialize and unpause.

It is taken care of internally by Ratis. The StateMachineUpdater in Ratis checks if the state is RUNNING before applying log entries to StateMachine.
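
The idea of gating log application on a lifecycle state can be sketched roughly as follows. This is a simplified, hypothetical model (the class `PausableStateMachineSketch` and its methods are invented for illustration); the real Ratis updater and lifecycle classes are more involved.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical model; the real Ratis updater is more involved.
class PausableStateMachineSketch {
  enum State { RUNNING, PAUSED }

  private volatile State state = State.RUNNING;
  private final List<String> applied = new ArrayList<>();

  void pause() {
    state = State.PAUSED;
  }

  void unpause() {
    state = State.RUNNING;
  }

  // Called by the (hypothetical) updater loop for each committed entry.
  // Returns false, leaving the entry unapplied, unless the state is RUNNING.
  synchronized boolean applyIfRunning(String entry) {
    if (state != State.RUNNING) {
      return false;
    }
    applied.add(entry);
    return true;
  }

  synchronized int appliedCount() {
    return applied.size();
  }
}
```

In this model, setting the state is all the state machine has to do; the check lives on the caller's side, which matches the explanation that the pause is enforced by the updater rather than by the state machine itself.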

arp7 · 2019-07-18T17:18:54Z

I am mostly +1 on this change. Couple of minor comments and one thing I requested Bharat to double check.

bharatviswa504 · 2019-07-18T17:42:04Z

...e/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerStateMachine.java

Shutdown of this executor needs to be done in StateMachine stop.

bharatviswa504 · 2019-07-18T17:51:07Z

hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java

If checkpointSnapshotIndex <= lastAppliedIndex, I think we need to clean up the downloaded DB checkpoint here.
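
A rough sketch of that cleanup, assuming the downloaded checkpoint is a directory on local disk. The class `CheckpointInstallSketch` and both method names are hypothetical; only the comparison between `checkpointSnapshotIndex` and `lastAppliedIndex` comes from the comment above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical sketch; names other than the two indexes are assumptions.
class CheckpointInstallSketch {

  // Returns true if the checkpoint is ahead of the local state and should be
  // installed. Otherwise the stale download is deleted and false is returned.
  static boolean shouldInstall(Path checkpointDir,
      long checkpointSnapshotIndex, long lastAppliedIndex) {
    if (checkpointSnapshotIndex > lastAppliedIndex) {
      return true;
    }
    deleteRecursively(checkpointDir);
    return false;
  }

  // Deletes a directory tree, children before parents.
  static void deleteRecursively(Path dir) {
    if (!Files.exists(dir)) {
      return;
    }
    try (Stream<Path> paths = Files.walk(dir)) {
      paths.sorted(Comparator.reverseOrder()).forEach(p -> {
        try {
          Files.delete(p);
        } catch (IOException e) {
          throw new UncheckedIOException(e);
        }
      });
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```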

bharatviswa504 · 2019-07-18T18:04:26Z

hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java

Why do we need to delete this file here?

We should delete the old metrics file as the metrics are no longer valid. We should also save the new metrics. Updated the patch.

bharatviswa504 · 2019-07-18T18:18:48Z

hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java

Where is this being used?

This was added by mistake. Not sure how it got here. Removing it.

bharatviswa504 · 2019-07-18T18:21:49Z

hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java

If this succeeds, when can we delete the backup?
Maybe at the end of notifyInstallSnapshot.
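
One possible shape for the backup lifecycle, sketched under assumptions: the current DB directory is renamed to a sibling backup, the checkpoint is moved into place, and the backup is deleted only after the reload succeeds. If the reload fails, the original directory is restored. The class `DbSwapSketch`, the `.backup` suffix, and the `reload` callback are hypothetical and do not describe how the patch does it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical sketch; directory layout and method names are assumptions.
class DbSwapSketch {

  // Swaps checkpointDir into dbDir. Returns true on success (backup deleted),
  // false if the install failed and the original directory was restored.
  static boolean replaceDb(Path dbDir, Path checkpointDir, Runnable reload) {
    Path backup = dbDir.resolveSibling(dbDir.getFileName() + ".backup");
    try {
      Files.move(dbDir, backup); // keep the old DB until the new one works
      try {
        Files.move(checkpointDir, dbDir);
        reload.run(); // stand-in for reloading state from dbDir
      } catch (IOException | RuntimeException e) {
        deleteRecursively(dbDir); // discard the failed install
        Files.move(backup, dbDir); // restore the original files
        return false;
      }
      deleteRecursively(backup); // success: the backup is no longer needed
      return true;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Deletes a directory tree, children before parents.
  private static void deleteRecursively(Path dir) throws IOException {
    if (!Files.exists(dir)) {
      return;
    }
    try (Stream<Path> paths = Files.walk(dir)) {
      paths.sorted(Comparator.reverseOrder()).forEach(p -> {
        try {
          Files.delete(p);
        } catch (IOException e) {
          throw new UncheckedIOException(e);
        }
      });
    }
  }
}
```

The sketch relies on the backup and the DB directory being siblings on the same file system, so that moving a non-empty directory is a plain rename.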

bharatviswa504 · 2019-07-18T18:28:54Z

...op-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestOMRatisSnapshots.java

Just trying to understand: in your test, the log purge gap is 50 and the snapshot interval is 50, so old logs will be purged. When the inactive OM starts, Ratis will internally call notifyInstallSnapshot and do the reload automatically, right? Why is it done manually in this test; is it just to test the steps? If so, can we have the test let Ratis do it automatically? (I mean wait until it is done and then check the DB; not sure whether this is too much for an IT.)

Yes this was for the test only. The problem with testing end to end is that HttpGet does not work for unit tests. The CheckpointServlet (from Recon) which we use for downloading the checkpoint from leader uses HttpGet for the checkpoint transfer.

hanishakoneru · 2019-07-18T20:48:03Z

Question:
ShouldInstallSnapshot calls getLatestSnapshot() from stateMachineStorage. Since we have our own snapshot implementation in the state machine, do we need to override that method to provide the correct snapshotInfo? Or could you explain how this works?

We do not want the snapshots to be handled via Ratis. When a follower receives an installSnapshot notification, it sends the newly loaded DB's snapshot index back to the leader. The leader updates the follower's snapshot index through this.

But when Ratis server is starting up, it should be able to determine the latest snapshot index. Otherwise, all the logs will be replayed from the start. I will create a new Jira to address this. Thanks Bharat.

hadoop-yetus · 2019-07-18T23:35:57Z

💔 -1 overall

Vote	Subsystem	Runtime	Comment
0	reexec	70	Docker mode activated.
		_ Prechecks _
+1	dupname	0	No case conflicting files found.
+1	@author	0	The patch does not contain any @author tags.
+1	test4tests	0	The patch appears to include 4 new or modified test files.
		_ trunk Compile Tests _
0	mvndep	22	Maven dependency ordering for branch
+1	mvninstall	487	trunk passed
+1	compile	305	trunk passed
+1	checkstyle	74	trunk passed
+1	mvnsite	0	trunk passed
+1	shadedclient	965	branch has no errors when building and testing our client artifacts.
+1	javadoc	164	trunk passed
0	spotbugs	349	Used deprecated FindBugs config; considering switching to SpotBugs.
+1	findbugs	553	trunk passed
		_ Patch Compile Tests _
0	mvndep	32	Maven dependency ordering for patch
-1	mvninstall	160	hadoop-ozone in the patch failed.
-1	compile	62	hadoop-ozone in the patch failed.
-1	cc	62	hadoop-ozone in the patch failed.
-1	javac	62	hadoop-ozone in the patch failed.
+1	checkstyle	84	the patch passed
+1	mvnsite	0	the patch passed
+1	whitespace	0	The patch has no whitespace issues.
+1	xml	2	The patch has no ill-formed XML file.
+1	shadedclient	756	patch has no errors when building and testing our client artifacts.
+1	javadoc	180	the patch passed
-1	findbugs	108	hadoop-ozone in the patch failed.
		_ Other Tests _
+1	unit	339	hadoop-hdds in the patch passed.
-1	unit	118	hadoop-ozone in the patch failed.
+1	asflicense	38	The patch does not generate ASF License warnings.
		5149

Subsystem	Report/Notes
Docker	Client=18.09.7 Server=18.09.7 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-948/3/artifact/out/Dockerfile
GITHUB PR	#948
Optional Tests	dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml cc
uname	Linux 5fd557cb2f6c 4.15.0-52-generic #56-Ubuntu SMP Tue Jun 4 22:49:08 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	personality/hadoop.sh
git revision	trunk / `d5ef38b`
Default Java	1.8.0_212
mvninstall	https://builds.apache.org/job/hadoop-multibranch/job/PR-948/3/artifact/out/patch-mvninstall-hadoop-ozone.txt
compile	https://builds.apache.org/job/hadoop-multibranch/job/PR-948/3/artifact/out/patch-compile-hadoop-ozone.txt
cc	https://builds.apache.org/job/hadoop-multibranch/job/PR-948/3/artifact/out/patch-compile-hadoop-ozone.txt
javac	https://builds.apache.org/job/hadoop-multibranch/job/PR-948/3/artifact/out/patch-compile-hadoop-ozone.txt
findbugs	https://builds.apache.org/job/hadoop-multibranch/job/PR-948/3/artifact/out/patch-findbugs-hadoop-ozone.txt
unit	https://builds.apache.org/job/hadoop-multibranch/job/PR-948/3/artifact/out/patch-unit-hadoop-ozone.txt
Test Results	https://builds.apache.org/job/hadoop-multibranch/job/PR-948/3/testReport/
Max. process+thread count	411 (vs. ulimit of 5500)
modules	C: hadoop-hdds/common hadoop-ozone/common hadoop-ozone/integration-test hadoop-ozone/ozone-manager U: .
Console output	https://builds.apache.org/job/hadoop-multibranch/job/PR-948/3/console
versions	git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by	Apache Yetus 0.10.0 http://yetus.apache.org

hadoop-yetus · 2019-07-19T00:29:37Z

💔 -1 overall

Vote	Subsystem	Runtime	Comment
0	reexec	98	Docker mode activated.
		_ Prechecks _
+1	dupname	1	No case conflicting files found.
+1	@author	0	The patch does not contain any @author tags.
+1	test4tests	0	The patch appears to include 4 new or modified test files.
		_ trunk Compile Tests _
0	mvndep	24	Maven dependency ordering for branch
+1	mvninstall	527	trunk passed
+1	compile	269	trunk passed
+1	checkstyle	76	trunk passed
+1	mvnsite	0	trunk passed
+1	shadedclient	959	branch has no errors when building and testing our client artifacts.
+1	javadoc	182	trunk passed
0	spotbugs	375	Used deprecated FindBugs config; considering switching to SpotBugs.
+1	findbugs	606	trunk passed
		_ Patch Compile Tests _
0	mvndep	33	Maven dependency ordering for patch
+1	mvninstall	442	the patch passed
+1	compile	279	the patch passed
+1	cc	279	the patch passed
+1	javac	279	the patch passed
+1	checkstyle	85	the patch passed
+1	mvnsite	0	the patch passed
+1	whitespace	0	The patch has no whitespace issues.
+1	xml	1	The patch has no ill-formed XML file.
+1	shadedclient	764	patch has no errors when building and testing our client artifacts.
+1	javadoc	169	the patch passed
+1	findbugs	537	the patch passed
		_ Other Tests _
+1	unit	336	hadoop-hdds in the patch passed.
-1	unit	2175	hadoop-ozone in the patch failed.
+1	asflicense	55	The patch does not generate ASF License warnings.
		7789

Reason	Tests
Failed junit tests	hadoop.ozone.client.rpc.TestMultiBlockWritesWithDnFailures
	hadoop.ozone.client.rpc.TestOzoneClientRetriesOnException
	hadoop.ozone.client.rpc.TestSecureOzoneRpcClient
	hadoop.ozone.om.snapshot.TestOzoneManagerSnapshotProvider
	hadoop.ozone.client.rpc.TestFailureHandlingByClient
	hadoop.ozone.client.rpc.TestOzoneRpcClient
	hadoop.ozone.client.rpc.TestBlockOutputStreamWithFailures
	hadoop.ozone.client.rpc.TestOzoneAtRestEncryption
	hadoop.ozone.client.rpc.TestCloseContainerHandlingByClient
	hadoop.ozone.TestSecureOzoneCluster
	hadoop.ozone.container.server.TestSecureContainerServer
	hadoop.ozone.client.rpc.Test2WayCommitInRatis
	hadoop.ozone.container.ozoneimpl.TestSecureOzoneContainer
	hadoop.ozone.client.rpc.TestOzoneRpcClientWithRatis
	hadoop.hdds.scm.pipeline.TestRatisPipelineCreateAndDestory

Subsystem	Report/Notes
Docker	Client=18.09.7 Server=18.09.7 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-948/4/artifact/out/Dockerfile
GITHUB PR	#948
Optional Tests	dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml cc
uname	Linux f2938c9c233e 4.15.0-52-generic #56-Ubuntu SMP Tue Jun 4 22:49:08 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	personality/hadoop.sh
git revision	trunk / `d545f9c`
Default Java	1.8.0_212
unit	https://builds.apache.org/job/hadoop-multibranch/job/PR-948/4/artifact/out/patch-unit-hadoop-ozone.txt
Test Results	https://builds.apache.org/job/hadoop-multibranch/job/PR-948/4/testReport/
Max. process+thread count	4968 (vs. ulimit of 5500)
modules	C: hadoop-hdds/common hadoop-ozone/common hadoop-ozone/integration-test hadoop-ozone/ozone-manager U: .
Console output	https://builds.apache.org/job/hadoop-multibranch/job/PR-948/4/console
versions	git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by	Apache Yetus 0.10.0 http://yetus.apache.org

hadoop-yetus · 2019-07-19T01:23:54Z

💔 -1 overall

Vote	Subsystem	Runtime	Comment
0	reexec	38	Docker mode activated.
		_ Prechecks _
+1	dupname	1	No case conflicting files found.
+1	@author	0	The patch does not contain any @author tags.
+1	test4tests	0	The patch appears to include 4 new or modified test files.
		_ trunk Compile Tests _
0	mvndep	66	Maven dependency ordering for branch
+1	mvninstall	482	trunk passed
+1	compile	265	trunk passed
+1	checkstyle	77	trunk passed
+1	mvnsite	0	trunk passed
+1	shadedclient	865	branch has no errors when building and testing our client artifacts.
+1	javadoc	187	trunk passed
0	spotbugs	316	Used deprecated FindBugs config; considering switching to SpotBugs.
+1	findbugs	509	trunk passed
		_ Patch Compile Tests _
0	mvndep	74	Maven dependency ordering for patch
+1	mvninstall	435	the patch passed
+1	compile	271	the patch passed
+1	cc	271	the patch passed
+1	javac	271	the patch passed
+1	checkstyle	73	the patch passed
+1	mvnsite	0	the patch passed
+1	whitespace	0	The patch has no whitespace issues.
+1	xml	1	The patch has no ill-formed XML file.
+1	shadedclient	648	patch has no errors when building and testing our client artifacts.
+1	javadoc	159	the patch passed
+1	findbugs	536	the patch passed
		_ Other Tests _
+1	unit	277	hadoop-hdds in the patch passed.
-1	unit	1639	hadoop-ozone in the patch failed.
+1	asflicense	50	The patch does not generate ASF License warnings.
		6814

Reason	Tests
Failed junit tests	hadoop.ozone.TestSecureOzoneCluster
	hadoop.ozone.om.snapshot.TestOzoneManagerSnapshotProvider
	hadoop.ozone.client.rpc.TestOzoneRpcClientWithRatis
	hadoop.ozone.client.rpc.TestOzoneAtRestEncryption
	hadoop.ozone.client.rpc.TestCloseContainerHandlingByClient
	hadoop.ozone.container.ozoneimpl.TestSecureOzoneContainer
	hadoop.ozone.client.rpc.TestFailureHandlingByClient
	hadoop.ozone.client.rpc.TestOzoneRpcClient
	hadoop.ozone.client.rpc.TestSecureOzoneRpcClient
	hadoop.ozone.client.rpc.TestMultiBlockWritesWithDnFailures
	hadoop.ozone.container.server.TestSecureContainerServer

Subsystem	Report/Notes
Docker	Client=18.09.8 Server=18.09.8 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-948/5/artifact/out/Dockerfile
GITHUB PR	#948
Optional Tests	dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml cc
uname	Linux 0196a0d19d2c 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	personality/hadoop.sh
git revision	trunk / `d545f9c`
Default Java	1.8.0_212
unit	https://builds.apache.org/job/hadoop-multibranch/job/PR-948/5/artifact/out/patch-unit-hadoop-ozone.txt
Test Results	https://builds.apache.org/job/hadoop-multibranch/job/PR-948/5/testReport/
Max. process+thread count	5388 (vs. ulimit of 5500)
modules	C: hadoop-hdds/common hadoop-ozone/common hadoop-ozone/integration-test hadoop-ozone/ozone-manager U: .
Console output	https://builds.apache.org/job/hadoop-multibranch/job/PR-948/5/console
versions	git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by	Apache Yetus 0.10.0 http://yetus.apache.org

arp7 · 2019-07-19T02:16:25Z

/retest

anuengineer · 2019-07-19T02:44:21Z

hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/OMConfigKeys.java

Can we please use the new format for configs? Here are some examples: https://cwiki.apache.org/confluence/display/HADOOP/Java-based+configuration+API

Good suggestion! Let me file a followup jira to fix that. Want to get this patch committed today, it's been hanging around for over a month.

Filed HDDS-1831.

hadoop-yetus · 2019-07-19T14:39:43Z

💔 -1 overall

Vote	Subsystem	Runtime	Comment
0	reexec	83	Docker mode activated.
		_ Prechecks _
+1	dupname	1	No case conflicting files found.
+1	@author	0	The patch does not contain any @author tags.
+1	test4tests	0	The patch appears to include 4 new or modified test files.
		_ trunk Compile Tests _
0	mvndep	23	Maven dependency ordering for branch
+1	mvninstall	524	trunk passed
+1	compile	254	trunk passed
+1	checkstyle	66	trunk passed
+1	mvnsite	0	trunk passed
+1	shadedclient	837	branch has no errors when building and testing our client artifacts.
+1	javadoc	149	trunk passed
0	spotbugs	326	Used deprecated FindBugs config; considering switching to SpotBugs.
+1	findbugs	522	trunk passed
		_ Patch Compile Tests _
0	mvndep	31	Maven dependency ordering for patch
+1	mvninstall	438	the patch passed
+1	compile	274	the patch passed
+1	cc	274	the patch passed
+1	javac	274	the patch passed
+1	checkstyle	76	the patch passed
+1	mvnsite	0	the patch passed
+1	whitespace	0	The patch has no whitespace issues.
+1	xml	2	The patch has no ill-formed XML file.
+1	shadedclient	638	patch has no errors when building and testing our client artifacts.
+1	javadoc	154	the patch passed
+1	findbugs	571	the patch passed
		_ Other Tests _
-1	unit	239	hadoop-hdds in the patch failed.
-1	unit	2317	hadoop-ozone in the patch failed.
+1	asflicense	46	The patch does not generate ASF License warnings.
		7392

Reason	Tests
Failed junit tests	hadoop.ozone.container.ozoneimpl.TestOzoneContainer
	hadoop.ozone.container.server.TestSecureContainerServer
	hadoop.ozone.om.snapshot.TestOzoneManagerSnapshotProvider
	hadoop.ozone.TestSecureOzoneCluster
	hadoop.ozone.client.rpc.TestSecureOzoneRpcClient
	hadoop.ozone.container.ozoneimpl.TestSecureOzoneContainer
	hadoop.ozone.client.rpc.TestMultiBlockWritesWithDnFailures
	hadoop.ozone.client.rpc.TestOzoneAtRestEncryption
	hadoop.hdds.scm.pipeline.TestRatisPipelineCreateAndDestory
	hadoop.ozone.client.rpc.TestFailureHandlingByClient
	hadoop.ozone.client.rpc.TestOzoneClientRetriesOnException
	hadoop.ozone.client.rpc.TestBlockOutputStreamWithFailures
	hadoop.hdds.scm.pipeline.TestPipelineClose
	hadoop.ozone.client.rpc.TestOzoneRpcClient
	hadoop.ozone.client.rpc.TestOzoneRpcClientWithRatis
	hadoop.ozone.client.rpc.TestCloseContainerHandlingByClient

Subsystem	Report/Notes
Docker	Client=18.09.8 Server=18.09.8 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-948/6/artifact/out/Dockerfile
GITHUB PR	#948
Optional Tests	dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml cc
uname	Linux 2bc98f91d488 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	personality/hadoop.sh
git revision	trunk / `6282c02`
Default Java	1.8.0_212
unit	https://builds.apache.org/job/hadoop-multibranch/job/PR-948/6/artifact/out/patch-unit-hadoop-hdds.txt
unit	https://builds.apache.org/job/hadoop-multibranch/job/PR-948/6/artifact/out/patch-unit-hadoop-ozone.txt
Test Results	https://builds.apache.org/job/hadoop-multibranch/job/PR-948/6/testReport/
Max. process+thread count	4916 (vs. ulimit of 5500)
modules	C: hadoop-hdds/common hadoop-ozone/common hadoop-ozone/integration-test hadoop-ozone/ozone-manager U: .
Console output	https://builds.apache.org/job/hadoop-multibranch/job/PR-948/6/console
versions	git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by	Apache Yetus 0.10.0 http://yetus.apache.org

bharatviswa504 · 2019-07-19T19:52:30Z

/retest

arp7

+1 thanks for fixing this @bharatviswa504!

hadoop-yetus · 2019-07-19T23:31:53Z

💔 -1 overall

Vote	Subsystem	Runtime	Comment
0	reexec	36	Docker mode activated.
		_ Prechecks _
+1	dupname	0	No case conflicting files found.
+1	@author	0	The patch does not contain any @author tags.
+1	test4tests	0	The patch appears to include 4 new or modified test files.
		_ trunk Compile Tests _
0	mvndep	23	Maven dependency ordering for branch
+1	mvninstall	484	trunk passed
+1	compile	251	trunk passed
+1	checkstyle	66	trunk passed
+1	mvnsite	0	trunk passed
+1	shadedclient	799	branch has no errors when building and testing our client artifacts.
+1	javadoc	149	trunk passed
0	spotbugs	316	Used deprecated FindBugs config; considering switching to SpotBugs.
+1	findbugs	503	trunk passed
		_ Patch Compile Tests _
0	mvndep	30	Maven dependency ordering for patch
+1	mvninstall	458	the patch passed
+1	compile	271	the patch passed
+1	cc	271	the patch passed
+1	javac	271	the patch passed
+1	checkstyle	80	the patch passed
+1	mvnsite	0	the patch passed
+1	whitespace	0	The patch has no whitespace issues.
+1	xml	2	The patch has no ill-formed XML file.
+1	shadedclient	645	patch has no errors when building and testing our client artifacts.
+1	javadoc	157	the patch passed
+1	findbugs	566	the patch passed
		_ Other Tests _
+1	unit	301	hadoop-hdds in the patch passed.
-1	unit	269	hadoop-ozone in the patch failed.
+1	asflicense	50	The patch does not generate ASF License warnings.
		5301

Subsystem	Report/Notes
Docker	Client=18.09.8 Server=18.09.8 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-948/7/artifact/out/Dockerfile
GITHUB PR	#948
Optional Tests	dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml cc
uname	Linux f434386d5f73 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	personality/hadoop.sh
git revision	trunk / `7f1b76c`
Default Java	1.8.0_212
unit	https://builds.apache.org/job/hadoop-multibranch/job/PR-948/7/artifact/out/patch-unit-hadoop-ozone.txt
Test Results	https://builds.apache.org/job/hadoop-multibranch/job/PR-948/7/testReport/
Max. process+thread count	1209 (vs. ulimit of 5500)
modules	C: hadoop-hdds/common hadoop-ozone/common hadoop-ozone/integration-test hadoop-ozone/ozone-manager U: .
Console output	https://builds.apache.org/job/hadoop-multibranch/job/PR-948/7/console
versions	git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by	Apache Yetus 0.10.0 http://yetus.apache.org

bharatviswa504 · 2019-07-20T15:54:59Z

Rebased with latest trunk.

hadoop-yetus · 2019-07-20T17:54:12Z

💔 -1 overall

Vote	Subsystem	Runtime	Comment
0	reexec	38	Docker mode activated.
		_ Prechecks _
+1	dupname	0	No case conflicting files found.
+1	@author	0	The patch does not contain any @author tags.
+1	test4tests	0	The patch appears to include 4 new or modified test files.
		_ trunk Compile Tests _
0	mvndep	71	Maven dependency ordering for branch
+1	mvninstall	489	trunk passed
+1	compile	259	trunk passed
+1	checkstyle	76	trunk passed
+1	mvnsite	1	trunk passed
+1	shadedclient	882	branch has no errors when building and testing our client artifacts.
+1	javadoc	163	trunk passed
0	spotbugs	316	Used deprecated FindBugs config; considering switching to SpotBugs.
+1	findbugs	512	trunk passed
		_ Patch Compile Tests _
0	mvndep	37	Maven dependency ordering for patch
+1	mvninstall	459	the patch passed
+1	compile	256	the patch passed
+1	cc	256	the patch passed
+1	javac	256	the patch passed
+1	checkstyle	70	the patch passed
+1	mvnsite	0	the patch passed
+1	whitespace	0	The patch has no whitespace issues.
+1	xml	1	The patch has no ill-formed XML file.
+1	shadedclient	621	patch has no errors when building and testing our client artifacts.
+1	javadoc	163	the patch passed
+1	findbugs	529	the patch passed
		_ Other Tests _
+1	unit	280	hadoop-hdds in the patch passed.
-1	unit	1979	hadoop-ozone in the patch failed.
+1	asflicense	50	The patch does not generate ASF License warnings.
		7116

Reason	Tests
Failed junit tests	hadoop.ozone.client.rpc.TestBlockOutputStreamWithFailures
	hadoop.ozone.om.snapshot.TestOzoneManagerSnapshotProvider
	hadoop.ozone.client.rpc.TestCloseContainerHandlingByClient
	hadoop.ozone.client.rpc.TestOzoneClientRetriesOnException
	hadoop.ozone.client.rpc.TestOzoneRpcClient
	hadoop.ozone.client.rpc.TestMultiBlockWritesWithDnFailures
	hadoop.ozone.container.ozoneimpl.TestSecureOzoneContainer
	hadoop.ozone.TestStorageContainerManager
	hadoop.ozone.client.rpc.TestOzoneRpcClientWithRatis
	hadoop.ozone.client.rpc.TestSecureOzoneRpcClient
	hadoop.ozone.client.rpc.TestFailureHandlingByClient
	hadoop.ozone.client.rpc.TestOzoneAtRestEncryption

Subsystem	Report/Notes
Docker	Client=18.09.8 Server=18.09.8 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-948/8/artifact/out/Dockerfile
GITHUB PR	#948
Optional Tests	dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml cc
uname	Linux 997c627396b6 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	personality/hadoop.sh
git revision	trunk / `acdb0a1`
Default Java	1.8.0_212
unit	https://builds.apache.org/job/hadoop-multibranch/job/PR-948/8/artifact/out/patch-unit-hadoop-ozone.txt
Test Results	https://builds.apache.org/job/hadoop-multibranch/job/PR-948/8/testReport/
Max. process+thread count	5092 (vs. ulimit of 5500)
modules	C: hadoop-hdds/common hadoop-ozone/common hadoop-ozone/integration-test hadoop-ozone/ozone-manager U: .
Console output	https://builds.apache.org/job/hadoop-multibranch/job/PR-948/8/console
versions	git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by	Apache Yetus 0.10.0 http://yetus.apache.org

This message was automatically generated.

bharatviswa504 · 2019-07-22T00:57:34Z

/retest

arp7 · 2019-07-22T19:05:34Z

I am merging this with a couple of caveats.

There are numerous integration test failures. However, these tests also fail in current trunk, so they are likely unrelated.
More seriously, the integration test TestOzoneManagerSnapshotProvider.testDownloadCheckpoint, which exercises related functionality, failed in the pre-commit run. It passes for me locally with the patch applied, so this is likely a flaky test.

hanishakoneru added the ozone label Jun 11, 2019

hanishakoneru requested review from arp7 and bharatviswa504 June 11, 2019 21:21

arp7 reviewed Jun 13, 2019

...e/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerStateMachine.java

arp7 reviewed Jun 13, 2019

...e/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerStateMachine.java

arp7 reviewed Jun 13, 2019

hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java

arp7 reviewed Jun 13, 2019

hanishakoneru force-pushed the HDDS-1649 branch from ad0f09e to f7b98eb July 17, 2019 22:03

arp7 reviewed Jul 18, 2019

hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OMMetrics.java

Nice catch!

arp7 reviewed Jul 18, 2019

bharatviswa504 reviewed Jul 18, 2019

hanishakoneru force-pushed the HDDS-1649 branch from f7b98eb to dd81afc July 18, 2019 22:08

anuengineer reviewed Jul 19, 2019

arp7 approved these changes Jul 19, 2019

hanishakoneru and others added 7 commits July 20, 2019 08:42

install checkpoint and reload state

de1b66a

Fixes and Unit test

c643aca

Review comments

7191784

unit test fix

2440f46

review comment fix

04c79ee

Disable InstallSnapshot on Ratis so that only a notification is sent.

1bbcf4f

fix acceptance test failure.

f954e14

bharatviswa504 force-pushed the HDDS-1649 branch from ca605d8 to f954e14 July 20, 2019 15:54

arp7 merged commit cdc36fe into apache:trunk Jul 22, 2019

shanthoosh pushed a commit to shanthoosh/hadoop that referenced this pull request Oct 15, 2019

SAMZA-2124: Add Beam API doc to the website (apache#948)

6711a9f

* SAMZA-2124: Add Beam API doc to the website * Address pr feedback

amahussein pushed a commit to amahussein/hadoop that referenced this pull request Oct 29, 2019

HDDS-1649. On installSnapshot notification from OM leader, download c…

43efe3d

…heckpoint and reload OM state (apache#948)
