Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -57,7 +57,7 @@ public void onMessage(PipelineActionsFromDatanode report,
pipelineID = PipelineID.
getFromProtobuf(action.getClosePipeline().getPipelineID());
Pipeline pipeline = pipelineManager.getPipeline(pipelineID);
LOG.info("Received pipeline action {} for {} from datanode [}",
LOG.info("Received pipeline action {} for {} from datanode {}",
action.getAction(), pipeline, report.getDatanodeDetails());
pipelineManager.finalizeAndDestroyPipeline(pipeline, true);
} catch (IOException ioe) {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -99,6 +99,9 @@ public List<SCMCommand> dispatch(SCMHeartbeatRequestProto heartbeat) {
commands = nodeManager.getCommandQueue(dnID);

} else {
// Get the datanode details again from node manager with the topology info
// for registered datanodes.
datanodeDetails = nodeManager.getNode(datanodeDetails.getIpAddress());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Xiaoyu, node can use Ipaddress or hostname as topology network name.
Maybe we should refactor nodeManager.getNode function, pass datanodeDetails in. Then make whether use Ipaddress or hostname as network topology name an inner logic in the getNode function.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not rely on the IP address of datanode in NetworkTopology path, instead we should use the datanode UUID. It is possible that more than one datanode process is running on the same machine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More than one DN instances on the same machine are most likely from test/dev environment such as MiniOzoneCluster. In production, even containers in K8S has dedicate IPs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But IP address can change for the same datanode. In fact, we have a Jira to remove it in the future from the yaml file: HDDS-1480

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Property "dfs.datanode.use.datanode.hostname" is used to control whether use IP address or hostname. Use Ip address or hostname, current exiting hadoop/hdfs/yarn topology tools/customer mgt scripts can be reused. It would be easy for user to adopt Ozone. @xiaoyuyao, I can take over this if you are fully occupied.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More than one DN instances on the same machine are most likely from test/dev environment such as MiniOzoneCluster. In production, even containers in K8S has dedicate IPs.

I agree, but the problem here is that after this change the test/dev environment where there are more than one datanode process running in same machine will not even work properly. Heartbeat from different datanode process (running on same machine) will be mapped to a single process and all the other datanode process will be marked as dead even though they are heartbeating.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nandakumar131, yes. We will need to handle this case for the minicluster based tests.
The current topology awareness is based on a map of ip/dns->location, I think change it to uuid->location should work as long we have a mapping from uuid->ip/dns maintained.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on changing it to uuid -> location and maintaining a map for uuid -> ip/dns.


// should we dispatch heartbeat through eventPublisher?
commands = nodeManager.processHeartbeat(datanodeDetails);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -137,10 +137,13 @@ public void chooseNodeWithNoExcludedNodes() throws SCMException {
datanodeDetails.get(2)));
Assert.assertFalse(cluster.isSameParent(datanodeDetails.get(1),
datanodeDetails.get(2)));
Assert.assertFalse(cluster.isSameParent(datanodeDetails.get(0),
datanodeDetails.get(3)));
Assert.assertFalse(cluster.isSameParent(datanodeDetails.get(2),
datanodeDetails.get(3)));

// TODO: the following does not have guarantee due to fallback.
// This will need further change in placement algorithm.
//Assert.assertFalse(cluster.isSameParent(datanodeDetails.get(0),
// datanodeDetails.get(3)));
//Assert.assertFalse(cluster.isSameParent(datanodeDetails.get(2),
// datanodeDetails.get(3)));
}

@Test
Expand Down