HDFS-15419. RBF: Router should retry communicate with NN when cluster is unavailable using configurable time interval #2082
base: trunk
Conversation
update forks
💔 -1 overall
This message was automatically generated.

💔 -1 overall
This message was automatically generated.
...-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcClient.java (outdated)
...-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcClient.java (outdated)
This logic starts with the counter at a value and then decreases it; it makes sense to keep it as such.
In fact, not all clients are that smart; they may not be configured appropriately. When those clients hit this problem, they complain that the Router does not retry enough.
The idea before was for us to surface the unavailability to the client and then have the client retry with its regular client-side retry logic.
In fact, not all clients are that smart; they may not be configured appropriately. When those clients hit this problem, they complain that the Router does not retry enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, but we should change the logic to match the old style or at least to not have the two styles together.
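For context, a minimal hedged sketch of the retry style being discussed; the class, fields, and method below are hypothetical and are not the actual RouterRpcClient code. The counter starts at a configured value and counts down, and the Router sleeps a configurable interval before each new attempt:

import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not the actual RouterRpcClient implementation.
class RouterRetrySketch {
  private final int maxRetries = 3;                                   // would come from config
  private final long retryIntervalMs = TimeUnit.SECONDS.toMillis(10); // would come from config

  <T> T invokeWithRetry(Callable<T> rpc) throws IOException {
    int retriesLeft = maxRetries;           // counter starts high and decreases
    while (true) {
      try {
        return rpc.call();
      } catch (Exception e) {               // e.g. cluster unavailable
        if (retriesLeft-- <= 0) {
          throw new IOException("Cluster still unavailable after retries", e);
        }
        try {
          Thread.sleep(retryIntervalMs);    // configurable wait between attempts
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt();
          throw new IOException("Interrupted while waiting to retry", ie);
        }
      }
    }
  }
}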
Force-pushed from 31247aa to 94b0399 (compare).
💔 -1 overall
This message was automatically generated.
…n cluster is unavailable.
💔 -1 overall
This message was automatically generated.
💔 -1 overall
This message was automatically generated.
🎊 +1 overall
This message was automatically generated.
String DFS_LEASE_HARDLIMIT_KEY = "dfs.namenode.lease-hard-limit-sec";
long DFS_LEASE_HARDLIMIT_DEFAULT = 20 * 60;

String DFS_ROUTER_RPC_RETRY_INTERVAL_KEY = "dfs.router.rpc.retry.interval.seconds";
Technically it is not seconds but a time duration.
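For reference, a sketch (not from the patch) of reading such a value with Hadoop's Configuration.getTimeDuration, which accepts unit-suffixed values like "10s", "500ms", or "1m"; the unit-free key name and the class below are hypothetical:

import java.util.concurrent.TimeUnit;
import org.apache.hadoop.conf.Configuration;

// Illustration only, not from the patch.
public class RetryIntervalExample {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    long retryIntervalMs = conf.getTimeDuration(
        "dfs.router.rpc.retry.interval",   // hypothetical unit-free key name
        TimeUnit.SECONDS.toMillis(10),     // default of 10s, expressed in ms
        TimeUnit.MILLISECONDS);            // unit of the returned value
    System.out.println("Retry interval: " + retryIntervalMs + " ms");
  }
}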
long DFS_LEASE_HARDLIMIT_DEFAULT = 20 * 60;

String DFS_ROUTER_RPC_RETRY_INTERVAL_KEY = "dfs.router.rpc.retry.interval.seconds";
int DFS_ROUTER_RPC_RETRY_INTERVAL_DEFAULT = 10;
Make it TimeUnit.SECONDS.toXXXX(10)
Sure, but we should change the logic to match the old style or at least to not have the two styles together.
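As a concrete reading of the TimeUnit suggestion above, a hedged sketch of the constant style; the interface name is made up, and toMillis merely stands in for the elided toXXXX:

import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the suggested constant style.
interface RouterRetryConfigSketch {
  String DFS_ROUTER_RPC_RETRY_INTERVAL_KEY =
      "dfs.router.rpc.retry.interval.seconds";
  long DFS_ROUTER_RPC_RETRY_INTERVAL_DEFAULT =
      TimeUnit.SECONDS.toMillis(10);   // unit explicit, instead of a bare 10
}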
NOTICE
Please create an issue in ASF JIRA before opening a pull request, and set the pull request title so that it starts with the corresponding JIRA issue number (e.g. HADOOP-XXXXX. Fix a typo in YYY.)
For more details, please see https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute