HDFS-15960 RBF: Router should talk to namenode with security context. #2887
base: trunk
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | `@@ -24,6 +24,7 @@` |
| | | ` import java.net.InetAddress;` |
| | | ` import java.net.InetSocketAddress;` |
| | | ` import java.net.URI;` |
| | | `+import java.security.PrivilegedExceptionAction;` |
| | | ` import java.util.Map;` |
| | | |
| | | ` import org.apache.hadoop.conf.Configuration;` |
| | | `@@ -41,6 +42,7 @@` |
| | | ` import org.apache.hadoop.hdfs.tools.DFSHAAdmin;` |
| | | ` import org.apache.hadoop.hdfs.tools.NNHAServiceTarget;` |
| | | ` import org.apache.hadoop.hdfs.web.URLConnectionFactory;` |
| | | `+import org.apache.hadoop.security.SecurityUtil;` |
| | | ` import org.codehaus.jettison.json.JSONArray;` |
| | | ` import org.codehaus.jettison.json.JSONException;` |
| | | ` import org.codehaus.jettison.json.JSONObject;` |
| | | `@@ -170,7 +172,20 @@ protected void serviceInit(Configuration configuration) throws Exception {` |
| | | |
| | | `   @Override` |
| | | `   public void periodicInvoke() {` |
| | | `-    updateState();` |
| | | `+    try {` |
| | | `+      SecurityUtil.doAsCurrentUser(` |
| | | `+          new PrivilegedExceptionAction<Object>() {` |
| | | `+            @Override` |
| | | `+            public Object run() {` |
| | | `+              updateState();` |
| | | `+              return null;` |
| | | `+            }` |
| | | `+          });` |
| | | `+    } catch (IOException e) {` |
| | | `+      // Generic error that we don't know about` |
| | | `+      LOG.error("Unexpected exception while communicating with {}: {}",` |
| | | `+          getNamenodeDesc(), e.getMessage(), e);` |
| | | `+    }` |
| | | `   }` |
| | | |
| | | `   /**` |

Review comments attached to the `LOG.error(` line of the change above:

Can we have a unit test for this?

OK, I will try to create one. Thanks for checking this out, @goiri. Somehow I missed the notification about your comments; I will follow up with a unit test soon.

Hi @goiri, following up on this. I was able to create a unit test that reproduces the problem and demonstrates that the patch fixes it. However, there is a challenge. The failure occurs when the router calls the JMX endpoint, which returns additional stats beyond the basic alive status (the alive status is obtained in a separate RPC call). The failure is soft: the router logs the exception and continues without the information it tried to obtain. That information is needed later during load balancing, which is how the original bug was discovered.

Because the main interface capturing knowledge about a NN on the router side (FederationNamenodeContext) does not contain these stats, there is no way to write a unit test against it. Some unit tests in that area mock this interface, and I modified the mock to include stats, but then I have to downcast to the mock object in the test, which is very ugly. So the options are:

(1) accept this ugly downcast;
(2) don't write the test and, if Hadoop eventually has an integration test suite, cover the use case there;
(3) modify FederationNamenodeContext to include the stats (see the MembershipState and MembershipStats classes).

My vote would be for (3), as those stats seem essential to the operation of a federated cluster. It would be OK not to make all of the numbers part of the public interface, but the fact that we need stats about resource utilization should be part of the interface.

Option (3) sounds reasonable. Do you mind giving it a try in this PR?
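To make option (3) from the review thread above concrete, here is a minimal sketch of what "stats as part of the interface" could look like, and why it removes the downcast in tests. Every name here (`ResourceStatsSketch`, `NamenodeContextSketch`, `StatsContextSketch`, the two getters, and the use of `Optional` for a soft failure) is a hypothetical stand-in invented for illustration; none of it is taken from `FederationNamenodeContext`, `MembershipState`, or `MembershipStats`.

```java
import java.util.Optional;

/** Hypothetical, trimmed-down stats view; not Hadoop's MembershipStats. */
interface ResourceStatsSketch {
  long getAvailableSpace();
  int getNumOfActiveDatanodes();
}

/** Hypothetical stand-in for a namenode context that exposes stats. */
interface NamenodeContextSketch {
  String getNamenodeId();

  /** Empty when the stats fetch failed softly (assumed convention). */
  Optional<ResourceStatsSketch> getStats();
}

class StatsContextSketch {
  /** Builds a simple immutable stats object. */
  static ResourceStatsSketch stats(long availableSpace, int activeDatanodes) {
    return new ResourceStatsSketch() {
      @Override
      public long getAvailableSpace() {
        return availableSpace;
      }

      @Override
      public int getNumOfActiveDatanodes() {
        return activeDatanodes;
      }
    };
  }

  /** Test double written against the interface only; pass null for "no stats". */
  static NamenodeContextSketch mockContext(String id, ResourceStatsSketch stats) {
    return new NamenodeContextSketch() {
      @Override
      public String getNamenodeId() {
        return id;
      }

      @Override
      public Optional<ResourceStatsSketch> getStats() {
        return Optional.ofNullable(stats);
      }
    };
  }

  /** A test can check this through the interface, with no downcast. */
  static boolean hasUsableStats(NamenodeContextSketch context) {
    return context.getStats().isPresent();
  }

  public static void main(String[] args) {
    NamenodeContextSketch healthy = mockContext("nn0", stats(1024L, 3));
    NamenodeContextSketch softFailure = mockContext("nn1", null);
    System.out.println(healthy.getNamenodeId() + " has stats: " + hasUsableStats(healthy));
    System.out.println(softFailure.getNamenodeId() + " has stats: " + hasUsableStats(softFailure));
  }
}
```

Because the stats accessor lives on the interface, a unit test can assert that stats were populated after the heartbeat without knowing which concrete or mock class it is holding.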
Can this be a lambda?
Sure, agreed that'll be more readable here.
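For reference, the lambda form suggested above could look roughly like the sketch below. To keep it runnable without Hadoop on the classpath, `doAsCurrentUser` and `updateState` here are local, hypothetical stand-ins: the helper only mirrors the shape of `SecurityUtil.doAsCurrentUser` (a `PrivilegedExceptionAction<T>` in, `IOException` out) and runs the action directly with no security context. The class name `LambdaDoAsSketch` and the `updates` counter are likewise invented for illustration.

```java
import java.io.IOException;
import java.security.PrivilegedExceptionAction;

/** Sketch only: a stand-in for the service changed in the diff. */
class LambdaDoAsSketch {
  /** Counts calls so the sketch has something observable. */
  static int updates = 0;

  /** Hypothetical stand-in for the service's updateState(). */
  static void updateState() {
    updates++;
  }

  /**
   * Hypothetical stand-in with the same shape as
   * SecurityUtil.doAsCurrentUser; it simply runs the action.
   */
  static <T> T doAsCurrentUser(PrivilegedExceptionAction<T> action)
      throws IOException {
    try {
      return action.run();
    } catch (IOException e) {
      throw e;
    } catch (Exception e) {
      throw new IOException(e);
    }
  }

  /** Lambda form of the anonymous PrivilegedExceptionAction in the diff. */
  static void periodicInvoke() {
    try {
      doAsCurrentUser(() -> {
        updateState();
        return null;
      });
    } catch (IOException e) {
      System.err.println("Unexpected exception: " + e.getMessage());
    }
  }

  public static void main(String[] args) {
    periodicInvoke();
    System.out.println("updates = " + updates);
  }
}
```

The lambda compiles without a cast here because the helper has a single overload; an API that accepts both `PrivilegedAction` and `PrivilegedExceptionAction` would typically need an explicit cast to disambiguate.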
Thanks @goiri and @bolerio for your comments. I am just concerned about whether it is necessary to do this as the current user here, because the Router logs in when the daemon starts and already executes as the current login user. Did you hit an exception here? Thanks.