[SPARK-24062][Thrift Server] Fix SASL encryption cannot be enabled issue in thrift server #21138
Conversation
Test build #89773 has finished for PR 21138 at commit
IMO the fix would be to not do UGI.loginUserFromKeytab in HiveClientImpl; or rather, do it only if it is absolutely necessary.
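For illustration, a guard along those lines might look like the following (a hypothetical helper, not existing Spark code): skip the keytab login when the current UGI already holds Kerberos credentials for the desired principal.

```scala
import org.apache.hadoop.security.UserGroupInformation

object LoginGuard {
  // Hypothetical guard: only re-login from the keytab when the current UGI
  // does not already hold Kerberos credentials for the desired principal.
  def maybeLoginFromKeytab(principal: String, keytabPath: String): Unit = {
    val current = UserGroupInformation.getCurrentUser
    if (!current.hasKerberosCredentials || current.getUserName != principal) {
      UserGroupInformation.loginUserFromKeytab(principal, keytabPath)
    }
  }
}
```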
Hi @mridulm, thanks a lot for your comments. The UGI.loginUserFromKeytab call no longer exists in Spark 2.3+ (dc2714d#diff-6fd847124f8eae45ba2de1cf7d6296fe). Actually it is the code here (Line 53 in e77d62a) that triggers the re-login.
PS. I saw two "Login successful" messages from UGI in the thrift server log, but I can only find one login in the thrift server code, so I assume the other one happens in the Hive library. Yes, I agree with you that ideally we should not log in from the keytab unnecessarily. But treating the thrift server as a Spark application: it doesn't know about Spark's UGI context and logs in again to refresh the UGI in its own context, and it seems we cannot prevent users from doing that at the user layer. So I think my fix can work around the issue, though it may not be the most elegant fix.
@jerryshao As we discussed IRL
@mridulm I would treat the current fix as a workaround for the SASL issue, since that is a regression in 2.3. For the UGI refreshing issue (which mainly causes STS long-running failures, and also leads to the SASL failure here), I think we can create a separate JIRA to fix it, since that is not a regression. What do you think?
Sounds good to me; any thoughts @vanzin? (since you changed this last)
I'm fine with the fix. I'm not familiar enough with the internals of STS / Hive to suggest anything different.
Jenkins, retest this please. |
Test build #89864 has finished for PR 21138 at commit
Merging to master and branch 2.3. |
What changes were proposed in this pull request?
For details of the exception, please see SPARK-24062.
The issue is:
Spark on YARN stores the SASL secret in the current UGI's credentials. These credentials are distributed to the AM and the executors, so that the executors and the driver share the same secret to communicate. But STS/Hive library code refreshes the current UGI via UGI's loginFromKeytab() after the Spark application is started, which creates a new UGI in the current driver's context with empty tokens and secret keys. The secret key is therefore lost from the current context's UGI, and that is why the Spark driver throws the "secret key not found" exception.
In Spark 2.2, Spark also stored this secret key in a class variable of SecurityManager, so even after the UGI was refreshed, the secret still existed in the object and STS with SASL kept working. But in Spark 2.3, we always look the key up from the current UGI, which makes it fail in Spark 2.3.
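As a minimal sketch of the mechanism described above (not Spark's actual code; the alias name, principal, and keytab path below are made up), using the Hadoop Credentials/UGI APIs: the secret is attached to whichever UGI is current when it is stored, and a later loginUserFromKeytab() replaces that UGI, so a 2.3-style lookup against the current UGI comes back empty.

```scala
import org.apache.hadoop.io.Text
import org.apache.hadoop.security.{Credentials, UserGroupInformation}

object SecretLossSketch {
  // Hypothetical alias; Spark uses its own internal lookup key in SecurityManager.
  private val SecretAlias = new Text("sparkSaslSecret")

  def storeSecret(secret: Array[Byte]): Unit = {
    val creds = new Credentials()
    creds.addSecretKey(SecretAlias, secret)
    // Attach the secret to whichever UGI is current *right now*.
    UserGroupInformation.getCurrentUser.addCredentials(creds)
  }

  // Spark 2.3-style lookup: consults only the current UGI.
  def lookupSecret(): Array[Byte] =
    UserGroupInformation.getCurrentUser.getCredentials.getSecretKey(SecretAlias)

  def main(args: Array[String]): Unit = {
    storeSecret("s3cret".getBytes("UTF-8"))
    assert(lookupSecret() != null) // found: same UGI

    // Hive/STS code later re-logs in from the keytab (placeholder values),
    // replacing the login UGI with a fresh one that holds empty credentials.
    // (Requires Hadoop security enabled; loginUserFromKeytab is a no-op otherwise.)
    UserGroupInformation.loginUserFromKeytab("hive/[email protected]", "/etc/hive.keytab")

    // The secret can no longer be found in the current context's UGI.
    assert(lookupSecret() == null)
  }
}
```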
To fix this issue, there are two possible solutions:
1. Fix it in the STS/Hive library: when the UGI is refreshed, copy the secret key from the original UGI to the new one. The difficulty is that some of the code that refreshes the UGI lives in the Hive library, which makes it hard for us to change.
2. Roll back the logic in SecurityManager to match Spark 2.2, so that this issue is fixed.
The 2nd solution seems the simpler one, so I will propose a PR with it; a sketch of the idea follows below.
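As a rough sketch of what the 2nd solution amounts to, assuming a simplified, hypothetical stand-in for the relevant part of SecurityManager (this is not the actual patch): keep the generated secret in a class variable as well as in the UGI, and prefer the cached copy on lookup, so a later UGI refresh cannot lose it.

```scala
import org.apache.hadoop.io.Text
import org.apache.hadoop.security.{Credentials, UserGroupInformation}

// Simplified stand-in for the secret handling in SecurityManager.
class SaslSecretHolder(alias: Text) {
  // 2.2-style: cache the secret in the object itself so it survives
  // even if Hive code later replaces the current UGI.
  @volatile private var cachedSecret: Array[Byte] = _

  def initializeSecret(secret: Array[Byte]): Unit = {
    cachedSecret = secret
    val creds = new Credentials()
    creds.addSecretKey(alias, secret)
    // Still publish it into the UGI so the AM and executors launched with
    // these credentials receive the same secret.
    UserGroupInformation.getCurrentUser.addCredentials(creds)
  }

  def getSecretKey(): Array[Byte] = {
    // Prefer the cached copy; fall back to the current UGI's credentials
    // (e.g. on the executor side, where only the UGI carries the secret).
    if (cachedSecret != null) cachedSecret
    else UserGroupInformation.getCurrentUser.getCredentials.getSecretKey(alias)
  }
}
```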
How was this patch tested?
Verified in a local cluster.
CC @vanzin @tgravescs, please help to review. Thanks!