[Kerberos] Add troubleshooting documentation #32803
This commit adds a troubleshooting section for Kerberos. Most of the time, the problems seen are caused by invalid configurations, such as a keytab that is missing principals or credentials that are not up to date. Time synchronization is an important part of Kerberos infrastructure, and time skew can cause problems. To debug further, the documentation explains how to enable JAAS Kerberos login module debugging and Kerberos/SPNEGO debugging by setting JVM system properties.
Pinging @elastic/es-security
Pinging @elastic/es-docs
| * User authentication fails due to either GSS negotiation failure
| or a service login failure on the server side or in the {es} HTTP client. Some of
| the common exceptions are listed below with some tips to resolve them.
If I'm understanding the first sentence correctly, I think it should be changed to something like this: "User authentication fails due to either a GSS negotiation failure or a service login failure (on the server or the {es} HTTP client)".
Yes, you are right. Thank you.
| *Resolution:*
| `Failure unspecified at GSS-API level (Mechanism level: Checksum failed)`::
Are these error messages in specific logs or on specific machines in the deployment?
I have updated this with more information, since the failure can show up as a log message on the server side and as an error message on the client. Thank you.
| --
| When this occurs on HTTP client side, it may be related to an incorrect password.
Are the subsequent paragraphs related to when this occurs on the server side?
I have separated this into client and server sections. And yes, the subsequent paragraphs relate to the server side, except the last one, which covers hostname resolution and is common to both. Thank you.
| +
| --
| To prevent replay attacks, Kerberos V5 sets maximum tolerance for computer clock
sets a maximum
Done, Thank you.
| --
| To prevent replay attacks, Kerberos V5 sets maximum tolerance for computer clock
| synchronization and it is usually set to 5 minutes. Check whether the time on
it is typically 5 minutes
Done, Thank you.
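For readers following the clock-skew discussion, here is a minimal sketch of the check the doc text describes: compare two clocks and flag a difference larger than the tolerance. The 5 minute default comes from the snippet above; the function names and the sample timestamps are hypothetical, and the real tolerance is whatever your KDC and realm are configured with.

```python
from datetime import datetime, timedelta, timezone

# Assumed default, taken from the "typically 5 minutes" wording in the doc text.
DEFAULT_MAX_SKEW = timedelta(minutes=5)


def clock_skew(local_time, reference_time):
    """Return the absolute difference between two timezone-aware timestamps."""
    return abs(local_time - reference_time)


def within_tolerance(local_time, reference_time, max_skew=DEFAULT_MAX_SKEW):
    """Return True if the two clocks differ by no more than max_skew."""
    return clock_skew(local_time, reference_time) <= max_skew


# Hypothetical example: a node whose clock runs 7 minutes ahead of the KDC.
kdc_time = datetime(2018, 8, 13, 12, 0, 0, tzinfo=timezone.utc)
node_time = kdc_time + timedelta(minutes=7)
print(within_tolerance(node_time, kdc_time))
```

In practice the reference timestamp would come from the KDC host or an NTP source; how to obtain it is outside the scope of this sketch.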
| For detailed information, see {ref}/security-settings.html#ref-kerberos-settings[Kerberos realm settings].
| To enable Kerberos logging on JVM, add following JVM system properties:
I think this needs some context, like when and why you would do this. How does it differ from the login module debug log?
I added some context about what this additional logging provides, namely GSS context negotiation logs and Kerberos exchange messages. Thank you.
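The quoted diff line stops before the property list, so the following is only a sketch under assumptions: it uses the two JDK system properties commonly used for this purpose, `sun.security.krb5.debug` and `sun.security.spnego.debug`, and a hypothetical helper that appends them to a JVM options string such as the value of `ES_JAVA_OPTS`. The exact properties documented in this PR may differ.

```python
import os

# JDK system properties for Kerberos and SPNEGO debug output.
# Assumption: these are the properties the documentation refers to.
KERBEROS_DEBUG_FLAGS = [
    "-Dsun.security.krb5.debug=true",
    "-Dsun.security.spnego.debug=true",
]


def with_kerberos_debug(existing_opts=""):
    """Append the debug flags to a JVM options string, skipping duplicates."""
    opts = existing_opts.split()
    for flag in KERBEROS_DEBUG_FLAGS:
        if flag not in opts:
            opts.append(flag)
    return " ".join(opts)


# Print the options string that would be exported before starting the node.
print(with_kerberos_debug(os.environ.get("ES_JAVA_OPTS", "")))
```

The same two flags could instead be added as separate lines in the JVM options file; the helper only illustrates that they are ordinary `-D` system properties.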
| Kerberos depends on proper hostname resolution, so please check your DNS infrastructure.
| Incorrect DNS setup, DNS SRV records or configuration for KDC servers in `krb5.conf`
| +can cause problems with hostname resolution.
Is this + intended?
oops, Removed it. Thanks.
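As a rough illustration of the hostname-resolution check the doc text recommends, the sketch below forward-resolves a name and then reverse-resolves the first address it gets back. The function name and the returned dictionary layout are hypothetical; it uses only the standard `socket` module and does not inspect DNS SRV records or `krb5.conf`.

```python
import socket


def check_resolution(hostname):
    """Forward-resolve hostname, then reverse-resolve the first address found."""
    result = {"hostname": hostname, "addresses": [], "reverse": None, "error": None}
    try:
        infos = socket.getaddrinfo(hostname, None)
        # Each entry's sockaddr tuple starts with the IP address string.
        result["addresses"] = sorted({info[4][0] for info in infos})
        if result["addresses"]:
            result["reverse"] = socket.gethostbyaddr(result["addresses"][0])[0]
    except OSError as exc:
        # socket.gaierror and socket.herror are both subclasses of OSError.
        result["error"] = str(exc)
    return result
```

Calling `check_resolution` with the fully qualified name of an {es} node or a KDC host and comparing `reverse` against the name used in the service principal can surface a forward/reverse mismatch.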
| As Kerberos logs are often cryptic in nature and many things can go wrong
| as it depends on external services like DNS, time synchronization. You might
| have to enable additional debug logs to root cause the issue.
I don't think "root cause" is a verb 😈
Maybe "... debug logs to determine the root cause of the issue."
Thank you, Tim. Addressed this.
tvernum
left a comment
👍
jaymode
left a comment
I left some feedback. After those are addressed, LGTM
| *Symptoms:*
| * User authentication fails due to either GSS negotiation failure
| or a service login failure ( either on the server or in the {es} http client).
nit: remove the space between ( and e
Thank you, addressed this.
| * User authentication fails due to either GSS negotiation failure
| or a service login failure ( either on the server or in the {es} http client).
| Some of the common exceptions are listed below with some tips to resolve them.
s/some tips to resolve/tips to help resolve
Changed, thank you.
| --
| As Kerberos logs are often cryptic in nature and many things can go wrong
| as it depends on external services like DNS, time synchronization. You might
s/DNS, time synchronization/DNS and NTP.
Modified this, Thank you.
| as it depends on external services like DNS, time synchronization. You might
| have to enable additional debug logs to determine the root cause of the issue.
| {es} uses JAAS (Java Authentication and Authorization Service) Kerberos login
s/uses JAAS/uses a JAAS
Modified this, Thank you
| For detailed information, see {ref}/security-settings.html#ref-kerberos-settings[Kerberos realm settings].
| Sometimes you may need to go deeper to understand the problem during SPNEGO
| gss context negotiation or look at the Kerberos message exchange. To enable
s/gss/GSS
Modified this, Thank you