[SPARK-41415][3.3] SASL Request Retries #39644
### What changes were proposed in this pull request?
Add the ability to retry SASL requests. A metric to track SASL retries will be added soon.
### Why are the changes needed?
We are seeing increased SASL timeouts internally, and this change would mitigate the issue. We already have this feature enabled for our 2.3 jobs, and we have seen failures decrease significantly.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Added unit tests, and tested on a cluster to ensure the retries are being triggered correctly.
Closes apache#38959 from akpatnam25/SPARK-41415.
Authored-by: Aravind Patnam <[email protected]>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>
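For readers unfamiliar with the idea, the retry behaviour described above can be sketched roughly as follows. This is an illustrative Python sketch only, not the code in this PR (Spark's network layer is written in Java). The names `SaslTimeoutError`, `send_with_sasl_retries`, `max_retries`, and `wait_seconds` are all hypothetical and do not correspond to Spark classes or configuration keys. The returned retry count hints at how a retry metric could be fed.

```python
import time


class SaslTimeoutError(Exception):
    """Hypothetical stand-in for a SASL request timing out."""


def send_with_sasl_retries(send_request, max_retries=3, wait_seconds=0.0,
                           sleep=time.sleep):
    """Call send_request(), retrying only when it raises SaslTimeoutError.

    Returns (result, retries_used). Once max_retries retries have been
    used, the last SaslTimeoutError is re-raised to the caller.
    """
    retries = 0
    while True:
        try:
            return send_request(), retries
        except SaslTimeoutError:
            if retries >= max_retries:
                raise  # retries exhausted, surface the timeout
            retries += 1
            sleep(wait_seconds)  # pause before the next attempt


# Usage: a handshake that times out twice, then succeeds.
calls = {"n": 0}


def flaky_handshake():
    calls["n"] += 1
    if calls["n"] < 3:
        raise SaslTimeoutError("timed out")
    return "authenticated"


result, retries = send_with_sasl_retries(flaky_handshake, max_retries=3)
print(result, retries)  # authenticated 2
```

Only the timeout error is retried in this sketch; any other exception propagates immediately, so unrelated failures are not masked by the retry loop.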
dongjoon-hyun
left a comment
Thank you for making this, @akpatnam25.
Will backport SPARK-42090 once this is merged.
Can one of the admins verify this patch?
Gentle ping once more, @mridulm ~
mridulm
left a comment
Looks good to me.
Sorry for the delay!
Thanks for the ping @dongjoon-hyun :)
Thank you!
### What changes were proposed in this pull request?
Add the ability to retry SASL requests. A metric to track SASL retries will be added soon.
### Why are the changes needed?
We are seeing increased SASL timeouts internally, and this change would mitigate the issue. We already have this feature enabled for our 2.3 jobs, and we have seen failures decrease significantly.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Added unit tests, and tested on a cluster to ensure the retries are being triggered correctly.
Closes #38959 from akpatnam25/SPARK-41415.
Authored-by: Aravind Patnam <apatnamlinkedin.com>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>
Closes #39644 from akpatnam25/SPARK-41415-backport-3.3.
Authored-by: Aravind Patnam <[email protected]>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>
Merged to branch-3.3.
Yes, only
Closing, as the PR has been merged.