[SPARK-2412] CoalescedRDD throws exception with certain pref locs #1337
If the first pass of CoalescedRDD does not find the target number of locations AND the second pass finds new locations, an exception is thrown, as "groupHash.get(nxt_replica).get" is not valid. The fix is just to add an ArrayBuffer to groupHash for that replica if it didn't already exist.
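To illustrate the failure mode and the fix described above, here is a minimal, self-contained sketch. It is not the actual CoalescedRDD code: `Group`, `addUnsafe`, and `addSafe` are hypothetical names used only for illustration, and `groupHash` is assumed to be a mutable map from host name to an `ArrayBuffer` of groups.

```scala
import scala.collection.mutable

object GroupHashSketch {
  // Hypothetical stand-in for the partition group type; not Spark's class.
  final case class Group(loc: String)

  type GroupHash = mutable.Map[String, mutable.ArrayBuffer[Group]]

  // Pre-fix pattern: assumes the key is already present.
  // For a host not yet in the map, get(host) returns None and
  // None.get throws NoSuchElementException.
  def addUnsafe(groupHash: GroupHash, host: String): Unit =
    groupHash.get(host).get += Group(host)

  // Post-fix pattern: create an empty buffer for the host if it is
  // missing, then append to whichever buffer is in the map.
  def addSafe(groupHash: GroupHash, host: String): Unit =
    groupHash.getOrElseUpdate(host, mutable.ArrayBuffer[Group]()) += Group(host)
}
```

With a map that only contains `host1`, calling `addUnsafe` for `host2` throws, while `addSafe` for `host2` inserts a new buffer holding one group.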
Merged build triggered.
Merged build started.
Merged build finished. All automated tests passed.
All automated tests passed.
getOrElseUpdate?
Jenkins, test this please (testing something).
QA tests have started for PR 1337. This patch merges cleanly.
By the way, I'd like to point out that this is the 1337 PR. Naturally.
QA tests have started for PR 1337. This patch merges cleanly.
QA results for PR 1337:
All automated tests passed.
QA results for PR 1337:
Is this just a stylistic change or does this operator somehow have different semantics?
Strictly stylistic. It made more sense when I was using put below; now there's no reason for it.
LGTM pending one small question
okay LGTM
Okay I merged this.
If the first pass of CoalescedRDD does not find the target number of locations AND the second pass finds new locations, an exception is thrown, as "groupHash.get(nxt_replica).get" is not valid. The fix is just to add an ArrayBuffer to groupHash for that replica if it didn't already exist.

Author: Aaron Davidson <[email protected]>

Closes #1337 from aarondav/2412 and squashes the following commits:

f587b5d [Aaron Davidson] getOrElseUpdate
3ad8a3c [Aaron Davidson] [SPARK-2412] CoalescedRDD throws exception with certain pref locs

(cherry picked from commit 7c23c0d)
Signed-off-by: Patrick Wendell <[email protected]>