[SQL] SPARK-1571 Mistake in java example code #496
Closed
Conversation
Merged build triggered.
Merged build started.
Merged build finished. All automated tests passed.
Contributor
Ok merged. Thanks!
asfgit
pushed a commit
that referenced
this pull request
Apr 23, 2014
Author: Michael Armbrust <[email protected]>

Closes #496 from marmbrus/javaBeanBug and squashes the following commits:

644fedd [Michael Armbrust] Bean methods must be public.

(cherry picked from commit 39f85e0)
Signed-off-by: Reynold Xin <[email protected]>
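For readers wondering why "Bean methods must be public" matters, here is a minimal sketch. It assumes bean properties are discovered through standard JavaBeans introspection (`java.beans.Introspector`), which only considers public methods. The class names `BeanVisibilityDemo`, `PublicPerson`, and `PackagePrivatePerson` and the helper `readableProperties` are hypothetical and do not come from this pull request or from the Spark example code.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.List;

public class BeanVisibilityDemo {

    // Hypothetical bean whose accessors are public: introspection sees "name".
    public static class PublicPerson {
        private String name;
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }

    // Same shape, but the accessors are package-private: introspection sees nothing.
    public static class PackagePrivatePerson {
        private String name;
        String getName() { return name; }
        void setName(String name) { this.name = name; }
    }

    // Returns the names of all properties that expose a getter visible to the Introspector.
    public static List<String> readableProperties(Class<?> beanClass) {
        try {
            // Stop at Object.class so the implicit "class" property is not reported.
            BeanInfo info = Introspector.getBeanInfo(beanClass, Object.class);
            List<String> names = new ArrayList<>();
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                if (pd.getReadMethod() != null) {
                    names.add(pd.getName());
                }
            }
            return names;
        } catch (IntrospectionException e) {
            throw new IllegalStateException("Introspection failed for " + beanClass, e);
        }
    }

    public static void main(String[] args) {
        System.out.println("public accessors:          " + readableProperties(PublicPerson.class));
        System.out.println("package-private accessors: " + readableProperties(PackagePrivatePerson.class));
    }
}
```

Under that assumption, a bean with non-public getters would appear to have no properties at all, so any schema derived from it would be empty.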
jhartlaub
pushed a commit
to jhartlaub/spark
that referenced
this pull request
May 27, 2014
Fix bug in worker clean-up in UI

Introduced in d5a96fe (/cc @aarondav). This should be picked into 0.8 and 0.9 as well. The bug causes old (zombie) workers on a node to not disappear immediately from the UI when a new one registers.

(cherry picked from commit a1cd185)
Signed-off-by: Patrick Wendell <[email protected]>
pdeyhim
pushed a commit
to pdeyhim/spark-1
that referenced
this pull request
Jun 25, 2014
Author: Michael Armbrust <[email protected]>

Closes apache#496 from marmbrus/javaBeanBug and squashes the following commits:

644fedd [Michael Armbrust] Bean methods must be public.
andrewor14
pushed a commit
to andrewor14/spark
that referenced
this pull request
Jan 8, 2015
Fix bug in worker clean-up in UI

Introduced in d5a96fe (/cc @aarondav). This should be picked into 0.8 and 0.9 as well. The bug causes old (zombie) workers on a node to not disappear immediately from the UI when a new one registers.

(cherry picked from commit a1cd185)
Signed-off-by: Patrick Wendell <[email protected]>
yifeih
added a commit
to yifeih/spark
that referenced
this pull request
Feb 25, 2019
bzhaoopenstack
pushed a commit
to bzhaoopenstack/spark
that referenced
this pull request
Sep 11, 2019
Bazel is now packed in the base images for CNCF project tests and the tests are running successfully, so we can remove the related role for installing it, which keeps the repo clean and easier to maintain. If the current way of running the tests is not good enough, we can consider other approaches, such as building Bazel in disk-image-builder or getting an official Bazel source in an Ubuntu deb, to make the whole process simpler.
arjunshroff
pushed a commit
to arjunshroff/spark
that referenced
this pull request
Nov 24, 2020
RolatZhang
pushed a commit
to RolatZhang/spark
that referenced
this pull request
Aug 15, 2022
…pache#496)

* [SPARK-34980][SQL] Support coalesce partition through union in AQE

### What changes were proposed in this pull request?

- Split the plan into several groups, where every child of a union is a new group
- Coalesce partitions for every group

### Why are the changes needed?

#### First Issue

The rule `CoalesceShufflePartitions` can only coalesce partitions if

* the leaf node is a ShuffleQueryStage
* all shuffles have the same partition number

With `Union`, this assumption might break. Say we have a plan such as

```
Union
   HashAggregate
      ShuffleQueryStage
   FileScan
```

`CoalesceShufflePartitions` cannot optimize it, and the resulting partition number would be `shuffle partition + FileScan partition`, which can be quite large. It is better to support partial optimization with `Union`.

#### Second Issue

The coalesce partition formula uses the **sum value** as the total input size, which is not friendly to union, see

```
// ShufflePartitionsUtil.coalescePartitions
val totalPostShuffleInputSize = mapOutputStatistics.flatMap(_.map(_.bytesByPartitionId.sum)).sum
```

So for a case such as:

```
Union
   HashAggregate
      ShuffleQueryStage
   HashAggregate
      ShuffleQueryStage
```

the `CoalesceShufflePartitions` rule will return an unexpected partition number.

### Does this PR introduce _any_ user-facing change?

Probably yes, the resulting partition number might change.

### How was this patch tested?

Add test.

Closes apache#32084 from ulysses-you/SPARK-34980.

Lead-authored-by: ulysses-you <[email protected]>
Co-authored-by: ulysses <[email protected]>
Co-authored-by: Wenchen Fan <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
(cherry picked from commit 0e23bd7)

* Remove unused import

Co-authored-by: ulysses-you <[email protected]>
No description provided.