@vanzin
Contributor

vanzin commented Apr 11, 2014

Without this, remote processes won't launch when you set SPARK_LOCAL_IP for your driver.

@AmplabJenkins

Can one of the admins verify this patch?

@mridulm
Contributor

mridulm commented Apr 11, 2014

Jenkins, test this please.

@mateiz
Contributor

mateiz commented Apr 13, 2014

Jenkins, test this please

@mateiz
Contributor

mateiz commented Apr 13, 2014

Seems like a good catch.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished.

@AmplabJenkins

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14083/

@pwendell
Contributor

Jenkins, test this please.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished. All automated tests passed.

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14087/

@vanzin
Contributor Author

vanzin commented Apr 22, 2014

Latest master doesn't seem to propagate all the SPARK_* env vars, so this bug is not present anymore.

vanzin closed this Apr 22, 2014
vanzin deleted the yarn-env branch April 24, 2014 21:42
pwendell added a commit to pwendell/spark that referenced this pull request May 12, 2014
Better error handling in Spark Streaming and more API cleanup

Earlier, errors in jobs generated by Spark Streaming (or in the generation of jobs) could not be caught from the main driver thread (i.e. the thread that called StreamingContext.start()) because they were thrown in different threads. With this change, after `ssc.start`, one can call `ssc.awaitTermination()`, which will block until the ssc is closed or there is an exception. This makes it easier to debug.

This change also adds ssc.stop(<stop-spark-context>) where you can stop StreamingContext without stopping the SparkContext.
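The pattern described above (errors raised in job threads being rethrown to the caller of `awaitTermination()`, plus a stop call that can leave the SparkContext running) can be illustrated with a small standard-library sketch. This is not Spark's implementation: `MiniStreamingContext`, `await_termination`, `stop_spark_context`, and `failing_job` are hypothetical names chosen only to mirror the description.

```python
import threading


class MiniStreamingContext:
    """Hypothetical stand-in for illustration; not Spark's StreamingContext."""

    def __init__(self):
        self._cond = threading.Condition()
        self._error = None
        self._stopped = False
        self.spark_context_stopped = False

    def start(self, job):
        # Run the job in a separate thread, as a streaming scheduler would.
        def run():
            try:
                job()
            except Exception as exc:  # capture instead of losing the error
                self._report_error(exc)

        threading.Thread(target=run, daemon=True).start()

    def _report_error(self, exc):
        with self._cond:
            if self._error is None:
                self._error = exc
            self._cond.notify_all()

    def stop(self, stop_spark_context=True):
        # Assumed semantics: the flag decides whether the underlying
        # context is stopped together with the streaming context.
        with self._cond:
            self._stopped = True
            self.spark_context_stopped = stop_spark_context
            self._cond.notify_all()

    def await_termination(self, timeout=None):
        # Block until stop() is called or a job thread reports an error,
        # then rethrow that error in the calling (main) thread.
        with self._cond:
            self._cond.wait_for(
                lambda: self._stopped or self._error is not None, timeout
            )
            if self._error is not None:
                raise self._error


def failing_job():
    raise ValueError("boom in a job thread")


caught = None
ssc = MiniStreamingContext()
ssc.start(failing_job)
try:
    ssc.await_termination(timeout=5)
except ValueError as exc:
    caught = str(exc)  # the error surfaces in the main thread

ssc2 = MiniStreamingContext()
ssc2.stop(stop_spark_context=False)
ssc2.await_termination(timeout=1)  # returns immediately, nothing to rethrow
```

The design choice sketched here is that job threads never let an exception escape silently; they hand it to a shared condition variable, and the waiting thread decides what to do with it.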

Also fixes the bug that came up with PRs apache#393 and apache#381. MetadataCleaner default value has been changed from 3500 to -1 for normal SparkContext and 3600 when creating a StreamingContext. Also, updated StreamingListenerBus with changes similar to SparkListenerBus in apache#392.

And changed a lot of protected[streaming] to private[streaming].
mccheah pushed a commit to mccheah/spark that referenced this pull request Nov 28, 2018
bzhaoopenstack pushed a commit to bzhaoopenstack/spark that referenced this pull request Sep 11, 2019
Modify the jobs-checker job because the newest ansible-lint
RolatZhang pushed a commit to RolatZhang/spark that referenced this pull request Mar 18, 2022
turboFei pushed a commit to turboFei/spark that referenced this pull request Nov 6, 2025
…inished event (apache#394)

### What changes were proposed in this pull request?

We found a race condition between lastTaskRunningTime and lastShuffleMigrationTime that could lead to a decommissioned executor exiting before all the shuffle blocks have been discovered. The issue could lead to immediate task retries right after an executor exits, and thus longer query execution time.

To fix the issue, we update lastTaskRunningTime only when a task updates its status to finished through the StatusUpdate event. This is better than the current approach (which uses a thread to check the number of running tasks every second), because this way we know clearly whether the shuffle block refresh happened after all tasks finished running, which resolves the race condition mentioned above.
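As a rough illustration of the idea, the following sketch models the exit check with logical timestamps. It is an assumption-laden toy, not the Spark code: `DecommissionTracker` and its method names are hypothetical, and only the rule "record the task time on the finished status update, then require a later complete shuffle refresh" is taken from the description above.

```python
class DecommissionTracker:
    """Hypothetical model of the exit decision; names are illustrative."""

    def __init__(self):
        self.running_tasks = 0
        self.last_task_running_time = 0
        self.last_shuffle_migration_time = None
        self.all_blocks_migrated = False

    def on_task_started(self):
        self.running_tasks += 1

    def on_status_update(self, finished, now):
        # Event-driven update: the timestamp moves only when a task
        # actually reports that it finished, not on a polling interval.
        if finished:
            self.running_tasks -= 1
            self.last_task_running_time = now

    def on_shuffle_refresh(self, all_blocks_migrated, now):
        self.all_blocks_migrated = all_blocks_migrated
        self.last_shuffle_migration_time = now

    def can_exit(self):
        # Exit only if a complete shuffle refresh happened strictly after
        # the last task finished, so no late shuffle block is missed.
        return (
            self.running_tasks == 0
            and self.all_blocks_migrated
            and self.last_shuffle_migration_time is not None
            and self.last_shuffle_migration_time > self.last_task_running_time
        )


tracker = DecommissionTracker()
tracker.on_task_started()
tracker.on_shuffle_refresh(all_blocks_migrated=True, now=1)
tracker.on_status_update(finished=True, now=2)
exit_before_refresh = tracker.can_exit()  # refresh at 1 predates finish at 2

tracker.on_shuffle_refresh(all_blocks_migrated=True, now=3)
exit_after_refresh = tracker.can_exit()  # refresh at 3 follows finish at 2
```

With a polling thread, the recorded task time can lag behind the real finish time, so a refresh that ran before the task finished might look newer than it is; tying the timestamp to the finish event removes that ambiguity in this model.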

### Why are the changes needed?

To fix a race condition that could lead to shuffle data loss and thus longer query execution time.

### How was this patch tested?

This is a very subtle race condition for which it is hard to write a unit test using the current unit test framework, and we are confident the change is low risk. It was therefore verified only by passing all the existing tests.

### Was this patch authored or co-authored using generative AI tooling?

No

Closes apache#44090 from jiangxb1987/SPARK-46182.

Authored-by: Xingbo Jiang <[email protected]>

(cherry picked from commit 6f112f7)

Signed-off-by: Dongjoon Hyun <[email protected]>
Co-authored-by: Xingbo Jiang <[email protected]>