Yarn: do not set local IP in remote process environment. #394
Conversation
Can one of the admins verify this patch?

Jenkins, test this please.

Jenkins, test this please.

Seems like a good catch.

Merged build triggered.

Merged build started.

Merged build finished.

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14083/

Jenkins, test this please.

Merged build triggered.

Merged build started.

Merged build finished. All automated tests passed.

All automated tests passed.
Latest master doesn't seem to propagate all the SPARK_* env vars, so this bug is not present anymore.
Better error handling in Spark Streaming and more API cleanup

Earlier, errors in jobs generated by Spark Streaming (or in the generation of jobs) could not be caught from the main driver thread (i.e. the thread that called StreamingContext.start()), as they would be thrown in different threads. With this change, after `ssc.start()`, one can call `ssc.awaitTermination()`, which will block until the ssc is stopped or an exception occurs. This makes it easier to debug.

This change also adds `ssc.stop(<stop-spark-context>)`, which stops the StreamingContext without stopping the SparkContext. It also fixes the bug that came up with PRs apache#393 and apache#381. The MetadataCleaner default value has been changed from 3500 to -1 for a normal SparkContext and to 3600 when creating a StreamingContext. StreamingListenerBus has been updated with changes similar to SparkListenerBus in apache#392, and many protected[streaming] members have been changed to private[streaming].
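As an illustration of the API described above, here is a minimal sketch of how `awaitTermination()` and `stop(stopSparkContext = false)` would typically be used. The socket source, host, and port are illustrative assumptions, not part of this PR:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamingShutdownExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("StreamingShutdownExample")
    val ssc = new StreamingContext(conf, Seconds(1))

    // Hypothetical input source; any DStream would do here.
    val lines = ssc.socketTextStream("localhost", 9999)
    lines.count().print()

    ssc.start()
    try {
      // Blocks until the context is stopped or a streaming job throws,
      // so errors surface on the driver thread that called start().
      ssc.awaitTermination()
    } finally {
      // Stop the StreamingContext but keep the SparkContext alive.
      ssc.stop(stopSparkContext = false)
    }
  }
}
```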
…inished event (apache#394)

### What changes were proposed in this pull request?
We found a race condition between lastTaskRunningTime and lastShuffleMigrationTime that could lead to a decommissioned executor exiting before all the shuffle blocks have been discovered. The issue could lead to an immediate task retry right after an executor exit, and thus longer query execution time.

To fix the issue, we choose to update the lastTaskRunningTime only when a task updates its status to finished through the StatusUpdate event. This is better than the current approach (which uses a thread to check the number of running tasks every second), because this way we clearly know whether the shuffle block refresh happened after all tasks finished running, thus resolving the race condition mentioned above.

### Why are the changes needed?
To fix a race condition that could lead to shuffle data loss, and thus longer query execution time.

### How was this patch tested?
This is a very subtle race condition that is hard to write a unit test for with the current unit test framework, and we are confident the change is low risk. Thus it is only verified by passing all the existing tests.

### Was this patch authored or co-authored using generative AI tooling?
No

Closes apache#44090 from jiangxb1987/SPARK-46182.

Authored-by: Xingbo Jiang <[email protected]>
(cherry picked from commit 6f112f7)
Signed-off-by: Dongjoon Hyun <[email protected]>
Co-authored-by: Xingbo Jiang <[email protected]>
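A rough sketch of the idea behind that fix, with hypothetical class and method names (this is not the actual Spark executor decommissioning code, only an illustration of driving the timestamp from the task-finished StatusUpdate rather than from a polling thread):

```scala
import java.util.concurrent.atomic.AtomicLong

// Illustrative sketch only; names are hypothetical, not Spark internals.
class DecommissionTracker {
  // Timestamp of the most recent task completion, updated exactly once
  // per finished task instead of by a once-per-second polling thread.
  private val lastTaskFinishTime = new AtomicLong(System.nanoTime())
  @volatile private var lastShuffleMigrationTime: Long = System.nanoTime()

  // Called from the StatusUpdate handler when a task reports FINISHED.
  def onTaskFinished(): Unit = {
    lastTaskFinishTime.set(System.nanoTime())
  }

  // Called after a round of shuffle block migration completes.
  def onShuffleMigrationRound(): Unit = {
    lastShuffleMigrationTime = System.nanoTime()
  }

  // Safe to exit only if shuffle blocks were refreshed *after* the last
  // task finished, so no freshly written blocks can be missed.
  def canExit(runningTasks: Int): Boolean = {
    runningTasks == 0 && lastShuffleMigrationTime > lastTaskFinishTime.get()
  }
}
```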
Without this, remote processes won't launch when you set SPARK_LOCAL_IP for your driver.
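A minimal sketch of the idea behind this patch, assuming a hypothetical helper that builds the environment for remote YARN containers (the names below are illustrative, not the actual Spark YARN client code):

```scala
import scala.collection.mutable

object RemoteEnvSketch {
  // Propagate SPARK_* settings to remote containers, but never the
  // driver's SPARK_LOCAL_IP: the driver's local IP is meaningless
  // (and harmful) on other hosts.
  def buildContainerEnv(driverEnv: Map[String, String]): Map[String, String] = {
    val env = mutable.Map[String, String]()
    driverEnv.foreach { case (key, value) =>
      if (key.startsWith("SPARK_") && key != "SPARK_LOCAL_IP") {
        env(key) = value
      }
    }
    env.toMap
  }

  def main(args: Array[String]): Unit = {
    val driverEnv = Map(
      "SPARK_LOCAL_IP" -> "10.0.0.5",  // hypothetical driver-only setting
      "SPARK_JAVA_OPTS" -> "-Xmx2g"
    )
    println(buildContainerEnv(driverEnv))  // SPARK_LOCAL_IP is dropped
  }
}
```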