
Conversation

@aarondav
Contributor

This causes an unrecoverable error for applications that run for longer than
7 days and have jars added to the SparkContext, as the jars are cleaned up
even though the application is still running.
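
For context, the cleanup in question is the standalone worker's periodic sweep of its work/ directory, governed by spark.worker.cleanup.enabled, spark.worker.cleanup.interval, and spark.worker.cleanup.appDataTtl (the last defaulting to 7 days, hence the symptom above). A minimal Scala sketch of the pre-patch behavior, using hypothetical names rather than the actual Worker.scala code:

import java.io.File

// Sketch of the old default: remove any app dir under work/ whose
// last-modified time is older than the TTL (7 days), with no check
// for whether the application is still running.
def cleanupWorkDir(workDir: File, ttlMs: Long): Unit = {
  val cutoff = System.currentTimeMillis() - ttlMs
  val appDirs = Option(workDir.listFiles()).getOrElse(Array.empty[File])
  for (appDir <- appDirs if appDir.isDirectory && appDir.lastModified() < cutoff) {
    // This is what deletes the jars a week-old running app still needs.
    deleteRecursively(appDir)
  }
}

// Naive recursive delete, just to keep the sketch self-contained.
def deleteRecursively(f: File): Unit = {
  Option(f.listFiles()).getOrElse(Array.empty[File]).foreach(deleteRecursively)
  f.delete()
}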

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@pwendell
Contributor

Sure - might be good to have it off by default. /cc @velvia.

@pwendell
Contributor

@aarondav what about just not cleaning up the data if the app is still running? In the future we should probably assess the TTL based on the finish time of the app, not the start time.
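
One way to realize that suggestion (a sketch under assumed bookkeeping; runningAppIds and finishTimeMs are hypothetical inputs, not existing Worker state):

import java.io.File

// Sketch: skip running apps entirely, and measure the TTL from the app's
// finish time rather than its start time (falling back to the dir mtime).
def shouldCleanup(appDir: File,
                  runningAppIds: Set[String],
                  finishTimeMs: Map[String, Long],
                  ttlMs: Long): Boolean = {
  val appId = appDir.getName
  if (runningAppIds.contains(appId)) {
    false // never delete the work dir of a live application
  } else {
    val finishedAt = finishTimeMs.getOrElse(appId, appDir.lastModified())
    System.currentTimeMillis() - finishedAt > ttlMs
  }
}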

@AmplabJenkins

Merged build finished. All automated tests passed.

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15035/

@aarondav
Contributor Author

This patch is intended as a hotfix in the hopes that it can make it into the 1.0 release. Avoiding cleaning up running applications seems like the better solution in general, but is out of scope of this PR.

@pwendell
Contributor

I agree, okay let's just pull in this fix and we can hopefully patch the bigger issue later.

@andrewor14
Contributor

LGTM

@pwendell
Contributor

btw - regarding the branch name - I don't think this was too shitty of a default. I'd actually like to have this on by default if we can get it into working order, because otherwise users will only find out once it's too late that they are out of disk space :P

asfgit closed this in bb98eca May 16, 2014
asfgit pushed a commit that referenced this pull request May 16, 2014
This causes an unrecoverable error for applications that are running for longer
than 7 days that have jars added to the SparkContext, as the jars are cleaned up
even though the application is still running.

Author: Aaron Davidson <[email protected]>

Closes #800 from aarondav/shitty-defaults and squashes the following commits:

a573fbb [Aaron Davidson] SPARK-1860: Do not cleanup application work/ directories by default
(cherry picked from commit bb98eca)

Signed-off-by: Patrick Wendell <[email protected]>
@velvia
Contributor

velvia commented May 16, 2014

@aarondav @pwendell I agree we don't want to clean up currently running apps, but also that this should default to on once it's fixed. Maybe it's as simple as checking the last-modified time of the directory.

@aarondav
Contributor Author

Just a little miffed because it took some time to figure out why our executors suddenly started failing with jar-not-found errors :)

I'd prefer a full solution; last modified time runs into an issue if the executor lies dormant for a week. You might say, "that's unlikely", but I'd say, "it'll happen to someone, and they'll be a little miffed." The worker should have enough state to figure out which executors are currently active, though I'm not sure if the problem is made more difficult by multi-worker scenarios.
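
The state the worker does have is its table of live executors; a sketch of a cleanup filter built on it (the shape of liveExecutorAppIds is assumed, derived in practice from the worker's executor bookkeeping):

import java.io.File

// Sketch: collect the app IDs that still have a live executor on this
// worker and exclude their directories from the TTL sweep.
def dirsSafeToClean(workDir: File,
                    liveExecutorAppIds: Set[String],
                    ttlMs: Long): Seq[File] = {
  val cutoff = System.currentTimeMillis() - ttlMs
  Option(workDir.listFiles()).getOrElse(Array.empty[File]).toSeq
    .filter(_.isDirectory)
    .filterNot(dir => liveExecutorAppIds.contains(dir.getName))
    .filter(_.lastModified() < cutoff)
}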

aarondav deleted the shitty-defaults branch May 16, 2014 16:59
pdeyhim pushed a commit to pdeyhim/spark-1 that referenced this pull request Jun 25, 2014
turboFei pushed a commit to turboFei/spark that referenced this pull request Nov 6, 2025
[HADP-58798] Add utility functions to facilitate logging spark diagnose message (apache#800)