@tdas
Contributor

@tdas tdas commented Apr 8, 2015

Currently, if you want to create a StreamingContext from checkpoint information, the system creates a new SparkContext. This prevents a StreamingContext from being recreated from checkpoints in managed environments where the SparkContext is precreated.

The solution in this PR: introduce the following methods on StreamingContext

  1. new StreamingContext(checkpointDirectory, sparkContext)
    Recreates the StreamingContext from the checkpoint using the provided SparkContext.
  2. StreamingContext.getOrCreate(checkpointDirectory, sparkContext, createFunction: SparkContext => StreamingContext)
    If a checkpoint file exists, recreates the StreamingContext using the provided SparkContext (that is, calls 1.); otherwise creates the StreamingContext using the provided createFunction.

TODO: the corresponding Java and Python APIs have to be added as well.
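
The control flow of the two methods above can be modelled roughly as follows. This is only a sketch with stand-in types: `FakeSparkContext`, `FakeStreamingContext`, `GetOrCreateSketch`, and the "any file in the directory counts as a checkpoint" check are all hypothetical and are not Spark's real classes or its real checkpoint detection.

```scala
import java.nio.file.{Files, Path}

// Hypothetical stand-ins for SparkContext and StreamingContext (not Spark's classes).
final case class FakeSparkContext(appName: String)
final case class FakeStreamingContext(sc: FakeSparkContext, restored: Boolean)

object GetOrCreateSketch {
  // Assumption: any entry in the directory counts as checkpoint data.
  private def checkpointExists(dir: Path): Boolean =
    Files.isDirectory(dir) && {
      val entries = Files.list(dir)
      try entries.findAny().isPresent finally entries.close()
    }

  // Models method 1: rebuild from the checkpoint, reusing the given context.
  def fromCheckpoint(dir: Path, sc: FakeSparkContext): FakeStreamingContext =
    FakeStreamingContext(sc, restored = true)

  // Models method 2: restore if a checkpoint exists, else call createFunction.
  def getOrCreate(
      dir: Path,
      sc: FakeSparkContext,
      createFunction: FakeSparkContext => FakeStreamingContext): FakeStreamingContext =
    if (checkpointExists(dir)) fromCheckpoint(dir, sc) else createFunction(sc)
}
```

In both branches the caller's context is the one that ends up inside the result, which is the property a managed environment with a precreated SparkContext needs.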

@tdas tdas changed the title [Spark-6752] [Spark-6752][Streaming] Allow StreamingContext to be recreated from checkpoint and existing SparkContext Apr 8, 2015
@tdas tdas changed the title [Spark-6752][Streaming] Allow StreamingContext to be recreated from checkpoint and existing SparkContext [SPARK-6752][Streaming] Allow StreamingContext to be recreated from checkpoint and existing SparkContext Apr 8, 2015
@SparkQA

SparkQA commented Apr 8, 2015

Test build #29898 has started for PR 5428 at commit 36a7823.

@SparkQA

SparkQA commented Apr 9, 2015

Test build #29898 has finished for PR 5428 at commit 36a7823.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
  • This patch does not change any dependencies.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29898/
Test PASSed.

@tdas
Contributor Author

tdas commented Apr 9, 2015

@jerryshao Mind taking a look at this? It's still WIP, as the unit tests are commented out.

@zzcclp
Contributor

zzcclp commented Apr 9, 2015

@tdas , can this PR resolve this issue?
Restart a streaming app from checkpoint incorrectly if using accumulators.

@jerryshao
Contributor

@tdas Yeah, will do.

@zzcclp I'm not sure; maybe you can give it a try. My guess is that this could work, since accumulators are registered in the SparkContext and the SparkContext still exists.

Contributor

Would it be better to switch to string interpolation style?

Contributor Author

Yes, I will. Thanks!
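
For reference, the style change being requested looks roughly like this. The message text and the names `InterpolationSketch`, `path`, and `attempts` are made up for illustration; the line under review is not shown in this thread.

```scala
object InterpolationSketch {
  // Concatenation style: harder to scan as the message grows.
  def concatenated(path: String, attempts: Int): String =
    "Reading checkpoint from " + path + " failed after " + attempts + " attempts"

  // Interpolation style: same output, values are embedded in the literal.
  def interpolated(path: String, attempts: Int): String =
    s"Reading checkpoint from $path failed after $attempts attempts"
}
```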

Contributor

Do you think that we should log a warning message in the case where we ignore the error?

@jerryshao
Contributor

It looks good to me. I'm just curious about the usage scenario: is there any situation where the StreamingContext has failed but the SparkContext still exists after a driver failure?

Contributor

This change seems unrelated to this fix. I think we can simply do a null check inside this method and create a FileSystem if needed to avoid unnecessary changes to the calls (all the fs being passed in changing to Some(fs)) -- keeps git history sane.

Contributor Author

I added this so that we do not have to handle nulls. Dealing with nulls is severely frowned upon in Scala, which is precisely why Option was introduced. There are many places where nulls are still handled, and I have been slowly fixing those. I think this is a small enough change (it doesn't change functionality or existing code paths) that it is okay to do this.
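
To illustrate the trade-off being discussed, here is a small sketch of an optional file-system argument expressed with Option instead of a nullable parameter. `Fs`, `CheckpointReadSketch`, `defaultFs`, and `read` are hypothetical names, not the actual signatures touched by this PR.

```scala
// Hypothetical stand-in for a file-system handle.
final case class Fs(name: String)

object CheckpointReadSketch {
  // Assumed fallback used when the caller does not supply a file system.
  def defaultFs(): Fs = Fs("default")

  // Option makes "no file system supplied" explicit in the type,
  // so the method body needs no null check.
  def read(dir: String, fs: Option[Fs] = None): String = {
    val resolved = fs.getOrElse(defaultFs())
    s"reading $dir via ${resolved.name}"
  }
}
```

The cost raised in the review is visible at the call sites: callers that already hold a file system have to wrap it, as in `read(dir, Some(fs))`.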

Contributor Author

BTW, this file has to change anyway as part of the attempt to make the semantics of read clearer.

@zzcclp
Contributor

zzcclp commented Apr 10, 2015

@jerryshao , thanks, I will test this case later.

@tdas
Contributor Author

tdas commented Apr 15, 2015

@zzcclp I don't think it will solve this issue directly. But it may allow the SparkContext to be re-initialized properly before the StreamingContext is recreated from checkpoints.

@tdas
Contributor Author

tdas commented Apr 15, 2015

@ALL This is still a WIP. Adding the equivalent Java API requires refactoring the existing JavaStreamingContext.getOrCreate to use o.a.s.api.java.function.Function0 (which needs to be added) instead of JavaStreamingContextFactory. I am going to open other PRs for them before this can be merged.

@SparkQA

SparkQA commented Apr 16, 2015

Test build #30389 has started for PR 5428 at commit eabd092.

@tdas
Contributor Author

tdas commented Apr 16, 2015

@JoshRosen Please take a quick look at the Function.
@jerryshao @harishreedharan I have updated the patch with Java API and unit tests. I think I am going to create a separate JIRA for python API.

Contributor Author

Some of these are unrelated to the PR; they just clean up the formatting of the JavaAPISuite, which is quite badly formatted.

@SparkQA

SparkQA commented Apr 16, 2015

Test build #30389 has finished for PR 5428 at commit eabd092.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
  • This patch does not change any dependencies.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30389/
Test FAILed.

@tdas
Contributor Author

tdas commented Apr 16, 2015

Jenkins, test this again.

@tdas
Contributor Author

tdas commented Apr 20, 2015

Jenkins, test this.

@SparkQA

SparkQA commented Apr 21, 2015

Test build #691 has started for PR 5428 at commit eabd092.

@SparkQA

SparkQA commented Apr 22, 2015

Test build #691 has finished for PR 5428 at commit eabd092.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
  • This patch does not change any dependencies.

@SparkQA

SparkQA commented Apr 22, 2015

Test build #692 has started for PR 5428 at commit eabd092.

@zzcclp
Contributor

zzcclp commented Apr 22, 2015

@tdas , I tested recovering a streaming app from a checkpoint with this PR. It failed when the app uses accumulators, so this PR definitely can't solve SPARK-5206 directly. How can SPARK-5206 be solved?

@tdas
Contributor Author

tdas commented Apr 22, 2015

Yes, this is not intended to solve SPARK-5206.

@SparkQA

SparkQA commented Apr 22, 2015

Test build #30743 has started for PR 5428 at commit 524f519.

@SparkQA

SparkQA commented Apr 22, 2015

Test build #30743 has finished for PR 5428 at commit 524f519.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
  • This patch does not change any dependencies.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30743/
Test FAILed.

@SparkQA

SparkQA commented Apr 22, 2015

Test build #30751 has started for PR 5428 at commit 94db63c.

@SparkQA

SparkQA commented Apr 22, 2015

Test build #30751 has finished for PR 5428 at commit 94db63c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
  • This patch does not change any dependencies.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30751/
Test FAILed.

@SparkQA

SparkQA commented Apr 22, 2015

Test build #693 has started for PR 5428 at commit 94db63c.

@SparkQA

SparkQA commented Apr 22, 2015

Test build #693 has finished for PR 5428 at commit 94db63c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
  • This patch does not change any dependencies.

@SparkQA

SparkQA commented Apr 22, 2015

Test build #694 has started for PR 5428 at commit 94db63c.

@JoshRosen
Contributor

LGTM pending Jenkins.

@SparkQA

SparkQA commented Apr 23, 2015

Test build #695 has started for PR 5428 at commit 94db63c.

@SparkQA

SparkQA commented Apr 23, 2015

Test build #695 has finished for PR 5428 at commit 94db63c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
  • This patch does not change any dependencies.

@SparkQA

SparkQA commented Apr 23, 2015

Test build #696 has started for PR 5428 at commit 94db63c.

@SparkQA

SparkQA commented Apr 23, 2015

Test build #697 has started for PR 5428 at commit 94db63c.

@SparkQA

SparkQA commented Apr 23, 2015

Test build #697 has finished for PR 5428 at commit 94db63c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
  • This patch does not change any dependencies.

@tdas
Contributor Author

tdas commented Apr 23, 2015

Merging this. Thanks Josh.

@asfgit asfgit closed this in 534f2a4 Apr 23, 2015
@zzcclp
Contributor

zzcclp commented Apr 27, 2015

Hi @tdas , why was this PR reverted?

@tdas
Contributor Author

tdas commented Apr 29, 2015

This PR was reverted because I had used MutableBoolean which does not seem to work well with Hadoop 1.0.4. I reopened the PR in #5773.
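
A rough sketch of the kind of flag involved: `java.util.concurrent.atomic.AtomicBoolean` ships with the JDK, so using it avoids relying on a library class for a mutable boolean that is set from inside a closure. The names `CreatedFlagSketch`, `track`, and `newContextCreated` are illustrative assumptions, not code from this PR.

```scala
import java.util.concurrent.atomic.AtomicBoolean

object CreatedFlagSketch {
  // Returns the value plus whether the create function was actually invoked.
  def track[T](existing: Option[T])(create: () => T): (T, Boolean) = {
    // JDK-provided mutable flag that can be updated from inside the closure.
    val newContextCreated = new AtomicBoolean(false)
    val result = existing.getOrElse {
      newContextCreated.set(true)
      create()
    }
    (result, newContextCreated.get())
  }
}
```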

asfgit pushed a commit that referenced this pull request Apr 29, 2015
…eated from checkpoint and existing SparkContext

Original PR #5428 got reverted due to issues between MutableBoolean and Hadoop 1.0.4 (see JIRA). This replaces MutableBoolean with AtomicBoolean.

srowen pwendell

Author: Tathagata Das <[email protected]>

Closes #5773 from tdas/SPARK-6752 and squashes the following commits:

a0c0ead [Tathagata Das] Fix for hadoop 1.0.4
70ae85b [Tathagata Das] Merge remote-tracking branch 'apache-github/master' into SPARK-6752
94db63c [Tathagata Das] Fix long line.
524f519 [Tathagata Das] Many changes based on PR comments.
eabd092 [Tathagata Das] Added Function0, Java API and unit tests for StreamingContext.getOrCreate
36a7823 [Tathagata Das] Minor changes.
204814e [Tathagata Das] Added StreamingContext.getOrCreate with existing SparkContext
@tdas
Contributor Author

tdas commented May 12, 2015

I have revised the implementation of this PR in a follow-up PR, #6096.
