[SPARK-6752][STREAMING][Revised] Allow StreamingContext to be recreated from checkpoint and existing SparkContext #6096
Conversation
Here, the whole unit test for the deleted API has been replaced by a sub-test that exercises the older API's ability to use an existing SparkContext.
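For illustration, a minimal sketch of what such a sub-test could assert, assuming the behavior described in this PR; the suite name, conf settings, and ScalaTest style are assumptions, not the actual test code changed here:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.scalatest.FunSuite

class ExistingSparkContextSuite extends FunSuite {
  test("StreamingContext picks up an existing SparkContext") {
    val conf = new SparkConf().setMaster("local[2]").setAppName("existing-sc-test")
    val sc = new SparkContext(conf)          // a SparkContext that is already active
    try {
      // With this change, constructing a StreamingContext from a SparkConf should
      // reuse the active SparkContext instead of failing with
      // "multiple active SparkContexts in the same JVM".
      val ssc = new StreamingContext(conf, Seconds(1))
      assert(ssc.sparkContext eq sc)
      ssc.stop(stopSparkContext = false)
    } finally {
      sc.stop()
    }
  }
}
```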
Merged build triggered.
Merged build started.
Test build #32535 has started for PR 6096 at commit
LGTM
Test build #32535 has finished for PR 6096 at commit
Merged build finished. Test PASSed.
Test PASSed.
Conflicts: streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala
Merged build triggered.
Merged build started.
@JoshRosen Since you took a look at the original PR #5428, could you take a look at this?
Test build #32550 has started for PR 6096 at commit
Test build #32550 has finished for PR 6096 at commit
Merged build finished. Test FAILed.
Test FAILed.
test this please.
Merged build triggered.
Merged build started.
Test build #32554 has started for PR 6096 at commit
Test build #32554 has finished for PR 6096 at commit
Merged build finished. Test FAILed.
Test FAILed.
test this please |
Merged build triggered.
Merged build started.
Test build #32556 has started for PR 6096 at commit
Test build #32556 has finished for PR 6096 at commit
Merged build finished. Test PASSed.
Test PASSed.
This seems fine to me. My only comment would be whether we need to document this behavior anywhere, or whether this behavior change might cause confusion for users. I don't think it should cause issues, though, since the old code would have just thrown a confusing "multiple active SparkContexts in the same JVM" message in these scenarios.
Yeah, this would not have been a possible code path before due to that.
[SPARK-6752][STREAMING][Revised] Allow StreamingContext to be recreated from checkpoint and existing SparkContext

This is a revision of the earlier version (see #5773) that passed the active SparkContext explicitly through a new set of Java and Scala APIs. The drawbacks are:
* Hard to implement in Python.
* New API introduced. This is even more confusing since we are introducing getActiveOrCreate in SPARK-7553.

Furthermore, there is now a direct way to get an existing active SparkContext or create a new one - SparkContext.getOrCreate(conf). It's better to use this to get the SparkContext rather than have a new API to explicitly pass the context.

So in this PR I have
* Removed the new versions of StreamingContext.getOrCreate() which took a SparkContext
* Added the ability to pick up the existing SparkContext when the StreamingContext tries to create a SparkContext

Author: Tathagata Das <[email protected]>

Closes #6096 from tdas/SPARK-6752 and squashes the following commits:
53f4b2d [Tathagata Das] Merge remote-tracking branch 'apache-github/master' into SPARK-6752
f024b77 [Tathagata Das] Removed extra API and used SparkContext.getOrCreate

(cherry picked from commit bce00da)
Signed-off-by: Tathagata Das <[email protected]>
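For readers following along, here is a hedged sketch of the usage pattern this change enables. The application name, checkpoint directory, socket source, and batch interval are illustrative and not taken from the PR:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.streaming.{Seconds, StreamingContext}

object CheckpointRecoveryExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("checkpoint-recovery")
    // A SparkContext that already exists in this JVM (for example, one shared with
    // batch jobs); getOrCreate returns the active context or creates one from conf.
    val sc = SparkContext.getOrCreate(conf)

    val checkpointDir = "/tmp/streaming-checkpoint"   // hypothetical path

    def createContext(): StreamingContext = {
      // With this change, creating a StreamingContext from a SparkConf picks up the
      // active SparkContext instead of trying to create a second one.
      val ssc = new StreamingContext(conf, Seconds(1))
      ssc.checkpoint(checkpointDir)
      val lines = ssc.socketTextStream("localhost", 9999)   // hypothetical source
      lines.count().print()
      ssc
    }

    // Recreates the StreamingContext from the checkpoint if one exists, otherwise
    // calls createContext(); in both cases the existing SparkContext is reused.
    val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
    ssc.start()
    ssc.awaitTermination()
  }
}
```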
Just curious. How are we supposed to create a
@danielwegener In the creating function that you pass to getOrCreate, you can do something like the following.
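A sketch of one way to do this, assuming the intent is to obtain the SparkContext inside the creating function via SparkContext.getOrCreate(conf); the conf, checkpoint path, and batch interval below are placeholders:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setAppName("my-streaming-app")   // hypothetical conf
val checkpointDir = "/tmp/my-checkpoint"                    // hypothetical path

val ssc = StreamingContext.getOrCreate(checkpointDir, () => {
  // SparkContext.getOrCreate returns the already-active SparkContext if there is
  // one, and only creates a new context from conf otherwise.
  val sc = SparkContext.getOrCreate(conf)
  val newSsc = new StreamingContext(sc, Seconds(10))
  newSsc.checkpoint(checkpointDir)
  newSsc
})
```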