
Conversation

@holdenk
Contributor

@holdenk holdenk commented Sep 25, 2015

While this is likely not a huge issue for real production systems, it can be a problem for test systems that set up a Spark Context, tear it down, and then stand up another Spark Context with a different master (e.g. some tests in local mode and some in yarn mode). Discovered during work on spark-testing-base against Spark 1.4.1, but the logic that triggers it appears to be present in master as well (see the SparkHadoopUtil object). A valid workaround for users encountering this issue is to fork a separate JVM, but that can be heavyweight.

```
[info] SampleMiniClusterTest:
[info] Exception encountered when attempting to run a suite with class name: com.holdenkarau.spark.testing.SampleMiniClusterTest *** ABORTED ***
[info] java.lang.ClassCastException: org.apache.spark.deploy.SparkHadoopUtil cannot be cast to org.apache.spark.deploy.yarn.YarnSparkHadoopUtil
[info] at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil$.get(YarnSparkHadoopUtil.scala:163)
[info] at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:257)
[info] at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:561)
[info] at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:115)
[info] at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:57)
[info] at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:141)
[info] at org.apache.spark.SparkContext.<init>(SparkContext.scala:497)
[info] at com.holdenkarau.spark.testing.SharedMiniCluster$class.setup(SharedMiniCluster.scala:186)
[info] at com.holdenkarau.spark.testing.SampleMiniClusterTest.setup(SampleMiniClusterTest.scala:26)
[info] at com.holdenkarau.spark.testing.SharedMiniCluster$class.beforeAll(SharedMiniCluster.scala:103)
```
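
As a point of reference, the fork workaround can be expressed in sbt; a minimal sketch, assuming an sbt 0.13-era build.sbt (not part of this patch):

```scala
// build.sbt: fork a separate JVM for tests so per-JVM state such as the
// SPARK_YARN_MODE flag cannot leak between test suites.
fork in Test := true
```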

@SparkQA

SparkQA commented Sep 25, 2015

Test build #42993 has finished for PR 8911 at commit 1915e7d.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

LGTM. Only trivial things to suggest, like a space after 'try'.

@SparkQA

SparkQA commented Sep 25, 2015

Test build #43031 has started for PR 8911 at commit f97ec06.

@vanzin
Contributor

vanzin commented Sep 25, 2015

There's code in several places that sets SPARK_YARN_MODE, but no code that unsets it. So, to follow your example, if you start a context with yarn-client and later start another one in standalone mode, the latter will use the YARN version of the utils class.

Other than that, looks sane. This is another piece of code that will need some serious thought when trying to fix the "Spark doesn't allow multiple contexts" issue, though.
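
For context, a simplified sketch of the selection logic being discussed, paraphrased rather than copied from the Spark source (names and structure are approximate):

```scala
import org.apache.spark.deploy.SparkHadoopUtil

object HadoopUtilSelectionSketch {
  // The flag can arrive as a JVM system property or an environment variable;
  // Boolean.parseBoolean(null) is false, so "unset" means non-YARN mode.
  private def yarnMode: Boolean =
    java.lang.Boolean.parseBoolean(
      System.getProperty("SPARK_YARN_MODE", System.getenv("SPARK_YARN_MODE")))

  // Because callers set the flag but nothing unsets it, a context started
  // later in the same JVM with a different master can end up with the
  // wrong variant of the utils class.
  def get: SparkHadoopUtil =
    if (yarnMode) {
      // Loaded reflectively so spark-core does not depend on the yarn module.
      Class.forName("org.apache.spark.deploy.yarn.YarnSparkHadoopUtil")
        .newInstance().asInstanceOf[SparkHadoopUtil]
    } else {
      new SparkHadoopUtil
    }
}
```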

@holdenk
Contributor Author

holdenk commented Sep 25, 2015

@vanzin so in my own code (where I do switch between yarn and non-yarn mode) I clear SPARK_YARN_MODE, as done in the test.

I could update SparkContext to explicitly clear SPARK_YARN_MODE if it's being launched with a non-yarn master, if you think that would be helpful for people?
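
A hedged sketch of that test-side workaround; master strings and app names are placeholders, and it assumes Spark 1.x where the flag lives in a JVM system property:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object SwitchMastersExample {
  def main(args: Array[String]): Unit = {
    // A yarn-client context followed by a local-mode context in the same JVM;
    // running the YARN half of course needs a reachable cluster.
    val yarnSc = new SparkContext(
      new SparkConf().setMaster("yarn-client").setAppName("yarn-phase"))
    yarnSc.stop()

    // Manual workaround: clear the flag set during the YARN run so the next,
    // non-YARN context does not pick up the YARN utils.
    System.clearProperty("SPARK_YARN_MODE")

    val localSc = new SparkContext(
      new SparkConf().setMaster("local[2]").setAppName("local-phase"))
    localSc.stop()
  }
}
```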

@vanzin
Contributor

vanzin commented Sep 25, 2015

Yeah, having the Spark code clean up after itself is easier because it means people don't have to remember to do it, and it doesn't need to be documented.

@holdenk
Contributor Author

holdenk commented Sep 25, 2015

Makes sense. Do you think I should put that change in SparkContext (on startup of a non-yarn client, or on stop of any client) or in the yarn client's stop code?

@vanzin
Contributor

vanzin commented Sep 25, 2015

I did a cursory search for where it is set, and I think the places that need to be changed are SparkContext.stop() and YARN's Client.scala.

Doing it during SparkContext creation is an option, but feels a little weird; in that case I'd rather set it to 1 or 0 (or some other boolean value) to indicate whether it's running in YARN mode, but that would be a much bigger change.
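
A minimal sketch of what that cleanup might look like, assuming the flag is carried as a JVM system property; the helper name is hypothetical and the merged patch may differ in detail:

```scala
object YarnModeCleanup {
  // Hypothetical helper; in Spark itself the equivalent call would sit at the
  // end of SparkContext.stop() and in the YARN Client's shutdown path, so a
  // context created afterwards with a non-YARN master gets the plain
  // SparkHadoopUtil rather than the YARN variant.
  def clearYarnMode(): Unit = {
    System.clearProperty("SPARK_YARN_MODE")
  }
}
```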

@holdenk
Contributor Author

holdenk commented Sep 25, 2015

@vanzin so it seems like doing it in SparkContext shutdown should be sufficient for all cases?

@vanzin
Contributor

vanzin commented Sep 25, 2015

I don't think so. Client.scala, for better or for worse, is still a public API. So you can submit a yarn-cluster job by calling Client.scala directly, and that would leave SPARK_YARN_MODE set.

@holdenk
Contributor Author

holdenk commented Sep 25, 2015

Ah, that makes sense; I guess I forgot the Client was a public API.

@holdenk holdenk changed the title [SPARK-10812][YARN][WIP] Spark hadoop util support switching to yarn [SPARK-10812][YARN] Spark hadoop util support switching to yarn Sep 25, 2015
Contributor

To be super paranoid, I'd do this before the previous line.

@vanzin
Contributor

vanzin commented Sep 25, 2015

LGTM aside from two minor things.

Contributor

While we're here, comparing ...getClass === classOf[...] would be better IMO. Fewer magic strings.
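
For illustration, a hedged sketch of what such an assertion could look like in a ScalaTest suite (the suite name is hypothetical and the actual test in this patch may differ):

```scala
import org.apache.spark.deploy.SparkHadoopUtil
import org.apache.spark.deploy.yarn.YarnSparkHadoopUtil
import org.scalatest.FunSuite

class SparkHadoopUtilSelectionSuite extends FunSuite {
  test("yarn mode selects the YARN SparkHadoopUtil") {
    // Compare classes directly rather than class-name strings: no magic
    // strings, and a rename is caught at compile time.
    assert(SparkHadoopUtil.get.getClass === classOf[YarnSparkHadoopUtil])
  }
}
```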

@SparkQA

SparkQA commented Sep 26, 2015

Test build #43035 has finished for PR 8911 at commit d9ca925.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@holdenk
Contributor Author

holdenk commented Sep 27, 2015

@vanzin updated with the suggested changes :)

@vanzin
Contributor

vanzin commented Sep 28, 2015

LGTM, merging to master. Thanks!

@asfgit asfgit closed this in d8d50ed Sep 28, 2015
asfgit pushed a commit that referenced this pull request Oct 22, 2015
Author: Holden Karau <[email protected]>

Closes #8911 from holdenk/SPARK-10812-spark-hadoop-util-support-switching-to-yarn.

(cherry picked from commit d8d50ed)
@stevenmanton

I'm running into this issue when running tests using pytest on pyspark with version 1.4.1. Is there a workaround I can use in pyspark in the meantime before we're able to upgrade to 1.5.2/1.6 to benefit from this fix?

@holdenk
Contributor Author

holdenk commented Dec 10, 2015

@stevenmanton that question probably belongs more on the user list - but I'd say maybe just don't use yarn mode for your tests.

@stevenmanton

Thanks @holdenk. It ended up being a simple fix. I'll follow up with the mailing list for any other questions.

ashangit pushed a commit to ashangit/spark that referenced this pull request Oct 19, 2016
Author: Holden Karau <[email protected]>

Closes apache#8911 from holdenk/SPARK-10812-spark-hadoop-util-support-switching-to-yarn.