
Conversation

@andrewor14 (Contributor)

This is an effort to bring the Windows scripts up to speed after recent splashing changes in #1845.

Note that this is still currently broken. There is an issue with using SparkSubmitDriverBootstrapper on Windows: the stdin is not being picked up properly by the SparkSubmit subprocess. This must be fixed before the PR is merged.
@andrewor14 (Contributor, Author)

This is not working yet. See commit message andrewor14@83ebe60 for more detail.

@andrewor14 (Contributor, Author)

This should be fixed in commit andrewor14@f97daa2.

@andrewor14 (Contributor, Author)

JK, this was actually fixed in andrewor14@35caecc.

@SparkQA commented Aug 26, 2014

QA tests have started for PR 2129 at commit 83ebe60.

  • This patch merges cleanly.

@SparkQA commented Aug 26, 2014

QA tests have finished for PR 2129 at commit 83ebe60.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • rem In this case, leave out the main class (org.apache.spark.deploy.SparkSubmit) and use our own.

It turns out that java.lang.Process reads directly from the parent process's stdin on Windows. This means we should avoid spawning a thread that also attempts to redirect System.in to the subprocess (in vain) and contends with the subprocess in reading System.in.

This raises an issue with knowing when to terminate the JVM in the PySpark shell, however, where Java itself is a Python subprocess. We previously relied on the Java process killing itself on a broken pipe, but this mechanism is not available on Windows since we no longer read from System.in for the EOF. Instead, in this environment we rely on Python's shutdown hook to kill the child process.

Previously we only killed the surface-level "spark-submit.cmd" command. We need to go all the way and kill its children too.
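For context, here is a minimal hypothetical sketch of the process-tree kill described above. This is not code from this PR; the PID argument is assumed to be that of the top-level spark-submit.cmd process.

```bat
@echo off
rem Hypothetical sketch, not code from this PR: kill a process and all of its
rem children on Windows. /T walks the child-process tree and /F forces termination.
rem %1 is assumed to be the PID of the top-level spark-submit.cmd process.
taskkill /F /T /PID %1
```

Without /T, only the top-level cmd process would be terminated and the Java subprocess would keep running, which is exactly the problem described above.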
@andrewor14 changed the title from "[WIP][SPARK-3167] Handle special driver configs in Windows" to "[SPARK-3167] Handle special driver configs in Windows" on Aug 27, 2014
@andrewor14 (Contributor, Author)

Windows, test this please.

This was simply missing in the existing code.

If you `set` a variable in a conditional block, you have to use !VAR! instead of %VAR% after enabling delayed expansion. Don't ask.
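For readers less familiar with cmd scripting, here is a minimal hypothetical demonstration of that pitfall (not taken from this PR):

```bat
@echo off
rem Hypothetical sketch, not code from this PR: why !VAR! is needed inside a block.
setlocal enabledelayedexpansion
set MESSAGE=before
if 1==1 (
  set MESSAGE=after
  rem %MESSAGE% was substituted when the whole parenthesized block was parsed,
  rem so this line still prints "before".
  echo immediate expansion: %MESSAGE%
  rem !MESSAGE! is substituted at execution time, so this line prints "after".
  echo delayed expansion:   !MESSAGE!
)
endlocal
```

%VAR% inside an if (...) block is expanded when the block is parsed, before it runs, so a value assigned by `set` inside the block is only visible through !VAR!, which requires delayed expansion to be enabled.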
@SparkQA commented Aug 27, 2014

QA tests have started for PR 2129 at commit 72004c2.

  • This patch merges cleanly.

@SparkQA commented Aug 27, 2014

QA tests have finished for PR 2129 at commit 72004c2.

  • This patch fails unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • rem In this case, leave out the main class (org.apache.spark.deploy.SparkSubmit) and use our own.

@andrewor14 (Contributor, Author)

test this please

@SparkQA commented Aug 27, 2014

QA tests have started for PR 2129 at commit afcffea.

  • This patch merges cleanly.

@SparkQA commented Aug 27, 2014

QA tests have finished for PR 2129 at commit afcffea.

  • This patch fails unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • rem In this case, leave out the main class (org.apache.spark.deploy.SparkSubmit) and use our own.

@andrewor14 (Contributor, Author)

Hey Jenkins, test this please

@andrewor14 (Contributor, Author)

The !VAR_NAME! syntax is explained here: andrewor14@72004c2. If we used %VAR_NAME%, this wouldn't pick up the latest value set on line 76, because we're inside a conditional block.

@SparkQA commented Aug 27, 2014

QA tests have started for PR 2129 at commit 22b1acd.

  • This patch merges cleanly.

@andrewor14 (Contributor, Author)

-- Update --

As of the latest commit, I have tested this with setting the --driver-* options and the corresponding spark.driver.* configs. The order of precedence is the same as that established in #1845, i.e. the command line arguments override the values in the properties file. I have also verified that all subprocesses terminate after the parent process exits in both Scala and Python.

This is more or less ready from my side, though others should definitely review this. @JoshRosen It would be great if you can test this too.
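To make the precedence described above concrete, here is a hypothetical invocation (the class name, jar path, and values are made up for illustration; only the flags and config keys are real):

```bat
rem Hypothetical example, not from this PR. Assume conf\spark-defaults.conf contains:
rem     spark.driver.memory            2g
rem     spark.driver.extraJavaOptions  -Dmy.setting=from-conf
rem The command-line flags below take precedence, so the driver should start with
rem 4g of memory and -Dmy.setting=from-cli.
bin\spark-submit.cmd --class com.example.MyApp ^
  --driver-memory 4g ^
  --driver-java-options "-Dmy.setting=from-cli" ^
  C:\path\to\my-app.jar
```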

A reviewer (Contributor)

Is there any public location where this behavior is specified (i.e., in a developer doc)?

@andrewor14 (Contributor, Author)

Not that I'm aware of :/

@pwendell (Contributor)

Hey @andrewor14 - a few minor comments, but otherwise LGTM. At this point you are the foremost expert on Windows scripting, so I can't add much value reviewing that.

@SparkQA commented Aug 27, 2014

QA tests have finished for PR 2129 at commit 22b1acd.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • rem In this case, leave out the main class (org.apache.spark.deploy.SparkSubmit) and use our own.

@andrewor14 (Contributor, Author)

Jenkins, test this please

@SparkQA commented Aug 27, 2014

QA tests have started for PR 2129 at commit 881a8f0.

  • This patch merges cleanly.

@pwendell (Contributor)

Okay I'm gonna merge this - thanks Andrew.

@asfgit closed this in 7557c4c on Aug 27, 2014
andrewor14 added a commit to andrewor14/spark that referenced this pull request Aug 27, 2014
This is an effort to bring the Windows scripts up to speed after recent splashing changes in apache#1845.

Author: Andrew Or <[email protected]>

Closes apache#2129 from andrewor14/windows-config and squashes the following commits:

881a8f0 [Andrew Or] Add reference to Windows taskkill
92e6047 [Andrew Or] Update a few comments (minor)
22b1acd [Andrew Or] Fix style again (minor)
afcffea [Andrew Or] Fix style (minor)
72004c2 [Andrew Or] Actually respect --driver-java-options
803218b [Andrew Or] Actually respect SPARK_*_CLASSPATH
eeb34a0 [Andrew Or] Update outdated comment (minor)
35caecc [Andrew Or] In Windows, actually kill Java processes on exit
f97daa2 [Andrew Or] Fix Windows spark shell stdin issue
83ebe60 [Andrew Or] Parse special driver configs in Windows (broken)

Conflicts:
	bin/spark-class2.cmd
@SparkQA commented Aug 27, 2014

QA tests have finished for PR 2129 at commit 881a8f0.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • rem In this case, leave out the main class (org.apache.spark.deploy.SparkSubmit) and use our own.

@andrewor14 deleted the windows-config branch on August 27, 2014 at 18:14
xiliu82 pushed a commit to xiliu82/spark that referenced this pull request Sep 4, 2014
