Conversation

@sryza
Contributor

@sryza sryza commented Jun 27, 2014

SPARK-2310. Support arbitrary Spark properties on the command line with spark-submit

The PR allows invocations like
spark-submit --class org.MyClass --spark.shuffle.spill false myjar.jar

@sryza
Contributor Author

sryza commented Jun 27, 2014

Verified this on a pseudo-distributed cluster

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished. All automated tests passed.

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16222/

@pwendell
Contributor

Hey @sryza - I did a straw poll offline discussing this with a few other contributors. The consensus was that it might be better to have a --conf flag with an = sign instead of representing spark conf properties directly as flags.

I.e. --conf spark.app.name=blah

One admittedly bad thing about this approach is that if users have arguments with spaces in them, they will have to quote the entire thing:

./bin/spark-submit --conf "spark.app.name=My app"

Which might not be intuitive, so it would be good to document that (provided you are generally okay with this proposed syntax).
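The quoting behavior comes from ordinary shell word-splitting, which can be demonstrated without Spark at all (printf stands in for spark-submit here, purely for illustration):

```shell
# Unquoted: the shell splits at the space, so the launcher would see
# "spark.app.name=My" and "app" as two separate arguments.
printf '[%s]' --conf spark.app.name=My app; echo
# prints [--conf][spark.app.name=My][app]

# Quoted: the whole property reaches the launcher as a single argument.
printf '[%s]' --conf "spark.app.name=My app"; echo
# prints [--conf][spark.app.name=My app]
```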

@lianhuiwang
Contributor

How about -Dspark.app.name=blah? In the JVM and in Hadoop, the -D flag is used to set conf properties.

@srowen
Member

srowen commented Jul 20, 2014

-D feels more natural indeed; I would expect those args to be passed through to the JVM as-is, since that's also a way to set these properties, right? In fact, wouldn't it be nicer to just let any -D arg through?

@pwendell
Contributor

IMO -D does not have the right semantics here, because the user isn't logically setting Java properties for the submission tool; they are setting Spark configuration properties for their application. The application might run totally remotely, for instance, so why should the user expect that a -D set on the submission site gets packaged up and sent to the remote launcher? Also confusing is that we'd only really triage -D options that are Spark properties, not other ones, so the semantics would differ depending on whether the user happened to set a Java property that started with spark. For these reasons I feel it's better to just have an explicit config-related flag.

@mateiz
Contributor

mateiz commented Jul 21, 2014

I agree, -D is for JVM options, but these are not arbitrary JVM options.

@srowen
Member

srowen commented Jul 21, 2014

Good points. I meant triaging all -D options but yes those then have very 'local' semantics.

@sryza
Contributor Author

sryza commented Jul 21, 2014

Updated patch to use --conf

Contributor

So here, it might be a bit nicer to do something like:

value match {
  case k :: "=" :: v =>
    sparkProperties(k) = v
  case _ =>
    // throw exception saying there is a bad format
}

Would this work?

Contributor Author

That didn't compile for me. As far as I can tell, you can't pattern match on strings in that way. I switched it to something that looks a little prettier; let me know if I'm missing anything.
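For reference, the `::` pattern in the suggestion above matches List cells, not substrings, which is why it can't apply to a String. A minimal sketch of a split-based parse in the same style (the value and sparkProperties names are from the suggestion above; the exception type here is only an assumption):

```scala
// Split on the first '=' only, so values may themselves contain '='.
value.split("=", 2) match {
  case Array(k, v) => sparkProperties(k) = v
  case _ =>
    throw new IllegalArgumentException(s"Spark config without '=': $value")
}
```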

@sryza
Contributor Author

sryza commented Jul 21, 2014

Updated patch addresses Patrick's feedback

@SparkQA

SparkQA commented Jul 21, 2014

QA tests have started for PR 1253. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16934/consoleFull

@SparkQA

SparkQA commented Jul 22, 2014

QA results for PR 1253:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16934/consoleFull

@pwendell
Contributor

LGTM - @mateiz @rxin any final comments here?

Contributor

I think you want a val here?

@mateiz
Contributor

mateiz commented Jul 22, 2014

Can you support -c in addition to --conf?

Also, the spark-submit doc (http://spark.apache.org/docs/latest/submitting-applications.html) should be updated to list this option.
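Presumably the doc example would end up looking something like the following (the class and jar names are placeholders carried over from the PR description):

```
./bin/spark-submit \
  --class org.MyClass \
  --conf spark.shuffle.spill=false \
  --conf "spark.app.name=My app" \
  myjar.jar
```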

@SparkQA

SparkQA commented Jul 22, 2014

QA tests have started for PR 1253. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16971/consoleFull

@SparkQA

SparkQA commented Jul 22, 2014

QA results for PR 1253:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16971/consoleFull

@sryza
Contributor Author

sryza commented Jul 22, 2014

The failure appears to be unrelated (a MIMA compatibility issue in MLlib).

@pwendell
Contributor

Jenkins, retest this please.

@SparkQA

SparkQA commented Jul 22, 2014

QA tests have started for PR 1253. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16988/consoleFull

@SparkQA

SparkQA commented Jul 22, 2014

QA results for PR 1253:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16988/consoleFull

@asfgit asfgit closed this in e34922a Jul 24, 2014
@pwendell
Contributor

@sryza I merged this just now because another patch was going to change this code and I wanted to avoid you having to rebase again. That said, I found an issue with this after merging. Would you be able to fix this?

https://issues.apache.org/jira/browse/SPARK-2664

@sryza
Contributor Author

sryza commented Jul 24, 2014

Thanks, I'm on it.

xiliu82 pushed a commit to xiliu82/spark that referenced this pull request Sep 4, 2014
SPARK-2310. Support arbitrary Spark properties on the command line with spark-submit

The PR allows invocations like
  spark-submit --class org.MyClass --spark.shuffle.spill false myjar.jar

Author: Sandy Ryza <[email protected]>

Closes apache#1253 from sryza/sandy-spark-2310 and squashes the following commits:

1dc9855 [Sandy Ryza] More doc and cleanup
00edfb9 [Sandy Ryza] Review comments
91b244a [Sandy Ryza] Change format to --conf PROP=VALUE
8fabe77 [Sandy Ryza] SPARK-2310. Support arbitrary Spark properties on the command line with spark-submit
sunchao pushed a commit to sunchao/spark that referenced this pull request Dec 8, 2021
### What changes were proposed in this pull request?

This PR adds logic for rewriting row-level commands in 3.2.

### Why are the changes needed?

These changes are needed to support DELETE, UPDATE, MERGE commands in Iceberg.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Locally and tests in Iceberg.
mapr-devops pushed a commit to mapr/spark that referenced this pull request May 8, 2025
[classpathfilter] Do not expand classpath entries which don't contain blacklisted files (apache#1253)

* [classpathfilter] Do not expand classpath entries which don't contain blacklisted files

* [classpathfilter] Remove unneeded checks

* [classpathfilter] Don't check for duplicates to keep everything as simple as possible
