[SPARK-12019][SPARKR] Support character vector for sparkR.init(), check param and fix doc #10034
Conversation
Test build #46860 has finished for PR 10034 at commit
Hmm, I think it's better to just support both forms?
It is. I could add that.
The example is correct. jars is expected to be a character vector. The bug is at https://github.com/apache/spark/blob/master/R/pkg/R/client.R#L47. It is more natural to use a character vector in R. I don't think it is necessary to support both forms, as it complicates the logic.
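To illustrate the point being made about client.R (a hedged sketch; the helper name is hypothetical and not the actual Spark code): spark-submit ultimately wants a single comma-separated string for --jars, so a character vector needs to be collapsed before it reaches the launcher.

```r
# Hypothetical helper, for illustration only (not the actual client.R code):
# collapse a character vector of jar paths into the single comma-separated
# string that the spark-submit --jars option expects.
jarsToSubmitArg <- function(jars) {
  if (length(jars) == 0) {
    return("")
  }
  paste(jars, collapse = ",")
}

jarsToSubmitArg(c("one.jar", "two.jar"))  # "one.jar,two.jar"
```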
I considered changing it that way, but thought that would break everyone who had gotten it to work with a comma-separated string.
I could make it either way - let me know which way we should support.
My personal opinion is that comma-separated jars are for command-line options, while it is natural to use a vector/array in an API. jars is Seq[String] in the Scala SparkContext API, for example.
Well, the only reason to support both use cases is to have some backward compatibility for users who might already be using this. Can we see how much more complex the code becomes? It's good to not break compatibility if we can avoid it.
It shouldn't be hard, assuming we don't plan to do very in-depth checks (e.g. not removing leading/trailing space around
@shivaram, from the documentation's point of view, this is not a backward-compatibility issue: the comma-separated string happens to work because there is redundant code logic that passes the string to the SparkSubmit --jars option, which I think is not necessary, as SparkContext.addJar will be called. So I propose to remove the redundant code logic.
Updated to support both. Also refactored to allow better testing and reuse.
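Supporting both forms could be as small as the following sketch (the helper name is hypothetical; the actual refactoring is in this PR's changes to client.R):

```r
# Hypothetical sketch: accept either a character vector of jar paths or,
# for backward compatibility, a single comma-separated string.
normalizeJars <- function(jars) {
  if (length(jars) == 1 && grepl(",", jars, fixed = TRUE)) {
    # old-style input: split the comma-separated string into a vector
    jars <- strsplit(jars, ",", fixed = TRUE)[[1]]
  }
  jars
}

normalizeJars(c("a.jar", "b.jar"))  # c("a.jar", "b.jar")
normalizeJars("a.jar,b.jar")        # c("a.jar", "b.jar")
```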
Test build #47112 has finished for PR 10034 at commit
@shivaram we might want this PR in Spark 1.6 ... |
R/pkg/R/client.R
Outdated
This seems pretty subtle -- can you add a comment just above saying "Leave a space after --jars"?
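The subtlety here: when the jars list is concatenated into the spark-submit command line, a space must separate it from the options that follow. A minimal illustration (variable names are assumptions, not the actual client.R code):

```r
jars <- "one.jar,two.jar"
sparkSubmitOpts <- "--master local[2]"
# paste() inserts a single space between arguments; without that space the
# last jar path and the next flag would run together on the command line.
combined <- paste("--jars", jars, sparkSubmitOpts)
# "--jars one.jar,two.jar --master local[2]"
```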
Thanks @felixcheung -- this is very well done. I just had a couple of minor comments. You are right, it will be good to have in 1.6 -- we'll merge this into branch-1.6, and if there is an RC2 we might catch 1.6.0. Let's see. cc @marmbrus
Test build #47166 has finished for PR 10034 at commit
LGTM. Merging this |
…ck param and fix doc and add tests.

Spark submit expects comma-separated list

Author: felixcheung <[email protected]>

Closes #10034 from felixcheung/sparkrinitdoc.

(cherry picked from commit 2213441)
Signed-off-by: Shivaram Venkataraman <[email protected]>