Skip to content

Conversation

@felixcheung
Copy link
Member

The fix is to coerce c("a", "b") into a list such that it could be serialized to call JVM with.

@shivaram
Copy link
Contributor

shivaram commented Oct 2, 2015

Jenkins, ok to test

@SparkQA
Copy link

SparkQA commented Oct 2, 2015

Test build #43168 has finished for PR 8961 at commit 41b5673.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@shivaram
Copy link
Contributor

shivaram commented Oct 2, 2015

Thanks @felixcheung - Fix looks good to me.
cc @sun-rui -- Is there a slightly better fix with the new SerDe we have.

Also just a note to myself that we should backport this branch-1.5

@sun-rui
Copy link
Contributor

sun-rui commented Oct 2, 2015

@felixcheung, could you verify that your PR works when there are columns more than 2? for example, select(df, c("c1", "c2", "c3")). I suspect that it does not work. Note that the signature of the Scala select:
select(col:String, cols: String*)

If we want to support the case where the parameter col is a vector, I think:

  1. extract the col[[1]]
  2. merge the remaining col[2:length(col)] and the possible ... varargs.

@sun-rui
Copy link
Contributor

sun-rui commented Oct 2, 2015

My fault, ignore my comment above. I did not realize that there is select(df, list) already in SparkR.

This is irrelevant to the SerDe enhancement. This is basically OK. The remaining question is how to handle the ... varargs in this case? for example, select(df, c("c1","c2"), "c3", "c4")? Do we allow such invokation of select?

@felixcheung
Copy link
Member Author

We could certainly handle select(df, c("c1","c2"), "c3", "c4"), by changing the code to

            if (length(col) > 1) {
                select(x, append(as.list(col), list(...)))

This seems like an edge case - should we do that?

@shivaram
Copy link
Contributor

shivaram commented Oct 2, 2015

I'm fine with not supporting the edge cases as long as we can throw a reasonable error message.

@felixcheung
Copy link
Member Author

ok, either case would be similar amount of code then. Please advise.

@shivaram
Copy link
Contributor

shivaram commented Oct 2, 2015

I'd say lets keep the API simple and only support a vector or varargs but not both.

@felixcheung
Copy link
Member Author

Updated:

> head(select(df,c("name", "age"), "abc"))
Error in head(select(df, c("name", "age"), "abc")) :
  error in evaluating the argument 'x' in selecting a method for function 'head': Error in select(df, c("name", "age"), "abc") :
  To select multiple columns, use a character vector or list for col
> head(select(df,c("name", "age"), "abc", "bc"))
Error in head(select(df, c("name", "age"), "abc", "bc")) :
  error in evaluating the argument 'x' in selecting a method for function 'head': Error in select(df, c("name", "age"), "abc", "bc") :
  To select multiple columns, use a character vector or list for col
> head(select(df,c("name", "age"), c("abc", "bc")))
Error in head(select(df, c("name", "age"), c("abc", "bc"))) :
  error in evaluating the argument 'x' in selecting a method for function 'head': Error in select(df, c("name", "age"), c("abc", "bc")) :
  To select multiple columns, use a character vector or list for col

@SparkQA
Copy link

SparkQA commented Oct 2, 2015

Test build #43200 has finished for PR 8961 at commit dfac25c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Oct 3, 2015

Test build #43204 has finished for PR 8961 at commit d4dd39c.

  • This patch fails R style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Oct 3, 2015

Test build #43205 has finished for PR 8961 at commit 4f8f170.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@sun-rui
Copy link
Contributor

sun-rui commented Oct 3, 2015

LGTM

@shivaram
Copy link
Contributor

shivaram commented Oct 3, 2015

Thanks @felixcheung @sun-rui -- LGTM. Merging this

@asfgit asfgit closed this in 721e8b5 Oct 4, 2015
asfgit pushed a commit that referenced this pull request Oct 4, 2015
The fix is to coerce `c("a", "b")` into a list such that it could be serialized to call JVM with.

Author: felixcheung <[email protected]>

Closes #8961 from felixcheung/rselect.

(cherry picked from commit 721e8b5)
Signed-off-by: Shivaram Venkataraman <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants