@thomastechs
Contributor

Selecting from a DataFrame whose column names contain a . throws an exception.
This patch changes the select method so that columns whose names contain a . can be selected.

Edit the select method to allow selecting columns with . in their names.
Add test cases for the SPARK-13197 fix: selecting from a DataFrame whose column names contain a .
@AmplabJenkins

Can one of the admins verify this patch?

@hvanhovell
Contributor

I think the same problem is solved in: #10943

@thomastechs
Contributor Author

@hvanhovell and @jayadevanmurali We checked the source code change in the referenced PR #10943. That PR resolves a similar issue in the drop scenario, but the current fix is for explicit select statements, as stated in our bug. We also double-checked whether the fix in #10943 resolves the select issue, and it does not. Please let us know your comments.

@clockfly
Contributor

clockfly commented Jun 13, 2016

@thomastechs

We should use df.select("`a.c`") to select a column with name "a.c".
The reason is that an unquoted dot is treated as a path separator: we can use df.select("path.to.column") to select a nested column, for example:

scala> case class A(inner: Int)
scala> val df = Seq((A(1), 2)).toDF("a", "b")
scala> df.select("a.inner").show()
+-----+
|inner|
+-----+
|    1|
+-----+

So I think this is NOT a bug; the current behavior is expected.
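
To make the contrast concrete, here is a minimal sketch meant to be pasted into spark-shell. It assumes Spark 2.x or later (SparkSession as the entry point); the second column name "a.c" and the value names nested and dotted are made up for illustration and do not come from the patch. For a column whose name literally contains a dot, backtick quoting tells select to treat the whole string as one name rather than as a path.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: Spark 2.x+; in spark-shell, getOrCreate() reuses the existing session.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("dotted-columns")
  .getOrCreate()
import spark.implicits._

case class A(inner: Int)

// "a" is a struct with a field "inner";
// "a.c" is a flat column whose name happens to contain a dot (hypothetical name).
val df = Seq((A(1), 2)).toDF("a", "a.c")

// Unquoted: the dot is a path separator, so this reads field "inner" of struct "a".
val nested = df.select("a.inner")

// Backtick-quoted: the whole string is taken as a single column name.
val dotted = df.select("`a.c`")

nested.show()
dotted.show()
```

Without the backticks, df.select("a.c") on this DataFrame would be read as "field c of column a", which is why the unquoted form fails for flat columns with dotted names.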

@HyukjinKwon
Member

+1 for not a problem.
