Skip to content

Conversation

@chuxi
Copy link

@chuxi chuxi commented Aug 21, 2014

For example, "arrayOfStruct" is an array of structs and every element of this array has a field called "field1". "arrayOfStruct[0].field1" means to access the value of "field1" for the first element of "arrayOfStruct", but the SQL parser (in sql-core) treats "field1" as an alias. Also, "arrayOfStruct.field1" means to access all values of "field1" in this array of structs and the returns those values as an array. But, the SQL parser cannot resolve it.

I have passed the test case in JsonSuite ("Complex field and type inferring (Ignored)") which is ignored, by a little modified.
modified test part :
checkAnswer(
sql("select arrayOfStruct.field1, arrayOfStruct.field2 from jsonTable"),
(Seq(true, false, null), Seq("str1", null, null)) :: Nil
)
However, another question is repeated nested structure is a problem, like arrayOfStruct.field1.arrayOfStruct.field1 or arrayOfStruct[0].field1.arrayOfStruct[0].field1
I plan to ignore this problem and try to add "select arrayOfStruct.field1, arrayOfStruct.field2 from jsonTable where arrayOfStruct.field1==true "
Besides, my friend anyweil (Wei Li) solved the problem of arrayOfStruct.field1 and its Filter part( means where parsing).
I am fresh here but will continue working on spark :)

I checked the problem " where arrayOfStruct.field1==true "
this problem will lead to modify every kind of comparisonExpression. And I think it makes no sense to add this function. So I discard it.
Over.

@AmplabJenkins
Copy link

Can one of the admins verify this patch?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are you changing the test case since it still cannot work?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wang pangzi, someone add field3 in testData arrayOfStruct. So it requires another null.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I add sql("select arrayOfStruct.field1 from jsonTable where arrayOfStruct.field1 = true") this test case in ignored part. It does not work because I came up with it but did not solve it. Or it makes no sense to solve it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

End files with a newline. Run sbt/sbt scalastyle to check style locally.

@marmbrus
Copy link
Contributor

Thanks for working on this. I made a few small suggestions.

@marmbrus
Copy link
Contributor

Jenkins, test this please.

@SparkQA
Copy link

SparkQA commented Aug 26, 2014

QA tests have started for PR 2082 at commit ebf033b.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Aug 26, 2014

QA tests have finished for PR 2082 at commit ebf033b.

  • This patch fails unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class GetArrayField(child: Expression, fieldName: String) extends UnaryExpression
    • case class ExplainCommand(plan: LogicalPlan, extended: Boolean = false) extends Command

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indent too much

@chuxi
Copy link
Author

chuxi commented Aug 26, 2014

Thank you, marmbrus, you are so nice. I am fresh here and never post any PR to a open project. I will take your suggestions and modify my code as the scala style.

@chuxi
Copy link
Author

chuxi commented Aug 28, 2014

Jenkins, test this please.

@marmbrus
Copy link
Contributor

ok to test

@SparkQA
Copy link

SparkQA commented Aug 29, 2014

QA tests have started for PR 2082 at commit e5a3db1.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Aug 29, 2014

QA tests have finished for PR 2082 at commit e5a3db1.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class GetArrayField(child: Expression, fieldName: String) extends UnaryExpression

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can just use

value.map{ t =>
  if (t == null) null else t(ordinal)
}

as the last line of this eval function.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

= =

@chuxi
Copy link
Author

chuxi commented Sep 2, 2014

@marmbrus , do I need to check something else? Or merge the code?

Besides, I see another PR : #2230

:) it is my friend, I suggested him have a look of the nested parquet sqlpaser in the sql parquet test suit which parses dot as a delimiter, not identChar.

it has a problem that we don't know whether the dot should be in identchar or delimiter. It leads to different sql parsing result. If only struct in a json or parquet, apparently put the dot. in identchar is better. Does it have some other reasons about the dot parsing?

@marmbrus
Copy link
Contributor

marmbrus commented Sep 3, 2014

Hi @chuxi, it would be great if you could discuss with @cloud-fan and perhaps adapt your GetArrayField stuff to work with the changes in #2230. Also I think there are a few unaddressed comments.

@SparkQA
Copy link

SparkQA commented Sep 5, 2014

Can one of the admins verify this patch?

@chuxi chuxi closed this Sep 16, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants