[SPARK-14578] [SQL] Fix codegen for CreateExternalRow with nested wide schema #12338

davies · 2016-04-12T21:55:09Z

What changes were proposed in this pull request?

With a wide schema, the expressions for the fields are split into multiple functions, but the loop variable (loopVar) cannot be accessed from the split functions. This PR changes these variables to class members.
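A minimal Java sketch of the idea, for illustration only. It is not Spark's generated code: the class `GeneratedProjectionSketch`, the methods `evalPart1`/`evalPart2`, and the arithmetic are hypothetical stand-ins. It assumes only what the description says: field expressions end up in separate methods, so a value that was a local in the loop has to become a class member to stay visible to them.

```java
public class GeneratedProjectionSketch {
    // Hypothetical loop variable, promoted from a local variable to a
    // class member so that the split-out methods below can read it.
    private int loopVar;

    public int[] apply(int[] input) {
        int[] out = new int[input.length];
        for (int i = 0; i < input.length; i++) {
            // If this were declared as `int loopVar = input[i];`, the
            // methods evalPart1/evalPart2 could not refer to it.
            loopVar = input[i];
            out[i] = evalPart1() + evalPart2();
        }
        return out;
    }

    // Stand-ins for field expressions that were split into separate
    // methods because the schema is wide.
    private int evalPart1() {
        return loopVar * 2;
    }

    private int evalPart2() {
        return loopVar + 1;
    }

    public static void main(String[] args) {
        int[] result = new GeneratedProjectionSketch().apply(new int[] {1, 2, 3});
        System.out.println(java.util.Arrays.toString(result));
    }
}
```

The trade-off is that a field is shared state on the generated object, whereas a local variable is confined to one method.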

How was this patch tested?

Added regression test.

davies · 2016-04-12T21:55:18Z

cc @marmbrus @cloud-fan

SparkQA · 2016-04-12T23:29:03Z

Test build #55651 has finished for PR 12338 at commit 4a472de.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

cloud-fan · 2016-04-13T00:07:29Z

LGTM

cloud-fan · 2016-04-13T00:10:43Z

One more question: there are other places where we use local variables as the input for expressions, e.g. CodeGenContext.currentVars in whole-stage codegen, and LambdaVariables in MapElements and typed filter. Should we make them all class members, or is there a more general way to do it?

davies · 2016-04-13T00:26:02Z

This is used for non-whole-stage codegen. Also, MapObjects does not support interpreted mode; otherwise it would be better to fall back to that. Class members may prevent the JIT compiler from doing more optimization, so we should not aggressively turn all of them into class members.

Merging this into master.

… type

## What changes were proposed in this pull request?

After #12067, we now use expressions to do the aggregation in `TypedAggregateExpression`. To implement buffer merge, we produce a new buffer deserializer expression by replacing `AttributeReference` with the right-side buffer attribute, as other `DeclarativeAggregate`s do, and finally combine the left and right buffer deserializers with `Invoke`.

However, after #12338, we add the loop variable to the class members when generating code for `MapObjects`. If the `Aggregator` buffer type is `Seq`, which is implemented by the `MapObjects` expression, we add the same loop variable to the class members twice (once for the left and once for the right buffer deserializer), which causes the `ClassFormatError`.

This PR fixes the issue by calling `distinct` before declaring the class members.

## How was this patch tested?

New regression test in `DatasetAggregatorSuite`.

Author: Wenchen Fan <[email protected]>

Closes #12468 from cloud-fan/bug.
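A rough Java sketch of the deduplication step described above, under stated assumptions. The class `DistinctMembersSketch`, the method `declareMembers`, and the declaration strings are hypothetical; the actual change is in Spark's Scala codegen and calls `distinct` there. The sketch shows only the principle: when two deserializers register the same member, emit the declaration once so the generated class does not declare the same field twice.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class DistinctMembersSketch {
    // Joins member declarations into class-body source text, dropping
    // exact duplicates while keeping first-seen order.
    static String declareMembers(List<String> declarations) {
        return declarations.stream()
            .distinct()
            .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        // Hypothetical declarations: the left and right buffer
        // deserializers both register the same loop variable.
        List<String> declarations = Arrays.asList(
            "private int loopVar1;",
            "private int loopVar1;",
            "private boolean loopIsNull1;");
        System.out.println(declareMembers(declarations));
    }
}
```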

…with nested wide schema

With a wide schema, the expressions for the fields are split into multiple functions, but the loop variable (loopVar) cannot be accessed from the split functions. This PR changes these variables to class members.

Added regression test.

Author: Davies Liu <[email protected]>

Closes apache#12338 from davies/nested_row.

Conflicts:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala

Davies Liu added 2 commits April 12, 2016 14:08

fix codegen of nested CreateExternalRow

1c42e2c

regression tests

4a472de

asfgit closed this in 372baf0 Apr 13, 2016

cloud-fan mentioned this pull request Apr 18, 2016

[SPARK-14675][SQL] ClassFormatError when use Seq as Aggregator buffer type #12468

Closed
