Skip to content

Conversation

@gatorsmile
Copy link
Member

@gatorsmile gatorsmile commented Oct 13, 2018

What changes were proposed in this pull request?

Add outputOrdering to otherCopyArgs in InMemoryRelation so that this field will be copied when we doing the tree transformation.

    val data = Seq(100).toDF("count").cache()
    data.queryExecution.optimizedPlan.toJSON

The above code can generate the following error:

assertion failed: InMemoryRelation fields: output, cacheBuilder, statsOfPlanToCache, outputOrdering, values: List(count#178), CachedRDDBuilder(true,10000,StorageLevel(disk, memory, deserialized, 1 replicas),*(1) Project [value#176 AS count#178]
+- LocalTableScan [value#176]
,None), Statistics(sizeInBytes=12.0 B, hints=none)
java.lang.AssertionError: assertion failed: InMemoryRelation fields: output, cacheBuilder, statsOfPlanToCache, outputOrdering, values: List(count#178), CachedRDDBuilder(true,10000,StorageLevel(disk, memory, deserialized, 1 replicas),*(1) Project [value#176 AS count#178]
+- LocalTableScan [value#176]
,None), Statistics(sizeInBytes=12.0 B, hints=none)
	at scala.Predef$.assert(Predef.scala:170)
	at org.apache.spark.sql.catalyst.trees.TreeNode.jsonFields(TreeNode.scala:611)
	at org.apache.spark.sql.catalyst.trees.TreeNode.org$apache$spark$sql$catalyst$trees$TreeNode$$collectJsonValue$1(TreeNode.scala:599)
	at org.apache.spark.sql.catalyst.trees.TreeNode.jsonValue(TreeNode.scala:604)
	at org.apache.spark.sql.catalyst.trees.TreeNode.toJSON(TreeNode.scala:590)

How was this patch tested?

Added a test

@mgaido91
Copy link
Contributor

thanks for this fix @gatorsmile. Can we add a UT for this? Moreover, shall we add it also to LogicalRDD ?

@gatorsmile
Copy link
Member Author

@mgaido91 Anything is missing in LogicalRDD?

@SparkQA
Copy link

SparkQA commented Oct 13, 2018

Test build #97346 has finished for PR 22715 at commit 1d30cc5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile gatorsmile changed the title [SPARK-23375][FOLLOW-UP] Add outputOrdering to otherCopyArgs [SPARK-25727] Add outputOrdering to otherCopyArgs Oct 13, 2018
@gatorsmile gatorsmile changed the title [SPARK-25727] Add outputOrdering to otherCopyArgs [SPARK-25727][SQL] Add outputOrdering to otherCopyArgs Oct 13, 2018
@gatorsmile gatorsmile changed the title [SPARK-25727][SQL] Add outputOrdering to otherCopyArgs [SPARK-25727][SQL] Add outputOrdering to otherCopyArgs in InMemoryRelation Oct 14, 2018
@SparkQA
Copy link

SparkQA commented Oct 14, 2018

Test build #97350 has finished for PR 22715 at commit cff78e9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM

@dongjoon-hyun
Copy link
Member

Merged to master/branch-2.4.

asfgit pushed a commit that referenced this pull request Oct 14, 2018
…ation

## What changes were proposed in this pull request?
Add `outputOrdering ` to `otherCopyArgs` in InMemoryRelation so that this field will be copied when we doing the tree transformation.

```
    val data = Seq(100).toDF("count").cache()
    data.queryExecution.optimizedPlan.toJSON
```

The above code can generate the following error:

```
assertion failed: InMemoryRelation fields: output, cacheBuilder, statsOfPlanToCache, outputOrdering, values: List(count#178), CachedRDDBuilder(true,10000,StorageLevel(disk, memory, deserialized, 1 replicas),*(1) Project [value#176 AS count#178]
+- LocalTableScan [value#176]
,None), Statistics(sizeInBytes=12.0 B, hints=none)
java.lang.AssertionError: assertion failed: InMemoryRelation fields: output, cacheBuilder, statsOfPlanToCache, outputOrdering, values: List(count#178), CachedRDDBuilder(true,10000,StorageLevel(disk, memory, deserialized, 1 replicas),*(1) Project [value#176 AS count#178]
+- LocalTableScan [value#176]
,None), Statistics(sizeInBytes=12.0 B, hints=none)
	at scala.Predef$.assert(Predef.scala:170)
	at org.apache.spark.sql.catalyst.trees.TreeNode.jsonFields(TreeNode.scala:611)
	at org.apache.spark.sql.catalyst.trees.TreeNode.org$apache$spark$sql$catalyst$trees$TreeNode$$collectJsonValue$1(TreeNode.scala:599)
	at org.apache.spark.sql.catalyst.trees.TreeNode.jsonValue(TreeNode.scala:604)
	at org.apache.spark.sql.catalyst.trees.TreeNode.toJSON(TreeNode.scala:590)
```

## How was this patch tested?

Added a test

Closes #22715 from gatorsmile/copyArgs1.

Authored-by: gatorsmile <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 6c3f2c6)
Signed-off-by: Dongjoon Hyun <[email protected]>
@dongjoon-hyun
Copy link
Member

Thank you, @gatorsmile and @mgaido91 .

@asfgit asfgit closed this in 6c3f2c6 Oct 14, 2018
}

override protected def otherCopyArgs: Seq[AnyRef] = Seq(statsOfPlanToCache)
override protected def otherCopyArgs: Seq[AnyRef] = Seq(statsOfPlanToCache, outputOrdering)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thing I don't understand is why we put the outputOrdering in the curry constructor at the first place...

@mgaido91
Copy link
Contributor

sorry @gatorsmile that is fine, my bad. A late LGTM to this, despite probably @cloud-fan 's comment make, we probably should have just put it as a part of the main constructor...

@cloud-fan
Copy link
Contributor

Hi @mgaido91 , since you are the major author of this part, do you have time to open a PR and move outputOrdering to the main constructor? Thanks in advance!

@mgaido91
Copy link
Contributor

@cloud-fan , sure, I'll submit a follow-up PR for this. Thanks.

asfgit pushed a commit that referenced this pull request Oct 15, 2018
…nMemoryRelation

## What changes were proposed in this pull request?

The PR addresses [the comment](#22715 (comment)) in the previous one. `outputOrdering` becomes a field of `InMemoryRelation`.

## How was this patch tested?

existing UTs

Closes #22726 from mgaido91/SPARK-25727_followup.

Authored-by: Marco Gaido <[email protected]>
Signed-off-by: gatorsmile <[email protected]>
jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019
…ation

## What changes were proposed in this pull request?
Add `outputOrdering ` to `otherCopyArgs` in InMemoryRelation so that this field will be copied when we doing the tree transformation.

```
    val data = Seq(100).toDF("count").cache()
    data.queryExecution.optimizedPlan.toJSON
```

The above code can generate the following error:

```
assertion failed: InMemoryRelation fields: output, cacheBuilder, statsOfPlanToCache, outputOrdering, values: List(count#178), CachedRDDBuilder(true,10000,StorageLevel(disk, memory, deserialized, 1 replicas),*(1) Project [value#176 AS count#178]
+- LocalTableScan [value#176]
,None), Statistics(sizeInBytes=12.0 B, hints=none)
java.lang.AssertionError: assertion failed: InMemoryRelation fields: output, cacheBuilder, statsOfPlanToCache, outputOrdering, values: List(count#178), CachedRDDBuilder(true,10000,StorageLevel(disk, memory, deserialized, 1 replicas),*(1) Project [value#176 AS count#178]
+- LocalTableScan [value#176]
,None), Statistics(sizeInBytes=12.0 B, hints=none)
	at scala.Predef$.assert(Predef.scala:170)
	at org.apache.spark.sql.catalyst.trees.TreeNode.jsonFields(TreeNode.scala:611)
	at org.apache.spark.sql.catalyst.trees.TreeNode.org$apache$spark$sql$catalyst$trees$TreeNode$$collectJsonValue$1(TreeNode.scala:599)
	at org.apache.spark.sql.catalyst.trees.TreeNode.jsonValue(TreeNode.scala:604)
	at org.apache.spark.sql.catalyst.trees.TreeNode.toJSON(TreeNode.scala:590)
```

## How was this patch tested?

Added a test

Closes apache#22715 from gatorsmile/copyArgs1.

Authored-by: gatorsmile <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019
…nMemoryRelation

## What changes were proposed in this pull request?

The PR addresses [the comment](apache#22715 (comment)) in the previous one. `outputOrdering` becomes a field of `InMemoryRelation`.

## How was this patch tested?

existing UTs

Closes apache#22726 from mgaido91/SPARK-25727_followup.

Authored-by: Marco Gaido <[email protected]>
Signed-off-by: gatorsmile <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants