@yhuai yhuai commented Dec 1, 2015

In TreeNode's argString, if an argument is a TreeNode that is not a child of the current TreeNode, we only return its simpleString.
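To illustrate the idea, here is a minimal toy sketch. It is not Spark's actual TreeNode code: `Node`, `Leaf`, `Union`, and `Cached` are hypothetical names, and `Cached` only imitates a node that holds a reference to another plan without listing it as a child. The point is that `argString` asks a non-child node for its one-line `simpleString` and never renders that node's whole subtree.

```scala
// Toy model only: Node, Leaf, Union and Cached are hypothetical names,
// not Spark's classes.
trait Node {
  def name: String
  def children: Seq[Node]
  // Constructor arguments; these may include nodes that are not children.
  def args: Seq[Any]

  // One-line description of this node alone.
  def simpleString: String = s"$name $argString".trim

  def argString: String = args.flatMap {
    // Children are rendered by treeString, so they are skipped here.
    case n: Node if children.exists(_ eq n) => Nil
    // A node that is not a child contributes one line, never its full tree.
    case n: Node => n.simpleString :: Nil
    case other => other.toString :: Nil
  }.mkString(", ")

  def treeLines(depth: Int): Seq[String] =
    ("  " * depth + simpleString) +: children.flatMap(_.treeLines(depth + 1))

  def treeString: String = treeLines(0).mkString("\n")
}

class Leaf(id: Int) extends Node {
  val name = "Leaf"
  val children: Seq[Node] = Nil
  val args: Seq[Any] = Seq(id)
}

class Union(left: Node, right: Node) extends Node {
  val name = "Union"
  val children: Seq[Node] = Seq(left, right)
  val args: Seq[Any] = Seq(left, right)
}

// Holds a reference to another plan, but that plan is not one of its children.
class Cached(plan: Node) extends Node {
  val name = "Cached"
  val children: Seq[Node] = Nil
  val args: Seq[Any] = Seq(plan)
}

object ArgStringSketch {
  def main(argv: Array[String]): Unit = {
    // Same shape as the reported case: repeatedly union and "cache".
    val plan = (1 to 20).foldLeft(Option.empty[Node]) { (curr, idx) =>
      val leaf = new Leaf(idx)
      val union: Node = curr.map(c => new Union(c, leaf)).getOrElse(leaf)
      Some(new Cached(union))
    }.get
    println(plan.simpleString) // prints "Cached Union"
  }
}
```

In this sketch the description of the outermost `Cached` node stays one line long no matter how deeply the referenced plan is nested, which is the property the change relies on.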

I tested the following case provided by Cristian.

```
val c = (1 to 20).foldLeft[Option[DataFrame]](None) { (curr, idx) =>
  println(s"PROCESSING >>>>>>>>>>> $idx")
  val df = sqlContext.sparkContext.parallelize((0 to 10).zipWithIndex).toDF("A", "B")
  val union = curr.map(_.unionAll(df)).getOrElse(df)
  union.cache()
  Some(union)
}

c.get.explain(true)
```

Without the change, `c.get.explain(true)` took 100s. With the change, `c.get.explain(true)` took 26ms.

https://issues.apache.org/jira/browse/SPARK-11596

@yhuai yhuai changed the title from "[SPARK-11596] [SQL] In TreeNode's argString, if a TreeNode is not a child of the current TreeNode, we will only return the simpleString." to "[SPARK-11596] [SQL] In TreeNode's argString, if a TreeNode is not a child of the current TreeNode, we should only return the simpleString." Dec 1, 2015

SparkQA commented Dec 2, 2015

Test build #47007 has finished for PR 10079 at commit b25b3de.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

asfgit pushed a commit that referenced this pull request Dec 2, 2015
…ild of the current TreeNode, we should only return the simpleString.

In TreeNode's argString, if a TreeNode is not a child of the current TreeNode, we will only return the simpleString.

I tested the [following case provided by Cristian](https://issues.apache.org/jira/browse/SPARK-11596?focusedCommentId=15019241&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15019241).
```
val c = (1 to 20).foldLeft[Option[DataFrame]] (None) { (curr, idx) =>
    println(s"PROCESSING >>>>>>>>>>> $idx")
    val df = sqlContext.sparkContext.parallelize((0 to 10).zipWithIndex).toDF("A", "B")
    val union = curr.map(_.unionAll(df)).getOrElse(df)
    union.cache()
    Some(union)
  }

c.get.explain(true)
```

Without the change, `c.get.explain(true)` took 100s. With the change, `c.get.explain(true)` took 26ms.

https://issues.apache.org/jira/browse/SPARK-11596

Author: Yin Huai <[email protected]>

Closes #10079 from yhuai/SPARK-11596.

(cherry picked from commit e96a70d)
Signed-off-by: Michael Armbrust <[email protected]>

marmbrus commented Dec 2, 2015

Thanks, merged to master and 1.6

@asfgit asfgit closed this in e96a70d Dec 2, 2015