[SPARK-16456][SQL] Reuse the uncorrelated scalar subqueries with the same logical plan in a query #14111

lianhuiwang · 2016-07-09T04:21:59Z

What changes were proposed in this pull request?

In TPCDS-Q14, the same physical plan of an uncorrelated scalar subquery from a CTE can be executed multiple times. We should reuse the result to avoid the duplicated computation.
Before

scala> (1 to 3).map(i => (i, i)).toDF("key", "value").createOrReplaceTempView("t1")
scala> sql("WITH max_test AS( SELECT max(key) as max_key FROM t ) SELECT key FROM t1 WHERE key = (SELECT max_key FROM max_test) and value = (SELECT max_key FROM max_test)").explain
== Physical Plan ==
*Project [_1#200 AS key#203]
+- *Filter ((_1#200 = subquery#209) && (_2#201 = subquery#210))
   :  :- Subquery subquery#209
   :  :  +- *HashAggregate(keys=[], functions=[max(key#203)], output=[max_key#211])
   :  :     +- Exchange SinglePartition
   :  :        +- *HashAggregate(keys=[], functions=[partial_max(key#203)], output=[max#217])
   :  :           +- LocalTableScan [key#203]
   :  +- Subquery subquery#210
   :     +- *HashAggregate(keys=[], functions=[max(key#203)], output=[max_key#211])
   :        +- Exchange SinglePartition
   :           +- *HashAggregate(keys=[], functions=[partial_max(key#203)], output=[max#219])
   :              +- LocalTableScan [key#203]
   +- LocalTableScan [_1#200, _2#201]

After

scala> (1 to 3).map(i => (i, i)).toDF("key", "value").createOrReplaceTempView("t1")
scala> sql("WITH max_test AS( SELECT max(key) as max_key FROM t ) SELECT key FROM t1 WHERE key = (SELECT max_key FROM max_test) and value = (SELECT max_key FROM max_test)").explain
== Physical Plan ==
*Project [_1#200 AS key#203]
+- *Filter ((_1#200 = subquery#209) && (_2#201 = ReusedSubquery#210(subquery#209)))
   :  +- Subquery subquery#209
   :     +- *HashAggregate(keys=[], functions=[max(key#203)], output=[max_key#211])
   :        +- Exchange SinglePartition
   :           +- *HashAggregate(keys=[], functions=[partial_max(key#203)], output=[max#217])
   :              +- LocalTableScan [key#203]
   +- LocalTableScan [_1#200, _2#201]
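
A minimal sketch of the reuse idea shown in the two plans above, not the code in this PR. The types `Subquery` and `Reused` and the function `reuse` are hypothetical stand-ins, and plans are compared as plain strings here rather than as Spark plans. The first subquery seen for a plan is kept and every later one with the same plan becomes a reference to it.

```scala
import scala.collection.mutable

// Toy model only: these types are hypothetical and are not Spark's classes.
object SubqueryReuseSketch {
  sealed trait SubqueryRef
  final case class Subquery(id: Int, plan: String) extends SubqueryRef
  final case class Reused(id: Int, target: Int) extends SubqueryRef

  // Keep the first subquery seen for each plan; later ones point back to it.
  def reuse(subqueries: Seq[Subquery]): Seq[SubqueryRef] = {
    val firstIdByPlan = mutable.HashMap.empty[String, Int]
    subqueries.map { sq =>
      firstIdByPlan.get(sq.plan) match {
        case Some(firstId) => Reused(sq.id, firstId)
        case None =>
          firstIdByPlan(sq.plan) = sq.id
          sq
      }
    }
  }
}
```

With the ids from the plans above, the subquery with id 209 is kept and the one with id 210 becomes a reference to 209.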

How was this patch tested?

Pass the Jenkins tests (including a new testcase).

SparkQA · 2016-07-09T05:48:49Z

Test build #62007 has finished for PR 14111 at commit 5290e42.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
- case class ReusedScalarSubquery(

SparkQA · 2016-07-09T07:01:34Z

Test build #62010 has finished for PR 14111 at commit b1914de.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2016-07-09T10:53:01Z

Test build #62017 has finished for PR 14111 at commit 77ea002.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2016-07-09T11:05:19Z

Test build #62018 has finished for PR 14111 at commit 0311542.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2016-07-09T11:59:59Z

Test build #62021 has finished for PR 14111 at commit c4bb273.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2016-07-10T14:48:18Z

Test build #62056 has finished for PR 14111 at commit 1d7bd3c.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

lianhuiwang · 2016-07-13T02:39:15Z

cc @rxin @hvanhovell @cloud-fan

cloud-fan · 2016-07-13T03:46:18Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala

  def subqueries: Seq[PlanType] = {
-    expressions.flatMap(_.collect {case e: SubqueryExpression => e.plan.asInstanceOf[PlanType]})
+    expressions.flatMap(_.collect {
+      case e: SubqueryExpression => e


cloud-fan: Should we use ExpressionCanonicalizer to canonicalize the expressions before calling distinct?

lianhuiwang (reply): Because we use ReusedScalarSubquery, not Alias, to mark the reused SubqueryExpression, I don't think we need ExpressionCanonicalizer.

cloud-fan · 2016-07-13T04:07:37Z

I have a simpler idea to implement this feature:

1. In SparkPlan, build a map from distinct subqueries to their future results, and still keep the original subquery list.
2. In SparkPlan.prepareSubqueries, iterate over the map and execute these distinct subqueries.
3. In SparkPlan.waitForSubqueries, iterate over the original subquery list, get each result by looking it up in the map, and then call updateResult.

I think this approach needs much less code change. What do you think?
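
A minimal sketch of the three steps above, assuming a toy model rather than Spark's actual SparkPlan. `ScalarSub`, `Operator`, and the `run` function are hypothetical, and plans are identified by a string. Only the method names `prepareSubqueries`, `waitForSubqueries`, and `updateResult` come from the discussion.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object DistinctSubquerySketch {
  private implicit val ec: ExecutionContext = ExecutionContext.global

  // Hypothetical stand-in for a scalar subquery expression.
  final class ScalarSub(val plan: String) {
    @volatile var result: Option[Long] = None
    def updateResult(v: Long): Unit = { result = Some(v) }
  }

  // Hypothetical stand-in for one physical operator holding subqueries.
  final class Operator(subqueries: Seq[ScalarSub], run: String => Long) {
    // Step 1: map from distinct subquery plans to their future results.
    private var futures: Map[String, Future[Long]] = Map.empty

    // Step 2: execute each distinct plan once.
    def prepareSubqueries(): Unit = {
      futures = subqueries.map(_.plan).distinct
        .map(plan => plan -> Future(run(plan))).toMap
    }

    // Step 3: walk the original list and fill in every expression.
    def waitForSubqueries(): Unit = {
      subqueries.foreach { sq =>
        sq.updateResult(Await.result(futures(sq.plan), 10.seconds))
      }
    }
  }
}
```

Note that the map in this sketch lives inside one operator, so it only removes duplicates within that operator.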

lianhuiwang · 2016-07-13T07:55:01Z

@cloud-fan At first I implemented it the way you describe. But a query with a broadcast join then fails with the error 'ScalarSubquery has not finished'. An example (from SPARK-14791):
val df = (1 to 3).map(i => (i, i)).toDF("key", "value")
df.createOrReplaceTempView("t1")
df.createOrReplaceTempView("t2")
df.createOrReplaceTempView("t3")
val q = sql("select * from t1 join (select key, value from t2 " +
" where key > (select avg (key) from t3))t on (t1.key = t.key)")
Before:

*BroadcastHashJoin [key#5], [key#26], Inner, BuildRight
:- *Project [_1#2 AS key#5, _2#3 AS value#6]
:  +- *Filter (cast(_1#2 as double) > subquery#13)
:     :  +- Subquery subquery#13
:     :     +- *HashAggregate(keys=[], functions=[avg(cast(key#5 as bigint))], output=[avg(key)#25])
:     :        +- Exchange SinglePartition
:     :           +- *HashAggregate(keys=[], functions=[partial_avg(cast(key#5 as bigint))], output=[sum#30, count#31L])
:     :              +- LocalTableScan [key#5]
:     +- LocalTableScan [_1#2, _2#3]
+- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)))
   +- *Project [_1#2 AS key#26, _2#3 AS value#27]
      +- *Filter (cast(_1#2 as double) > subquery#13)
         :  +- Subquery subquery#13
         :     +- *HashAggregate(keys=[], functions=[avg(cast(key#5 as bigint))], output=[avg(key)#25])
         :        +- Exchange SinglePartition
         :           +- *HashAggregate(keys=[], functions=[partial_avg(cast(key#5 as bigint))], output=[sum#30, count#31L])
         :              +- LocalTableScan [key#5]
         +- LocalTableScan [_1#2, _2#3]

The steps are as follows:

1. BroadcastHashJoin.prepare()
2. t1.Filter.prepareSubqueries, which prepares the subquery.
3. BroadcastExchange.prepare()
4. t2.Filter.prepareSubqueries, which prepares the subquery.
5. BroadcastExchange.doPrepare(), which is called from prepare() and calls child.executeCollect().
6. t2.Filter.execute()
7. t2.Filter.waitForSubqueries(), which waits for the subquery.
8. BroadcastHashJoin.doExecute()
9. BroadcastExchange.executeBroadcast()
10. t1.Filter.execute()
11. t1.Filter.waitForSubqueries(), which waits for the subquery.

Before this PR these are two different subqueries, so neither can wait for the other's result.

But after this PR they are the same subquery, and the steps are as follows:

1. t1.Filter.prepareSubqueries prepares the subquery.
2. t2.Filter.prepareSubqueries does not submit the subquery's execute() again.
3. t2.Filter.waitForSubqueries() can wait for the subquery that step 1 submitted.
4. t1.Filter.waitForSubqueries() does not await the subquery's result, because step 3 has already updated it.

So I added some logic to ScalarSubquery to handle this.
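
A minimal sketch of the behaviour these steps need, assuming a toy model and not the logic actually added to ScalarSubquery in this PR. `SubqueryRegistry`, `submitOnce`, and `awaitResult` are hypothetical names. The point is that the subquery is submitted by whichever operator reaches it first, and every other operator waits on that same future instead of submitting it again.

```scala
import scala.collection.mutable
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object SharedSubquerySketch {
  private implicit val ec: ExecutionContext = ExecutionContext.global

  // Hypothetical registry shared by all operators of one query,
  // keyed by a string that identifies the subquery plan.
  final class SubqueryRegistry(run: String => Long) {
    private val submitted = mutable.HashMap.empty[String, Future[Long]]

    // Submit the subquery only if nobody has yet; always return the
    // shared future. The lock makes check-and-insert atomic.
    def submitOnce(plan: String): Future[Long] = synchronized {
      submitted.getOrElseUpdate(plan, Future(run(plan)))
    }

    // Waiting goes through submitOnce too, so an operator that waits
    // before any prepare call has run still gets a result.
    def awaitResult(plan: String): Long =
      Await.result(submitOnce(plan), 10.seconds)
  }
}
```

In the broadcast join example, both Filter operators would call `submitOnce` from prepareSubqueries and `awaitResult` from waitForSubqueries, in whatever order the plan reaches them.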

cloud-fan · 2016-07-25T13:09:57Z

@lianhuiwang, after taking a look at this example, I think this is a very special case: two physical plans reference the same subquery (the same instance). However, I don't think this is a valid case; I'd rather treat it as a bug in constraint propagation. Apart from this case, is there another case that the simpler approach can't handle?

lianhuiwang · 2016-07-25T14:43:03Z

@cloud-fan I don't think it is a bug in constraint propagation, because a filter with an uncorrelated scalar subquery needs to be pushed down, since it can filter out many records.
In addition, consider the following query (closer to TPCDS-Q14):

with  avg_table as
(select avg (key) as avg_key from t3)
select 1 as col
from t3 where key > (select avg_key from avg_table)
union all
select 1 as col from t1 join (select key, value from t2
where key > (select avg_key from avg_table))t on (t1.key = t.key)

When a subquery used under BroadcastExchangeExec also appears elsewhere in the query, the first place prepares the subquery first, but the second place, under BroadcastExchangeExec, waits for the subquery first, because BroadcastExchangeExec.doPrepare executes the child plan.
So BroadcastExchangeExec's child plan needs to wait for the result of the subquery that the first place has submitted.

cloud-fan · 2016-07-26T02:52:30Z

Ah, you are right. I think we need a better approach to execute subqueries and reuse the results globally. cc @hvanhovell to comment more on this.

viirya · 2016-08-12T03:54:55Z

I think this is a duplicate of the already merged #14548?

hvanhovell · 2016-08-30T20:17:25Z

@lianhuiwang we have merged PR #14548 implementing similar functionality. Could you close this PR? Thanks for your work!

lianhuiwang · 2016-09-01T01:45:15Z

OK. Thanks.

lianhuiwang added 2 commits July 9, 2016 12:18

init commit

5290e42

fix explain

b1914de

lianhuiwang added 3 commits July 9, 2016 17:07

fix bug

77ea002

fix minor

0311542

fix transient

c4bb273

fix style

1d7bd3c

cloud-fan reviewed Jul 13, 2016

lianhuiwang closed this Sep 1, 2016
