[SPARK-28583][SQL] Subqueries should not call onUpdatePlan in Adaptive Query Execution
#25316
Conversation
Test build #108479 has finished for PR 25316 at commit
```diff
  // Apply the same instance of this rule to sub-queries so that sub-queries all share the
  // same `stageCache` for Exchange reuse.
- val adaptivePlan = this.apply(queryExec.sparkPlan)
+ val adaptivePlan = this.applyInternal(queryExec.sparkPlan, queryExec)
```
When we reach here, it means we are creating `AdaptiveSparkPlanExec` for a subquery. Shall we simply set a boolean flag here (e.g. `adaptivePlan.copy(isSubquery = true)`) instead of passing around the `QueryExecution`?
nvm, this is more flexible, in case some places create a `QueryExecution` without an execution id and execute it.
```scala
  session.sparkContext.getLocalProperty(SQLExecution.EXECUTION_ID_KEY)).flatMap { idStr =>
    val id = idStr.toLong
    val qe = SQLExecution.getQueryExecution(id)
    if (qe.eq(queryExecution)) Some(id) else None
```
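The intent of this guard can be sketched outside Spark with a toy registry in place of `SQLExecution`. The names `ExecutionRegistry`, `QueryExec`, and `executionIdFor` below are hypothetical stand-ins, not Spark APIs:

```scala
import scala.collection.mutable

// `QueryExec` stands in for Spark's QueryExecution; `ExecutionRegistry`
// plays the role of SQLExecution's id -> QueryExecution map.
class QueryExec

object ExecutionRegistry {
  private val executions = mutable.Map[Long, QueryExec]()
  def register(id: Long, qe: QueryExec): Unit = executions(id) = qe
  def lookup(id: Long): Option[QueryExec] = executions.get(id)
}

// Yield an execution id only when the registered QueryExec is the *same
// instance* (reference equality via `eq`) as the one this plan was created
// from. A subquery runs under the main query's execution id but has its own
// QueryExec, so it fails the check and skips the UI update.
def executionIdFor(idStr: Option[String], myQe: QueryExec): Option[Long] =
  idStr.flatMap { s =>
    val id = s.toLong
    if (ExecutionRegistry.lookup(id).exists(_ eq myQe)) Some(id) else None
  }

val mainQe = new QueryExec
val subQe = new QueryExec // a subquery's separate QueryExecution
ExecutionRegistry.register(1L, mainQe)

assert(executionIdFor(Some("1"), mainQe).contains(1L)) // main query: update UI
assert(executionIdFor(Some("1"), subQe).isEmpty)       // subquery: no update
```

This mirrors why the fix stops `onUpdatePlan` for subqueries: they can never retrieve an execution id that maps back to their own `QueryExecution` instance.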
Can you add some doc on why this is needed? It is kinda annoying when you have to check the git blame to figure out why code is there.
Test build #108685 has finished for PR 25316 at commit
LGTM if https://github.com/apache/spark/pull/25316/files#r309864918 is addressed
hvanhovell left a comment:
LGTM
Merging to master
Test build #108778 has finished for PR 25316 at commit
Follow-up commit (from PR #27260):

### What changes were proposed in this pull request?
After [PR#25316](#25316) fixed the deadlock issue in [PR#25308](#25308), the subquery metrics could not be shown in the UI (screenshot omitted). This PR fixes the subquery UI issue by adding a `SparkListenerSQLAdaptiveSQLMetricUpdates` event to update the subquery SQL metrics. With this PR, the subquery UI shows correctly (screenshot omitted).

### Why are the changes needed?
To show the subquery metrics in the UI when AQE is enabled.

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
Existing UT

Closes #27260 from JkSelf/fixSubqueryUI. Authored-by: jiake <[email protected]> Signed-off-by: Xiao Li <[email protected]>
…ive Query Execution

## What changes were proposed in this pull request?
Subqueries do not have their own execution id, thus when calling `AdaptiveSparkPlanExec.onUpdatePlan`, it will actually get the `QueryExecution` instance of the main query, which is wasteful and problematic. It could cause issues like stack overflow or deadlocks in some circumstances.

This PR fixes this issue by making `AdaptiveSparkPlanExec` compare the `QueryExecution` object retrieved by the current execution ID against the `QueryExecution` object from which this plan is created, and only update the UI when the two instances are the same.

## How was this patch tested?
Manual tests on TPC-DS queries.

Closes apache#25316 from maryannxue/aqe-updateplan-fix. Authored-by: maryannxue <[email protected]> Signed-off-by: herman <[email protected]>
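The comparison in the fix hinges on Scala's reference equality: `eq` checks object identity, whereas `==` delegates to `equals`, so two structurally equal objects are still told apart. A minimal, self-contained illustration (the `PlanInfo` class is just for demonstration):

```scala
// `==` compares values (via equals); `eq` compares object identity.
case class PlanInfo(name: String)

val a = PlanInfo("q1")
val b = PlanInfo("q1") // structurally equal, but a distinct instance
val c = a              // the very same instance

assert(a == b)         // same contents
assert(!(a eq b))      // different objects: the guard would reject this
assert(a eq c)         // same object: the UI update proceeds
```

Using `eq` is what lets the main query's own plan pass the check while a subquery's plan, built from a different `QueryExecution` instance, does not.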