Conversation

@linhongliu-db
Contributor

@linhongliu-db linhongliu-db commented Dec 29, 2022

What changes were proposed in this pull request?

This PR proposes to group all sub-executions together in SQL UI if they belong to the same root execution.

This feature is controlled by the conf `spark.ui.sql.groupSubExecutionEnabled`, whose default value is `true`.
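
For example, the grouping could be disabled at submit time like any other static UI configuration (a sketch; the application file name is a placeholder):

```shell
# Disable sub-execution grouping in the SQL UI (defaults to true per this PR)
spark-submit \
  --conf spark.ui.sql.groupSubExecutionEnabled=false \
  your_app.py
```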

We can have some follow-up improvements after this PR:

  1. Add links to SQL page and Job page to indicate the root execution ID.
  2. Better handling for the case where the root execution is missing (e.g., evicted due to the retained-executions limit). In this PR, such sub-executions are displayed ungrouped.

Why are the changes needed?

Better user experience.

In PR #39220, a CTAS query triggers a sub-execution to perform the data insertion, but the current UI displays the two executions separately, which may confuse users.
In addition, this change should also help structured streaming cases.

Does this PR introduce any user-facing change?

Yes. Screenshots of the UI change are shown below.
SQL Query:

CREATE TABLE t USING PARQUET AS SELECT 'a' as a, 1 as b

UI before this PR:
[screenshot: Screen Shot 2022-12-28 at 4 42 08 PM]

UI after this PR, with sub-executions collapsed:
[screenshot: Screen Shot 2022-12-28 at 4 44 32 PM]

UI after this PR, with a sub-execution expanded:
[screenshot: Screen Shot 2022-12-28 at 4 44 41 PM]

How was this patch tested?

UT

@linhongliu-db
Contributor Author

cc @cloud-fan @HeartSaVioR

Contributor

executionIdToSubExecutions(e.rootExecutionId) += e
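
The quoted line appends each execution to a bucket keyed by its root execution ID. The grouping idea can be sketched in Python (the real code is Scala in the SQL UI store; field names here are illustrative):

```python
from collections import defaultdict

def group_by_root(executions):
    # An execution is a root when its root_execution_id equals its own
    # execution_id; sub-executions carry their parent's root id.
    buckets = defaultdict(list)
    for e in executions:
        buckets[e["root_execution_id"]].append(e)
    return dict(buckets)

executions = [
    {"execution_id": 1, "root_execution_id": 1},  # root (e.g. the CTAS)
    {"execution_id": 2, "root_execution_id": 1},  # sub-execution (the insert)
    {"execution_id": 3, "root_execution_id": 3},  # an unrelated root
]
grouped = group_by_root(executions)
```

The UI then renders one collapsible row per root bucket.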

Contributor Author

done

@cloud-fan
Contributor

also cc @ulysses-you

@linhongliu-db
Contributor Author

The test failed at the Python linter; it should be caused by some "connect" PRs.

annotations failed mypy checks:
python/pyspark/sql/connect/client.py:25: error: Skipping analyzing "grpc_status": module is installed, but missing library stubs or py.typed marker  [import]
python/pyspark/sql/connect/client.py:25: note: See https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-imports
python/pyspark/sql/connect/client.py:30: error: Skipping analyzing "google.rpc": module is installed, but missing library stubs or py.typed marker  [import]
Found 2 errors in 1 file (checked 381 source files)

@cloud-fan
Contributor

@zhengruifeng @HyukjinKwon are you aware of anything about this python failure?

@zhengruifeng
Contributor

@cloud-fan I cannot reproduce this failure in my local env.

The latest mypy check on master also succeeded: https://github.com/apache/spark/actions/runs/3804744443/jobs/6472202104

@cloud-fan
Contributor

@linhongliu-db can you rebase your branch and try again?

Member

This PR introduces 4 config namespace groups like the following. Shall we simplify the config namespace?

spark.ui.sql.*
spark.ui.sql.group.*
spark.ui.sql.group.sub.*
spark.ui.sql.group.sub.execution.*

Contributor Author

Changed to `spark.ui.sql.groupSubExecutionEnabled`, but I'm glad to take any other naming suggestions. :)

Member

Do we have any other usage for these methods, `setRootExecutionId` and `unsetRootExecutionId`? Each seems to be used only once.

Contributor Author

Yes, it's only used once. I personally think that using a function better explains the logic, since it's not a no-brainer.
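
For context, a rough sketch of what `setRootExecutionId` amounts to (Python for illustration; the property key string is a hypothetical stand-in for `EXECUTION_ROOT_ID_KEY`):

```python
EXECUTION_ROOT_ID_KEY = "spark.sql.execution.root.id"  # hypothetical key name

def set_root_execution_id(local_props, execution_id):
    # Only the outermost execution records itself as the root; nested
    # executions that start while the key is already set inherit it.
    if local_props.get(EXECUTION_ROOT_ID_KEY) is None:
        local_props[EXECUTION_ROOT_ID_KEY] = execution_id

props = {}
set_root_execution_id(props, "1")  # outer query becomes the root
set_root_execution_id(props, "2")  # nested query keeps the outer root
```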

Member

Just a question: do we have test coverage that uses only Spark event logs to validate this code path?


Member

This `<tr>` doesn't need additional indentation here. Could you align the indentation with the previous `<tr>` at line 389?

Contributor Author

done

@linhongliu-db
Contributor Author

Working on the comments.

/**
 * Unset the "root" SQL Execution Id once the "root" SQL execution completes.
 */
private def unsetRootExecutionId(sc: SparkContext, executionId: String): Unit = {
Member

This method is also misleading, because we set `EXECUTION_ROOT_ID_KEY` to `null` only when it equals `executionId`.

Contributor Author

On second thought, this function wrapper is misleading and doesn't make things clearer, so I inlined it into the main function. Thanks for the suggestion.
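
Before the inlining, the conditional unset discussed here amounted to the following (hedged Python sketch; the key string is a hypothetical stand-in for `EXECUTION_ROOT_ID_KEY`):

```python
EXECUTION_ROOT_ID_KEY = "spark.sql.execution.root.id"  # hypothetical key name

def unset_root_execution_id(local_props, execution_id):
    # Clear the root marker only if this execution is the one that set it,
    # so a finishing sub-execution does not wipe out its parent's root id.
    if local_props.get(EXECUTION_ROOT_ID_KEY) == execution_id:
        local_props[EXECUTION_ROOT_ID_KEY] = None

props = {"spark.sql.execution.root.id": "1"}
unset_root_execution_id(props, "2")  # sub-execution finishing: no-op
unset_root_execution_id(props, "1")  # root finishing: clears the marker
```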

} finally {
  executionIdToQueryExecution.remove(executionId)
  sc.setLocalProperty(EXECUTION_ID_KEY, oldExecutionId)
  unsetRootExecutionId(sc, executionId.toString)
Member

If we need to define a new method, shall we define it to accept a `Long` directly?

Member

@dongjoon-hyun dongjoon-hyun left a comment

Thank you for the updates, @linhongliu-db. I have only minor comments.

cc @gengliangwang , too

case class SparkListenerSQLExecutionStart(
    executionId: Long,
    // if the execution is a root, then rootExecutionId == executionId
    rootExecutionId: Long,

Contributor Author

done


new SQLExecutionUIData(
  executionId = ui.getExecutionId,
  rootExecutionId = ui.getExecutionId,
Member

This should be `ui.getRootExecutionId` after updating the protobuf definition.

Contributor Author

done


new SQLExecutionUIData(
  executionId = 0,
  rootExecutionId = 0,
Member

For testing purposes, let's use a value different from `executionId`.

Contributor Author

done

@dongjoon-hyun
Member

Could you update your PR, @linhongliu-db? We have the Apache Spark 3.4 feature freeze coming up.

Also, cc @xinrong-meng as the Apache Spark 3.4 release manager.

@linhongliu-db
Contributor Author

@dongjoon-hyun working on it

@cloud-fan
Contributor

The failed YarnClusterSuite is definitely unrelated. I'm merging it to master, thanks!

@cloud-fan cloud-fan closed this in c124037 Jan 10, 2023
@dongjoon-hyun
Member

Thank you, @linhongliu-db and @cloud-fan .

@linhongliu-db
Contributor Author

Thank you everyone for reviewing this!

@linhongliu-db linhongliu-db deleted the SPARK-41752 branch January 10, 2023 18:26
gengliangwang added a commit that referenced this pull request Jan 11, 2023
…nUIData

### What changes were proposed in this pull request?

The new field `rootExecutionId` of `SQLExecutionUIData` is not correctly serialized/deserialized in #39268. This PR is to fix it.

### Why are the changes needed?

Bug fix

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

UT

Closes #39489 from gengliangwang/SPARK-41752.

Authored-by: Gengliang Wang <[email protected]>
Signed-off-by: Gengliang Wang <[email protected]>
dongjoon-hyun pushed a commit that referenced this pull request Mar 14, 2023
… execution

### What changes were proposed in this pull request?
#39268 / [SPARK-41752](https://issues.apache.org/jira/browse/SPARK-41752) added a new non-optional `rootExecutionId: Long` field to the SparkListenerSQLExecutionStart case class.

When JsonProtocol deserializes this event, it uses the "ignore missing properties" Jackson deserialization option, causing the `rootExecutionId` field to be initialized with a default value of 0.

The value 0 is a legitimate execution ID, so in the deserialized event there is no way to distinguish between the absence of a value and a case where all queries have the first query as the root.

Thanks JoshRosen for reporting and investigating this issue.

### Why are the changes needed?
Bug fix

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
UT

Closes #40403 from linhongliu-db/fix-nested-execution.

Authored-by: Linhong Liu <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
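
The pitfall this commit describes can be reproduced in miniature (a Python sketch; the JSON shape and helper are illustrative, not Spark's actual JsonProtocol):

```python
import json

def deserialize_start_event(payload):
    data = json.loads(payload)
    # Mimic Jackson's "ignore missing properties" behavior: an absent
    # field silently falls back to the numeric default, 0.
    return {
        "executionId": data.get("executionId", 0),
        "rootExecutionId": data.get("rootExecutionId", 0),
    }

# An old event log written before rootExecutionId existed:
old_event = deserialize_start_event('{"executionId": 0}')
# A new event where execution 0 genuinely is its own root:
new_event = deserialize_start_event('{"executionId": 0, "rootExecutionId": 0}')
# The two are indistinguishable after deserialization, hence the fix.
```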
dongjoon-hyun pushed a commit that referenced this pull request Mar 14, 2023
… execution

(Same commit message as the commit above.)
(cherry picked from commit 4db8e7b)
Signed-off-by: Dongjoon Hyun <[email protected]>
a0x8o added a commit to a0x8o/spark that referenced this pull request Mar 14, 2023
… execution

(Same commit message as the commit above.)
snmvaughan pushed a commit to snmvaughan/spark that referenced this pull request Jun 20, 2023
… execution

(Same commit message as the commit above.)
(cherry picked from commit 4db8e7b)
Signed-off-by: Dongjoon Hyun <[email protected]>
@wangyum
Member

wangyum commented Jul 17, 2023

@linhongliu-db It seems this patch makes CTAS miss the child info in the UI: https://issues.apache.org/jira/browse/SPARK-44213


8 participants