Commit a8c08b1
[SPARK-31187][SQL] Sort the whole-stage codegen debug output by codegenStageId
### What changes were proposed in this pull request?
Spark SQL's whole-stage codegen (WSCG) supports dumping the generated code to help with debugging. One way to get the generated code is through `df.queryExecution.debug.codegen`, or SQL `EXPLAIN CODEGEN` statement.
The generated code is currently printed without specific ordering, which can make debugging a bit annoying. This PR makes a minor improvement to sort the codegen dump by the `codegenStageId`, ascending.
After this change, the following query:
```scala
spark.range(10).agg(sum('id)).queryExecution.debug.codegen
```
will always dump the generated code in a natural, stable order. A version of this example with shorter output is:
```
spark.range(10).agg(sum('id)).queryExecution.debug.codegenToSeq.map(_._1).foreach(println)
*(1) HashAggregate(keys=[], functions=[partial_sum(id#8L)], output=[sum#15L])
+- *(1) Range (0, 10, step=1, splits=16)
*(2) HashAggregate(keys=[], functions=[sum(id#8L)], output=[sum(id)#12L])
+- Exchange SinglePartition, true, [id=#30]
+- *(1) HashAggregate(keys=[], functions=[partial_sum(id#8L)], output=[sum#15L])
+- *(1) Range (0, 10, step=1, splits=16)
```
The number of codegen stages within a single SQL query tends to be very small, most likely < 50, so the overhead of adding the sorting shouldn't be significant.
### Why are the changes needed?
Minor improvement to aid WSCG debugging.
### Does this PR introduce any user-facing change?
No user-facing change for end-users; minor change for developers who debug WSCG generated code.
### How was this patch tested?
Manually tested the output; all other tests still pass.
Closes #27955 from rednaxelafx/codegen.
Authored-by: Kris Mok <[email protected]>
Signed-off-by: Takeshi Yamamuro <[email protected]>
(cherry picked from commit a177628)
Signed-off-by: Takeshi Yamamuro <[email protected]>1 parent d712a7a commit a8c08b1
File tree
1 file changed
+1
-1
lines changed- sql/core/src/main/scala/org/apache/spark/sql/execution/debug
1 file changed
+1
-1
lines changedLines changed: 1 addition & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
113 | 113 | | |
114 | 114 | | |
115 | 115 | | |
116 | | - | |
| 116 | + | |
117 | 117 | | |
118 | 118 | | |
119 | 119 | | |
| |||
0 commit comments