Commit 33c8a49

gatorsmile authored and marmbrus committed
[SPARK-12989][SQL] Delaying Alias Cleanup after ExtractWindowExpressions
JIRA: https://issues.apache.org/jira/browse/SPARK-12989

In the rule `ExtractWindowExpressions`, we simply replace alias by the corresponding attribute. However, this will cause an issue exposed by the following case:

```scala
val data = Seq(("a", "b", "c", 3), ("c", "b", "a", 3)).toDF("A", "B", "C", "num")
  .withColumn("Data", struct("A", "B", "C"))
  .drop("A")
  .drop("B")
  .drop("C")

val winSpec = Window.partitionBy("Data.A", "Data.B").orderBy($"num".desc)
data.select($"*", max("num").over(winSpec) as "max").explain(true)
```

In this case, both `Data.A` and `Data.B` are `alias` in `WindowSpecDefinition`. If we replace these alias expression by their alias names, we are unable to know what they are since they will not be put in `missingExpr` too.

Author: gatorsmile <[email protected]>
Author: xiaoli <[email protected]>
Author: Xiao Li <[email protected]>

Closes #10963 from gatorsmile/seletStarAfterColDrop.
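The problem described in the commit message can be illustrated with a small stand-alone model. This is only a sketch: `Expr`, `Attr`, `GetField`, `Alias`, `eager`, `cleanup` and `resolvable` are hypothetical stand-ins invented for illustration, not the Catalyst classes, and the real rule tracks considerably more state. It assumes that an expression can be evaluated only if every attribute it references is produced by the input.

```scala
// Toy model only: every type and helper here is a hypothetical stand-in,
// not part of Spark's Catalyst API.
object AliasSketch {
  sealed trait Expr
  // A reference to an input column by name.
  final case class Attr(name: String) extends Expr
  // Field access on a struct-typed expression, e.g. Data.A.
  final case class GetField(child: Expr, field: String) extends Expr
  // A named wrapper around an arbitrary expression.
  final case class Alias(child: Expr, name: String) extends Expr {
    def toAttribute: Attr = Attr(name)
  }

  // Old behaviour (sketch): swap the alias for an attribute carrying its name.
  def eager(e: Expr): Expr = e match {
    case a: Alias => a.toAttribute
    case other    => other
  }

  // New behaviour (sketch): keep the alias and strip the wrapper in a later pass.
  def cleanup(e: Expr): Expr = e match {
    case Alias(child, _)        => cleanup(child)
    case GetField(child, field) => GetField(cleanup(child), field)
    case other                  => other
  }

  // Names of all input columns an expression depends on.
  def references(e: Expr): Set[String] = e match {
    case Attr(name)         => Set(name)
    case GetField(child, _) => references(child)
    case Alias(child, _)    => references(child)
  }

  // An expression can be evaluated only if the input provides all its references.
  def resolvable(e: Expr, input: Set[String]): Boolean =
    references(e).subsetOf(input)

  // After dropping A, B and C, the input exposes only Data and num.
  val input: Set[String] = Set("Data", "num")
  // The partition key Data.A, wrapped in an alias named A.
  val key: Expr = Alias(GetField(Attr("Data"), "A"), "A")
}
```

In this model, `eager(key)` yields `Attr("A")`, which points at a column the input no longer provides, while `cleanup(key)` yields `GetField(Attr("Data"), "A")`, which depends only on `Data`. That mirrors the intent of the change: the alias is kept until a later cleanup step, so the underlying expression is never lost.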
1 parent 6075573 commit 33c8a49

2 files changed: +13 -2 lines changed

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala

Lines changed: 3 additions & 2 deletions
@@ -883,12 +883,13 @@ class Analyzer(
             if (missingExpr.nonEmpty) {
               extractedExprBuffer += ne
             }
-            ne.toAttribute
+            // alias will be cleaned in the rule CleanupAliases
+            ne
           case e: Expression if e.foldable =>
             e // No need to create an attribute reference if it will be evaluated as a Literal.
           case e: Expression =>
             // For other expressions, we extract it and replace it with an AttributeReference (with
-            // an interal column name, e.g. "_w0").
+            // an internal column name, e.g. "_w0").
             val withName = Alias(e, s"_w${extractedExprBuffer.length}")()
             extractedExprBuffer += withName
             withName.toAttribute

sql/core/src/test/scala/org/apache/spark/sql/DataFrameWindowSuite.scala

Lines changed: 10 additions & 0 deletions
@@ -344,4 +344,14 @@ class DataFrameWindowSuite extends QueryTest with SharedSQLContext {
         Row("b", 1, null, null, null, null, null, null),
         Row("b", 2, null, null, null, null, null, null)))
   }
+
+  test("SPARK-12989 ExtractWindowExpressions treats alias as regular attribute") {
+    val src = Seq((0, 3, 5)).toDF("a", "b", "c")
+      .withColumn("Data", struct("a", "b"))
+      .drop("a")
+      .drop("b")
+    val winSpec = Window.partitionBy("Data.a", "Data.b").orderBy($"c".desc)
+    val df = src.select($"*", max("c").over(winSpec) as "max")
+    checkAnswer(df, Row(5, Row(0, 3), 5))
+  }
 }
