From e3f523bd6f1b87e1fa44cd1a3c561ebd31716f1e Mon Sep 17 00:00:00 2001 From: Wenchen Fan Date: Sat, 23 Dec 2023 00:08:33 +0800 Subject: [PATCH] [SPARK-46485][SQL] V1Write should not add Sort when not needed ### What changes were proposed in this pull request? In `V1Writes`, we try to avoid adding Sort if the output ordering always satisfies. However, the code is completely broken with two issues: - we put `SortOrder` as the child of another `SortOrder` and compare, which always returns false. - once we add a project to do `empty2null`, we change the query output attribute id and the sort order never matches. It's not a big issue as we still have QO rules to eliminate useless sorts, but https://github.com/apache/spark/pull/44429 exposes this problem because the way we optimize sort is a bit different. For `V1Writes`, we should always avoid adding sort even if the number of ordering key is less, to not change the user query. ### Why are the changes needed? fix code mistakes. ### Does this PR introduce _any_ user-facing change? no ### How was this patch tested? updated test ### Was this patch authored or co-authored using generative AI tooling? no Closes #44458 from cloud-fan/sort. Authored-by: Wenchen Fan Signed-off-by: Wenchen Fan --- .../org/apache/spark/sql/execution/datasources/V1Writes.scala | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/V1Writes.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/V1Writes.scala index b1d2588ede627..d7a8d7aec0b7b 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/V1Writes.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/V1Writes.scala @@ -102,8 +102,8 @@ object V1Writes extends Rule[LogicalPlan] with SQLConfHelper { val requiredOrdering = write.requiredOrdering.map(_.transform { case a: Attribute => attrMap.getOrElse(a, a) }.asInstanceOf[SortOrder]) - val outputOrdering = query.outputOrdering - val orderingMatched = isOrderingMatched(requiredOrdering, outputOrdering) + val outputOrdering = empty2NullPlan.outputOrdering + val orderingMatched = isOrderingMatched(requiredOrdering.map(_.child), outputOrdering) if (orderingMatched) { empty2NullPlan } else {