Commit 8efb710

cloud-fan authored and dongjoon-hyun committed
[SPARK-31091] Revert SPARK-24640 Return NULL from size(NULL) by default
### What changes were proposed in this pull request?

This PR reverts #26051 and #26066.

### Why are the changes needed?

There is no standard requiring that `size(null)` must return null, and returning -1 looks reasonable as well. The original change was largely cosmetic, and we should avoid such changes if they break existing queries. This is similar to reverting the TRIM function parameter order change.

### Does this PR introduce any user-facing change?

Yes, it changes the behavior of `size(null)` back to be the same as in 2.4.

### How was this patch tested?

N/A

Closes #27834 from cloud-fan/revert.

Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
1 parent 5be0d04 commit 8efb710

3 files changed: 3 additions and 5 deletions

docs/sql-migration-guide.md

Lines changed: 0 additions & 2 deletions
@@ -217,8 +217,6 @@ license: |
  - `now` - current query start time
  For example `SELECT timestamp 'tomorrow';`.
 
- - Since Spark 3.0, the `size` function returns `NULL` for the `NULL` input. In Spark version 2.4 and earlier, this function gives `-1` for the same input. To restore the behavior before Spark 3.0, you can set `spark.sql.legacy.sizeOfNull` to `true`.
-
  - Since Spark 3.0, when the `array`/`map` function is called without any parameters, it returns an empty collection with `NullType` as element type. In Spark version 2.4 and earlier, it returns an empty collection with `StringType` as element type. To restore the behavior before Spark 3.0, you can set `spark.sql.legacy.createEmptyCollectionUsingStringType` to `true`.
 
  - Since Spark 3.0, the interval literal syntax does not allow multiple from-to units anymore. For example, `SELECT INTERVAL '1-1' YEAR TO MONTH '2-2' YEAR TO MONTH'` throws parser exception.

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala

Lines changed: 2 additions & 2 deletions
@@ -79,7 +79,7 @@ trait BinaryArrayExpressionWithImplicitCast extends BinaryExpression
  _FUNC_(expr) - Returns the size of an array or a map.
  The function returns -1 if its input is null and spark.sql.legacy.sizeOfNull is set to true.
  If spark.sql.legacy.sizeOfNull is set to false, the function returns null for null input.
- By default, the spark.sql.legacy.sizeOfNull parameter is set to false.
+ By default, the spark.sql.legacy.sizeOfNull parameter is set to true.
  """,
  examples = """
  Examples:
@@ -88,7 +88,7 @@ trait BinaryArrayExpressionWithImplicitCast extends BinaryExpression
  > SELECT _FUNC_(map('a', 1, 'b', 2));
  2
  > SELECT _FUNC_(NULL);
- NULL
+ -1
  """)
  case class Size(child: Expression, legacySizeOfNull: Boolean)
  extends UnaryExpression with ExpectsInputTypes {
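
The contract stated in the docstring above can be sketched in a few lines of plain Scala. This is only an illustrative model, not Spark's `Size` implementation: `SizeOfNullSketch` and `sizeOf` are hypothetical names, and `Option` is assumed here as a stand-in for SQL `NULL`.

```scala
// Simplified stand-in for the null handling described in the docstring.
// NOT Spark's actual Size expression; all names here are hypothetical.
object SizeOfNullSketch {
  /** Models size(input), where `None` plays the role of SQL NULL. */
  def sizeOf(input: Option[Seq[Any]], legacySizeOfNull: Boolean): Option[Int] =
    input match {
      case Some(values)             => Some(values.length) // non-null input: element count
      case None if legacySizeOfNull => Some(-1)            // legacy flag on: NULL input gives -1
      case None                     => None                // legacy flag off: NULL input gives NULL
    }

  def main(args: Array[String]): Unit = {
    println(sizeOf(Some(Seq("b", "d", "c", "a")), legacySizeOfNull = true)) // Some(4)
    println(sizeOf(None, legacySizeOfNull = true))                          // Some(-1)
    println(sizeOf(None, legacySizeOfNull = false))                         // None
  }
}
```

With the default restored to `true`, the second call reflects what `SELECT size(NULL)` returns unless the flag is explicitly disabled.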

sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala

Lines changed: 1 addition & 1 deletion
@@ -2157,7 +2157,7 @@ object SQLConf {
  "The size function returns null for null input if the flag is disabled.")
  .version("2.4.0")
  .booleanConf
- .createWithDefault(false)
+ .createWithDefault(true)
 
  val LEGACY_REPLACE_DATABRICKS_SPARK_AVRO_ENABLED =
  buildConf("spark.sql.legacy.replaceDatabricksSparkAvro.enabled")
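
What flipping the default means for users can be illustrated with a small model of a boolean config entry. This is a sketch under stated assumptions, not Spark's `SQLConf` builder API: `BooleanConfEntry` and `readFrom` are hypothetical names, and session settings are modeled as a plain `Map[String, String]`.

```scala
// Hypothetical model of a boolean config entry with a default value.
// This is NOT Spark's SQLConf API; it only contrasts default vs. explicit settings.
object ConfDefaultSketch {
  final case class BooleanConfEntry(key: String, default: Boolean) {
    /** Returns the explicitly set value if present, otherwise the default. */
    def readFrom(settings: Map[String, String]): Boolean =
      settings.get(key).map(_.toBoolean).getOrElse(default)
  }

  // Default after this commit: the legacy behavior (size(NULL) = -1) is enabled.
  val sizeOfNull: BooleanConfEntry =
    BooleanConfEntry("spark.sql.legacy.sizeOfNull", default = true)

  def main(args: Array[String]): Unit = {
    // No explicit setting: the default applies.
    println(sizeOfNull.readFrom(Map.empty)) // true
    // An explicit setting still overrides the default.
    println(sizeOfNull.readFrom(Map("spark.sql.legacy.sizeOfNull" -> "false"))) // false
  }
}
```

Only sessions that never set the key are affected by the changed default; an explicit `false` continues to select the `NULL`-returning behavior.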
