
Commit f87639e

[SPARK-274034][SQL] Table statistics shall be updated automatically if the auto-update feature is enabled (spark.sql.statistics.size.autoUpdate.enabled=true)
What changes were proposed in this pull request? When the user sets spark.sql.statistics.size.autoUpdate.enabled=true, table statistics for an INSERT OVERWRITE command should be computed automatically and recorded in the metastore. Currently this does not happen because of the table.stats.nonEmpty validation: statistics are never recorded for a newly created table, and that check does not hold when the auto-update feature is enabled. As part of the fix, the autoSizeUpdateEnabled check has been pulled up into a separate validation, which ensures that when the feature is enabled the system calculates the size of the table on every insert command and records it in the metastore. How was this patch tested? A unit test was written and the change was manually verified on a cluster, together with some internal tests.
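The control flow of the fix described above can be sketched as follows. This is a simplified, self-contained model, not the actual Spark code: the names `CatalogStatistics`, `CatalogTable`, and `updateTableStats` mirror Spark's internals, but the bodies are illustrative only.

```scala
// Simplified model of the described fix: the auto-update check is pulled
// out ahead of the old `table.stats.nonEmpty` guard, so a freshly created
// table (with no stats yet) still gets its size recorded on insert.
case class CatalogStatistics(sizeInBytes: BigInt)
case class CatalogTable(stats: Option[CatalogStatistics])

def updateTableStats(
    table: CatalogTable,
    autoSizeUpdateEnabled: Boolean,
    calculateTotalSize: () => BigInt): Option[CatalogStatistics] = {
  if (autoSizeUpdateEnabled) {
    // New behavior: with the flag on, recompute the size on every insert,
    // even for a table that has no statistics yet.
    Some(CatalogStatistics(calculateTotalSize()))
  } else if (table.stats.nonEmpty) {
    // Old path: without auto-update, stale stats are simply dropped and
    // left for a later ANALYZE TABLE to recompute.
    None
  } else {
    // Old path, new table: nothing was ever recorded.
    None
  }
}
```

Before the fix, the `table.stats.nonEmpty` branch was evaluated first, so the auto-update branch was never reached for a new table.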
1 parent b143e84 commit f87639e

File tree

1 file changed (+1, −1)


sql/core/src/test/scala/org/apache/spark/sql/StatisticsCollectionSuite.scala

Lines changed: 1 addition & 1 deletion

@@ -342,7 +342,7 @@ class StatisticsCollectionSuite extends StatisticsCollectionTestBase with Shared
     val autoUpdate = true
     withSQLConf(SQLConf.AUTO_SIZE_UPDATE_ENABLED.key -> autoUpdate.toString) {
       withTable(table) {
-        sql(s"CREATE TABLE $table (i int, j string) STORED AS PARQUET")
+        sql(s"CREATE TABLE $table (i int, j string) USING PARQUET")
         // analyze to get initial stats
         // insert into command
         sql(s"INSERT INTO TABLE $table SELECT 1, 'abc'")
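The `withSQLConf` helper used in this test follows a loan pattern: set a config entry, run the body, then restore the previous value even if the body throws. A minimal standalone sketch of that pattern, using a plain mutable map in place of Spark's `SQLConf` (the `ConfDemo` name and structure are illustrative, not Spark's actual implementation):

```scala
// Loan-pattern sketch of a withSQLConf-style helper. A mutable map stands
// in for Spark's SQLConf; this is illustrative only.
object ConfDemo {
  val conf = scala.collection.mutable.Map[String, String]()

  def withConf[T](pairs: (String, String)*)(body: => T): T = {
    // Remember the previous values so they can be restored afterwards.
    val previous = pairs.map { case (k, _) => k -> conf.get(k) }
    pairs.foreach { case (k, v) => conf(k) = v }
    try body
    finally previous.foreach {
      case (k, Some(v)) => conf(k) = v   // restore the old value
      case (k, None)    => conf.remove(k) // key was unset before
    }
  }
}
```

Note also the switch from `STORED AS PARQUET` (a Hive-serde table) to `USING PARQUET` (a native data source table) in the test above; the test's intent is unchanged, only the table provider differs.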
