
Conversation

@cfmcgrady
Contributor

What changes were proposed in this pull request?

Saving an empty DataFrame with partitions should write a metadata-only file, making this behavior consistent with saving a non-partitioned DataFrame (see PR-20525).

// create an empty DataFrame that still carries a schema
import org.apache.spark.sql.SaveMode
import spark.implicits._

val inputDF = Seq(
  ("value1", "value2", "partition1"),
  ("value3", "value4", "partition2"))
  .toDF("some_column_1", "some_column_2", "some_partition_column_1")
  .where("1 == 2") // filter out every row, keeping only the schema

// write dataframe into partitions
inputDF.write
  .partitionBy("some_partition_column_1")
  .mode(SaveMode.Overwrite)
  .parquet("/tmp/parquet/t1")


// read the DataFrame back
val readDF = spark.read.parquet("/tmp/parquet/t1")

Before this PR, reading the directory back throws an AnalysisException, because only the _SUCCESS marker file is written:

$ tree /tmp/parquet/t1
/tmp/parquet/t1
└── _SUCCESS

0 directories, 1 file

After this PR:

$ tree /tmp/parquet/t1
/tmp/parquet/t1
├── _SUCCESS
└── some_partition_column_1=__HIVE_DEFAULT_PARTITION__
    └── part-00000-2a29f11e-64fb-450d-8916-91ccac53476c.c000.snappy.parquet

1 directory, 2 files
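The `__HIVE_DEFAULT_PARTITION__` directory name in the tree above is Hive's placeholder for a null (or, as here, absent) partition value. As a rough illustration of how such a directory name maps back to a column/value pair, here is a simplified, self-contained sketch; Spark's actual partition-path parser additionally handles URL-escaped values, type inference, and nested partition directories:

```scala
// Decode a Hive-style partition directory name such as
// "some_partition_column_1=__HIVE_DEFAULT_PARTITION__" into a
// (column, Option[value]) pair. Illustrative sketch only, not
// Spark's real implementation.
object PartitionPath {
  val HiveDefaultPartition = "__HIVE_DEFAULT_PARTITION__"

  def decode(dirName: String): (String, Option[String]) = {
    // split on the first '=' only, so values containing '=' survive
    val Array(column, rawValue) = dirName.split("=", 2)
    val value =
      if (rawValue == HiveDefaultPartition) None // null / missing value
      else Some(rawValue)
    (column, value)
  }
}
```

For example, `PartitionPath.decode("some_partition_column_1=__HIVE_DEFAULT_PARTITION__")` yields `("some_partition_column_1", None)`, which is why the read side can still recover the partition column in the schema even though no row supplied a value for it.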

Does this PR introduce any user-facing change?

No.

How was this patch tested?

New tests.

@github-actions github-actions bot added the SQL label Jun 6, 2021
@AmplabJenkins

Can one of the admins verify this patch?

@cfmcgrady cfmcgrady closed this Jun 6, 2021
