@cfmcgrady
Contributor

What changes were proposed in this pull request?

Saving an empty DataFrame with partitions should write a metadata-only file, matching the behavior for a non-partitioned DataFrame (see PR-20525).

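// assumes a spark-shell session, where `spark` and its implicits are already in scope
import org.apache.spark.sql.SaveMode

// create an empty DF with schema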
val inputDF = Seq(
  ("value1", "value2", "partition1"),
  ("value3", "value4", "partition2"))
  .toDF("some_column_1", "some_column_2", "some_partition_column_1")
  .where("1==2")

// write dataframe into partitions
inputDF.write
  .partitionBy("some_partition_column_1")
  .mode(SaveMode.Overwrite)
  .parquet("/tmp/parquet/t1")


// Read dataframe
val readDF = spark.read.parquet("/tmp/parquet/t1")
readDF.printSchema()

Before this PR, the read throws an AnalysisException, because the write produces only a _SUCCESS file:

 [SPARK-35592●●] >tree /tmp/parquet/t1
/tmp/parquet/t1
└── _SUCCESS

0 directories, 1 file
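
For reviewers who want to reproduce this outside spark-shell, here is a minimal standalone sketch of the same steps. It assumes a local SparkSession; the object name `EmptyPartitionedWriteRepro` is hypothetical and not part of this patch. It reports whether the read succeeds or fails instead of letting the exception propagate.

```scala
import org.apache.spark.sql.{AnalysisException, SaveMode, SparkSession}
import scala.util.{Failure, Success, Try}

// Hypothetical repro harness, not part of the patch.
object EmptyPartitionedWriteRepro {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("empty-partitioned-write-repro")
      .getOrCreate()
    import spark.implicits._

    val path = "/tmp/parquet/t1"

    // Build an empty DataFrame that still carries a schema, then write it partitioned.
    Seq(("value1", "value2", "partition1"))
      .toDF("some_column_1", "some_column_2", "some_partition_column_1")
      .where("1==2")
      .write
      .partitionBy("some_partition_column_1")
      .mode(SaveMode.Overwrite)
      .parquet(path)

    // Read it back; print the schema on success, or the error message on failure.
    Try(spark.read.parquet(path)) match {
      case Success(df)                   => df.printSchema()
      case Failure(e: AnalysisException) => println(s"Read failed: ${e.getMessage}")
      case Failure(other)                => throw other
    }

    spark.stop()
  }
}
```

Running it on a build without this change should take the `AnalysisException` branch; with the change applied it should print the schema.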

After this PR

 [SPARK-35592●●] >tree /tmp/parquet/t1
/tmp/parquet/t1
├── _SUCCESS
└── some_partition_column_1=__HIVE_DEFAULT_PARTITION__
    └── part-00000-2a29f11e-64fb-450d-8916-91ccac53476c.c000.snappy.parquet

1 directory, 2 files

root
 |-- some_column_1: string (nullable = true)
 |-- some_column_2: string (nullable = true)
 |-- some_partition_column_1: null (nullable = true)

Does this PR introduce any user-facing change?

No.

How was this patch tested?

New tests.

@AmplabJenkins

Can one of the admins verify this patch?

@cfmcgrady
Contributor Author

cfmcgrady commented Jun 9, 2021

Canceling since a higher priority waiting request for 'apache/spark-refs/heads/master-On pull requests' exists

Hi @HyukjinKwon, I don't know why the checks were canceled. What can I do to enable the check actions? Thank you for your help.

@github-actions

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions bot added the Stale label Sep 18, 2021
@github-actions github-actions bot closed this Sep 19, 2021