Commit 473d786

bchocho authored and davies committed
[SPARK-16926] [SQL] Remove partition columns from partition metadata.
## What changes were proposed in this pull request?

This removes partition columns from column metadata of partitions to match tables.

A change introduced in SPARK-14388 removed partition columns from the column metadata of tables, but not for partitions. This causes TableReader to believe that the schema is different between table and partition, and create an unnecessary conversion object inspector in TableReader.

## How was this patch tested?

Existing unit tests.

Author: Brian Cho <[email protected]>

Closes #14515 from dafrista/partition-columns-metadata.

1 parent edb4573 commit 473d786

1 file changed, +7 -1 lines changed

sql/hive/src/main/scala/org/apache/spark/sql/hive/MetastoreRelation.scala

Lines changed: 7 additions & 1 deletion
@@ -161,7 +161,13 @@ private[hive] case class MetastoreRelation(
 
       val sd = new org.apache.hadoop.hive.metastore.api.StorageDescriptor()
       tPartition.setSd(sd)
-      sd.setCols(catalogTable.schema.map(toHiveColumn).asJava)
+
+      // Note: In Hive the schema and partition columns must be disjoint sets
+      val schema = catalogTable.schema.map(toHiveColumn).filter { c =>
+        !catalogTable.partitionColumnNames.contains(c.getName)
+      }
+      sd.setCols(schema.asJava)
+
       p.storage.locationUri.foreach(sd.setLocation)
       p.storage.inputFormat.foreach(sd.setInputFormat)
       p.storage.outputFormat.foreach(sd.setOutputFormat)
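
To make the effect of the filter concrete, here is a minimal standalone sketch of the same idea. It does not use Spark or Hive classes: `Column` is a hypothetical stand-in for Hive's column type, and the example table (`id`, `name`, partitioned by `ds`) and the `dataColumns` helper are invented for illustration. It assumes exact, case-sensitive name matching, as the `contains(c.getName)` call in the patch suggests.

```scala
// Hypothetical stand-in for a Hive column; not the real metastore class.
final case class Column(name: String, dataType: String)

object PartitionColumnsSketch {
  // Drop every column whose name is a partition column,
  // mirroring the filter added in the patch (exact name match assumed).
  def dataColumns(schema: Seq[Column], partitionColumnNames: Seq[String]): Seq[Column] =
    schema.filter(c => !partitionColumnNames.contains(c.name))

  // Invented example table: two data columns plus one partition column, `ds`.
  val fullSchema: Seq[Column] = Seq(
    Column("id", "bigint"),
    Column("name", "string"),
    Column("ds", "string"))
  val partitionColumnNames: Seq[String] = Seq("ds")

  // Table-level column metadata already excludes partition columns
  // (the SPARK-14388 behavior described in the commit message).
  val tableCols: Seq[Column] = dataColumns(fullSchema, partitionColumnNames)

  // Before the patch, partition-level metadata carried the full schema.
  val partitionColsBefore: Seq[Column] = fullSchema

  // After the patch, the same filter is applied to partition-level metadata.
  val partitionColsAfter: Seq[Column] = dataColumns(fullSchema, partitionColumnNames)

  def main(args: Array[String]): Unit = {
    println(s"before patch, partition matches table: ${partitionColsBefore == tableCols}")
    println(s"after patch, partition matches table:  ${partitionColsAfter == tableCols}")
  }
}
```

In this sketch the unfiltered partition columns differ from the table columns only by the extra `ds` entry, which is the kind of mismatch the commit message says leads TableReader to set up an unnecessary conversion.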
