Commit 380b099

dongjoon-hyun authored and hvanhovell committed
[SPARK-17612][SQL][BRANCH-2.0] Support DESCRIBE table PARTITION SQL syntax
## What changes were proposed in this pull request? This is a backport of SPARK-17612. This implements `DESCRIBE table PARTITION` SQL Syntax again. It was supported until Spark 1.6.2, but was dropped since 2.0.0. **Spark 1.6.2** ```scala scala> sql("CREATE TABLE partitioned_table (a STRING, b INT) PARTITIONED BY (c STRING, d STRING)") res1: org.apache.spark.sql.DataFrame = [result: string] scala> sql("ALTER TABLE partitioned_table ADD PARTITION (c='Us', d=1)") res2: org.apache.spark.sql.DataFrame = [result: string] scala> sql("DESC partitioned_table PARTITION (c='Us', d=1)").show(false) +----------------------------------------------------------------+ |result | +----------------------------------------------------------------+ |a string | |b int | |c string | |d string | | | |# Partition Information | |# col_name data_type comment | | | |c string | |d string | +----------------------------------------------------------------+ ``` **Spark 2.0** - **Before** ```scala scala> sql("CREATE TABLE partitioned_table (a STRING, b INT) PARTITIONED BY (c STRING, d STRING)") res0: org.apache.spark.sql.DataFrame = [] scala> sql("ALTER TABLE partitioned_table ADD PARTITION (c='Us', d=1)") res1: org.apache.spark.sql.DataFrame = [] scala> sql("DESC partitioned_table PARTITION (c='Us', d=1)").show(false) org.apache.spark.sql.catalyst.parser.ParseException: Unsupported SQL statement ``` - **After** ```scala scala> sql("CREATE TABLE partitioned_table (a STRING, b INT) PARTITIONED BY (c STRING, d STRING)") res0: org.apache.spark.sql.DataFrame = [] scala> sql("ALTER TABLE partitioned_table ADD PARTITION (c='Us', d=1)") res1: org.apache.spark.sql.DataFrame = [] scala> sql("DESC partitioned_table PARTITION (c='Us', d=1)").show(false) +-----------------------+---------+-------+ |col_name |data_type|comment| +-----------------------+---------+-------+ |a |string |null | |b |int |null | |c |string |null | |d |string |null | |# Partition Information| | | |# col_name |data_type|comment| |c 
|string |null | |d |string |null | +-----------------------+---------+-------+ scala> sql("DESC EXTENDED partitioned_table PARTITION (c='Us', d=1)").show(100,false) +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------+-------+ |col_name |data_type|comment| +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------+-------+ |a |string |null | |b |int |null | |c |string |null | |d |string |null | |# Partition Information | | | |# col_name |data_type|comment| |c |string |null | |d |string |null | | | | | |Detailed Partition Information CatalogPartition( Partition Values: [Us, 1] Storage(Location: file:/Users/dhyun/SPARK-17612-DESC-PARTITION/spark-warehouse/partitioned_table/c=Us/d=1, InputFormat: org.apache.hadoop.mapred.TextInputFormat, OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, Serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Properties: [serialization.format=1]) Partition Parameters:{transient_lastDdlTime=1475001066})| | | 
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------+-------+ scala> sql("DESC FORMATTED partitioned_table PARTITION (c='Us', d=1)").show(100,false) +--------------------------------+---------------------------------------------------------------------------------------+-------+ |col_name |data_type |comment| +--------------------------------+---------------------------------------------------------------------------------------+-------+ |a |string |null | |b |int |null | |c |string |null | |d |string |null | |# Partition Information | | | |# col_name |data_type |comment| |c |string |null | |d |string |null | | | | | |# Detailed Partition Information| | | |Partition Value: |[Us, 1] | | |Database: |default | | |Table: |partitioned_table | | |Location: |file:/Users/dhyun/SPARK-17612-DESC-PARTITION/spark-warehouse/partitioned_table/c=Us/d=1| | |Partition Parameters: | | | | transient_lastDdlTime |1475001066 | | | | | | |# Storage Information | | | |SerDe Library: |org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe | | |InputFormat: |org.apache.hadoop.mapred.TextInputFormat | | |OutputFormat: |org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | | |Compressed: |No | | |Storage Desc Parameters: | | | | serialization.format |1 | | +--------------------------------+---------------------------------------------------------------------------------------+-------+ ``` ## How was this patch tested? Pass the Jenkins tests with a new testcase. Author: Dongjoon Hyun <[email protected]> Closes #15351 from dongjoon-hyun/SPARK-17612-BACK.
1 parent 594a2cf commit 380b099

6 files changed: +287 -18 lines changed

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala

Lines changed: 11 additions & 1 deletion
@@ -108,7 +108,17 @@ case class CatalogColumn(
 case class CatalogTablePartition(
     spec: CatalogTypes.TablePartitionSpec,
     storage: CatalogStorageFormat,
-    parameters: Map[String, String] = Map.empty)
+    parameters: Map[String, String] = Map.empty) {
+
+  override def toString: String = {
+    val output =
+      Seq(
+        s"Partition Values: [${spec.values.mkString(", ")}]",
+        s"$storage",
+        s"Partition Parameters:{${parameters.map(p => p._1 + "=" + p._2).mkString(", ")}}")
+    output.filter(_.nonEmpty).mkString("CatalogPartition(\n\t", "\n\t", ")")
+  }
+}
 
 
 /**
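
The `toString` override in this hunk can be tried outside Spark with a minimal sketch. `StorageSketch` and `PartitionSketch` are hypothetical stand-ins for `CatalogStorageFormat` and `CatalogTablePartition` (they are not Spark classes, and the real storage format prints more fields); only the string assembly mirrors the patch.

```scala
// Hypothetical stand-in for CatalogStorageFormat; only the location is modeled.
final case class StorageSketch(locationUri: Option[String]) {
  override def toString: String = s"Storage(Location: ${locationUri.getOrElse("")})"
}

// Hypothetical stand-in for CatalogTablePartition with the same toString assembly as the patch.
final case class PartitionSketch(
    spec: Map[String, String],
    storage: StorageSketch,
    parameters: Map[String, String] = Map.empty) {

  override def toString: String = {
    val output = Seq(
      s"Partition Values: [${spec.values.mkString(", ")}]",
      s"$storage",
      s"Partition Parameters:{${parameters.map(p => p._1 + "=" + p._2).mkString(", ")}}")
    // Entries are joined with newline + tab, so the result spans several lines.
    output.filter(_.nonEmpty).mkString("CatalogPartition(\n\t", "\n\t", ")")
  }
}

object PartitionSketchDemo {
  def main(args: Array[String]): Unit = {
    val partition = PartitionSketch(
      spec = Map("c" -> "Us", "d" -> "1"),
      storage = StorageSketch(Some("file:/tmp/partitioned_table/c=Us/d=1")),
      parameters = Map("transient_lastDdlTime" -> "1475001066"))
    println(partition)
  }
}
```

Because the entries are joined with `\n\t`, the `DESC EXTENDED ... PARTITION` cell that embeds this string contains line breaks, which is why that row looks ragged in `show` output.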

sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala

Lines changed: 13 additions & 2 deletions
@@ -278,13 +278,24 @@ class SparkSqlAstBuilder(conf: SQLConf) extends AstBuilder {
    * Create a [[DescribeTableCommand]] logical plan.
    */
   override def visitDescribeTable(ctx: DescribeTableContext): LogicalPlan = withOrigin(ctx) {
-    // Describe partition and column are not supported yet. Return null and let the parser decide
+    // Describe column are not supported yet. Return null and let the parser decide
     // what to do with this (create an exception or pass it on to a different system).
-    if (ctx.describeColName != null || ctx.partitionSpec != null) {
+    if (ctx.describeColName != null) {
       null
     } else {
+      val partitionSpec = if (ctx.partitionSpec != null) {
+        // According to the syntax, visitPartitionSpec returns `Map[String, Option[String]]`.
+        visitPartitionSpec(ctx.partitionSpec).map {
+          case (key, Some(value)) => key -> value
+          case (key, _) =>
+            throw new ParseException(s"PARTITION specification is incomplete: `$key`", ctx)
+        }
+      } else {
+        Map.empty[String, String]
+      }
       DescribeTableCommand(
         visitTableIdentifier(ctx.tableIdentifier),
+        partitionSpec,
         ctx.EXTENDED != null,
         ctx.FORMATTED != null)
     }
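
The conversion from `Map[String, Option[String]]` to `Map[String, String]` in this hunk can be sketched on its own. `IncompleteSpecException` and `requireValues` are hypothetical names introduced for illustration; the patch itself throws Spark's `ParseException` with the parser context attached.

```scala
// Hypothetical stand-in for Spark's ParseException so the sketch runs without Spark.
final class IncompleteSpecException(message: String) extends RuntimeException(message)

object PartitionSpecSketch {
  /** Keeps fully specified `key -> value` pairs and fails on the first key without a value. */
  def requireValues(spec: Map[String, Option[String]]): Map[String, String] =
    spec.map {
      case (key, Some(value)) => key -> value
      case (key, None) =>
        throw new IncompleteSpecException(s"PARTITION specification is incomplete: `$key`")
    }

  def main(args: Array[String]): Unit = {
    // Corresponds to PARTITION (c='Us', d=1): every key has a value.
    println(requireValues(Map("c" -> Some("Us"), "d" -> Some("1"))))

    // Corresponds to PARTITION (c='Us', d): `d` has no value, so the conversion fails.
    val failure = scala.util.Try(requireValues(Map("c" -> Some("Us"), "d" -> None)))
    println(failure.failed.map(_.getMessage))
  }
}
```

Failing at parse time means a spec such as `PARTITION (c='Us', d)` never reaches the catalog lookup.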

sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala

Lines changed: 69 additions & 14 deletions
@@ -29,7 +29,7 @@ import org.apache.hadoop.fs.Path
 
 import org.apache.spark.sql.{AnalysisException, Row, SparkSession}
 import org.apache.spark.sql.catalyst.TableIdentifier
-import org.apache.spark.sql.catalyst.catalog.{CatalogColumn, CatalogTable, CatalogTableType}
+import org.apache.spark.sql.catalyst.catalog._
 import org.apache.spark.sql.catalyst.catalog.CatalogTableType._
 import org.apache.spark.sql.catalyst.catalog.CatalogTypes.TablePartitionSpec
 import org.apache.spark.sql.catalyst.expressions.{Attribute, AttributeReference}
@@ -417,10 +417,14 @@ case class TruncateTableCommand(
 /**
  * Command that looks like
  * {{{
- *   DESCRIBE [EXTENDED|FORMATTED] table_name;
+ *   DESCRIBE [EXTENDED|FORMATTED] table_name partitionSpec?;
  * }}}
  */
-case class DescribeTableCommand(table: TableIdentifier, isExtended: Boolean, isFormatted: Boolean)
+case class DescribeTableCommand(
+    table: TableIdentifier,
+    partitionSpec: Map[String, String],
+    isExtended: Boolean,
+    isFormatted: Boolean)
   extends RunnableCommand {
 
   override val output: Seq[Attribute] = Seq(
@@ -438,6 +442,10 @@ case class DescribeTableCommand(table: TableIdentifier, isExtended: Boolean, isF
     val catalog = sparkSession.sessionState.catalog
 
     if (catalog.isTemporaryTable(table)) {
+      if (partitionSpec.nonEmpty) {
+        throw new AnalysisException(
+          s"DESC PARTITION is not allowed on a temporary view: ${table.identifier}")
+      }
       describeSchema(catalog.lookupRelation(table).schema, result)
     } else {
       val metadata = catalog.getTableMetadata(table)
@@ -451,12 +459,16 @@ case class DescribeTableCommand(table: TableIdentifier, isExtended: Boolean, isF
         describeSchema(metadata.schema, result)
       }
 
-      if (isExtended) {
-        describeExtended(metadata, result)
-      } else if (isFormatted) {
-        describeFormatted(metadata, result)
+      describePartitionInfo(metadata, result)
+
+      if (partitionSpec.isEmpty) {
+        if (isExtended) {
+          describeExtendedTableInfo(metadata, result)
+        } else if (isFormatted) {
+          describeFormattedTableInfo(metadata, result)
+        }
       } else {
-        describePartitionInfo(metadata, result)
+        describeDetailedPartitionInfo(catalog, metadata, result)
       }
     }
 
@@ -481,16 +493,12 @@ case class DescribeTableCommand(table: TableIdentifier, isExtended: Boolean, isF
     }
   }
 
-  private def describeExtended(table: CatalogTable, buffer: ArrayBuffer[Row]): Unit = {
-    describePartitionInfo(table, buffer)
-
+  private def describeExtendedTableInfo(table: CatalogTable, buffer: ArrayBuffer[Row]): Unit = {
     append(buffer, "", "", "")
     append(buffer, "# Detailed Table Information", table.toString, "")
   }
 
-  private def describeFormatted(table: CatalogTable, buffer: ArrayBuffer[Row]): Unit = {
-    describePartitionInfo(table, buffer)
-
+  private def describeFormattedTableInfo(table: CatalogTable, buffer: ArrayBuffer[Row]): Unit = {
     append(buffer, "", "", "")
     append(buffer, "# Detailed Table Information", "", "")
     append(buffer, "Database:", table.database, "")
@@ -548,6 +556,53 @@ case class DescribeTableCommand(table: TableIdentifier, isExtended: Boolean, isF
     }
   }
 
+  private def describeDetailedPartitionInfo(
+      catalog: SessionCatalog,
+      metadata: CatalogTable,
+      result: ArrayBuffer[Row]): Unit = {
+    if (metadata.tableType == CatalogTableType.VIEW) {
+      throw new AnalysisException(
+        s"DESC PARTITION is not allowed on a view: ${table.identifier}")
+    }
+    if (DDLUtils.isDatasourceTable(metadata)) {
+      throw new AnalysisException(
+        s"DESC PARTITION is not allowed on a datasource table: ${table.identifier}")
+    }
+    val partition = catalog.getPartition(table, partitionSpec)
+    if (isExtended) {
+      describeExtendedDetailedPartitionInfo(table, metadata, partition, result)
+    } else if (isFormatted) {
+      describeFormattedDetailedPartitionInfo(table, metadata, partition, result)
+      describeStorageInfo(metadata, result)
+    }
+  }
+
+  private def describeExtendedDetailedPartitionInfo(
+      tableIdentifier: TableIdentifier,
+      table: CatalogTable,
+      partition: CatalogTablePartition,
+      buffer: ArrayBuffer[Row]): Unit = {
+    append(buffer, "", "", "")
+    append(buffer, "Detailed Partition Information " + partition.toString, "", "")
+  }
+
+  private def describeFormattedDetailedPartitionInfo(
+      tableIdentifier: TableIdentifier,
+      table: CatalogTable,
+      partition: CatalogTablePartition,
+      buffer: ArrayBuffer[Row]): Unit = {
+    append(buffer, "", "", "")
+    append(buffer, "# Detailed Partition Information", "", "")
+    append(buffer, "Partition Value:", s"[${partition.spec.values.mkString(", ")}]", "")
+    append(buffer, "Database:", table.database, "")
+    append(buffer, "Table:", tableIdentifier.table, "")
+    append(buffer, "Location:", partition.storage.locationUri.getOrElse(""), "")
+    append(buffer, "Partition Parameters:", "", "")
+    partition.parameters.foreach { case (key, value) =>
+      append(buffer, s" $key", value, "")
+    }
+  }
+
   private def describeSchema(schema: Seq[CatalogColumn], buffer: ArrayBuffer[Row]): Unit = {
     schema.foreach { column =>
       append(buffer, column.name, column.dataType.toLowerCase, column.comment.orNull)
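
The branching that `run` and `describeDetailedPartitionInfo` perform for a non-temporary table can be summarized in a small sketch. `DescribeDispatchSketch` and its section labels are hypothetical names chosen for illustration; the view, datasource-table, and missing-partition checks from the patch are left out.

```scala
object DescribeDispatchSketch {
  /** Returns, in order, the output sections produced for a non-temporary table. */
  def sections(
      hasPartitionSpec: Boolean,
      isExtended: Boolean,
      isFormatted: Boolean): Seq[String] = {
    // The schema and partition-column sections are always emitted first.
    val common = Seq("schema", "partition columns")
    val detail =
      if (!hasPartitionSpec) {
        // No PARTITION clause: table-level details, as before the patch.
        if (isExtended) Seq("extended table info")
        else if (isFormatted) Seq("formatted table info")
        else Seq.empty[String]
      } else {
        // PARTITION clause present: partition-level details instead.
        if (isExtended) Seq("extended partition info")
        else if (isFormatted) Seq("formatted partition info", "storage info")
        else Seq.empty[String]
      }
    common ++ detail
  }

  def main(args: Array[String]): Unit = {
    println(sections(hasPartitionSpec = true, isExtended = false, isFormatted = false))
    println(sections(hasPartitionSpec = true, isExtended = false, isFormatted = true))
  }
}
```

A plain `DESC t PARTITION (...)` therefore prints the same sections as `DESC t`, although the partition is still looked up and a missing one raises an error.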

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+CREATE TABLE t (a STRING, b INT) PARTITIONED BY (c STRING, d STRING);
+
+ALTER TABLE t ADD PARTITION (c='Us', d=1);
+
+DESC t;
+
+-- Ignore these because there exist timestamp results, e.g., `Create Table`.
+-- DESC EXTENDED t;
+-- DESC FORMATTED t;
+
+DESC t PARTITION (c='Us', d=1);
+
+-- Ignore these because there exist timestamp results, e.g., transient_lastDdlTime.
+-- DESC EXTENDED t PARTITION (c='Us', d=1);
+-- DESC FORMATTED t PARTITION (c='Us', d=1);
+
+-- NoSuchPartitionException: Partition not found in table
+DESC t PARTITION (c='Us', d=2);
+
+-- AnalysisException: Partition spec is invalid
+DESC t PARTITION (c='Us');
+
+-- ParseException: PARTITION specification is incomplete
+DESC t PARTITION (c='Us', d);
+
+-- DROP TEST TABLE
+DROP TABLE t;

Lines changed: 90 additions & 0 deletions
@@ -0,0 +1,90 @@
+-- Automatically generated by SQLQueryTestSuite
+-- Number of queries: 8
+
+
+-- !query 0
+CREATE TABLE t (a STRING, b INT) PARTITIONED BY (c STRING, d STRING)
+-- !query 0 schema
+struct<>
+-- !query 0 output
+
+
+
+-- !query 1
+ALTER TABLE t ADD PARTITION (c='Us', d=1)
+-- !query 1 schema
+struct<>
+-- !query 1 output
+
+
+
+-- !query 2
+DESC t
+-- !query 2 schema
+struct<col_name:string,data_type:string,comment:string>
+-- !query 2 output
+# Partition Information
+# col_name data_type comment
+a string
+b int
+c string
+c string
+d string
+d string
+
+
+-- !query 3
+DESC t PARTITION (c='Us', d=1)
+-- !query 3 schema
+struct<col_name:string,data_type:string,comment:string>
+-- !query 3 output
+# Partition Information
+# col_name data_type comment
+a string
+b int
+c string
+c string
+d string
+d string
+
+
+-- !query 4
+DESC t PARTITION (c='Us', d=2)
+-- !query 4 schema
+struct<>
+-- !query 4 output
+org.apache.spark.sql.catalyst.analysis.NoSuchPartitionException
+Partition not found in table 't' database 'default':
+c -> Us
+d -> 2;
+
+
+-- !query 5
+DESC t PARTITION (c='Us')
+-- !query 5 schema
+struct<>
+-- !query 5 output
+org.apache.spark.sql.AnalysisException
+Partition spec is invalid. The spec (c) must match the partition spec (c, d) defined in table '`default`.`t`';
+
+
+-- !query 6
+DESC t PARTITION (c='Us', d)
+-- !query 6 schema
+struct<>
+-- !query 6 output
+org.apache.spark.sql.catalyst.parser.ParseException
+
+PARTITION specification is incomplete: `d`(line 1, pos 0)
+
+== SQL ==
+DESC t PARTITION (c='Us', d)
+^^^
+
+
+-- !query 7
+DROP TABLE t
+-- !query 7 schema
+struct<>
+-- !query 7 output
+

sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala

Lines changed: 77 additions & 1 deletion
@@ -26,7 +26,8 @@ import org.apache.hadoop.fs.Path
 
 import org.apache.spark.sql._
 import org.apache.spark.sql.catalyst.TableIdentifier
-import org.apache.spark.sql.catalyst.analysis.{EliminateSubqueryAliases, FunctionRegistry}
+import org.apache.spark.sql.catalyst.analysis.{EliminateSubqueryAliases, FunctionRegistry,
+  NoSuchPartitionException}
 import org.apache.spark.sql.catalyst.catalog.CatalogTableType
 import org.apache.spark.sql.catalyst.parser.ParseException
 import org.apache.spark.sql.execution.command.CreateDataSourceTableUtils
@@ -342,6 +343,81 @@ class SQLQuerySuite extends QueryTest with SQLTestUtils with TestHiveSingleton {
     }
   }
 
+  test("describe partition") {
+    withTable("partitioned_table") {
+      sql("CREATE TABLE partitioned_table (a STRING, b INT) PARTITIONED BY (c STRING, d STRING)")
+      sql("ALTER TABLE partitioned_table ADD PARTITION (c='Us', d=1)")
+
+      checkKeywordsExist(sql("DESC partitioned_table PARTITION (c='Us', d=1)"),
+        "# Partition Information",
+        "# col_name")
+
+      checkKeywordsExist(sql("DESC EXTENDED partitioned_table PARTITION (c='Us', d=1)"),
+        "# Partition Information",
+        "# col_name",
+        "Detailed Partition Information CatalogPartition(",
+        "Partition Values: [Us, 1]",
+        "Storage(Location:",
+        "Partition Parameters")
+
+      checkKeywordsExist(sql("DESC FORMATTED partitioned_table PARTITION (c='Us', d=1)"),
+        "# Partition Information",
+        "# col_name",
+        "# Detailed Partition Information",
+        "Partition Value:",
+        "Database:",
+        "Table:",
+        "Location:",
+        "Partition Parameters:",
+        "# Storage Information")
+    }
+  }
+
+  test("describe partition - error handling") {
+    withTable("partitioned_table", "datasource_table") {
+      sql("CREATE TABLE partitioned_table (a STRING, b INT) PARTITIONED BY (c STRING, d STRING)")
+      sql("ALTER TABLE partitioned_table ADD PARTITION (c='Us', d=1)")
+
+      val m = intercept[NoSuchPartitionException] {
+        sql("DESC partitioned_table PARTITION (c='Us', d=2)")
+      }.getMessage()
+      assert(m.contains("Partition not found in table"))
+
+      val m2 = intercept[AnalysisException] {
+        sql("DESC partitioned_table PARTITION (c='Us')")
+      }.getMessage()
+      assert(m2.contains("Partition spec is invalid"))
+
+      val m3 = intercept[ParseException] {
+        sql("DESC partitioned_table PARTITION (c='Us', d)")
+      }.getMessage()
+      assert(m3.contains("PARTITION specification is incomplete: `d`"))
+
+      spark
+        .range(1).select('id as 'a, 'id as 'b, 'id as 'c, 'id as 'd).write
+        .partitionBy("d")
+        .saveAsTable("datasource_table")
+      val m4 = intercept[AnalysisException] {
+        sql("DESC datasource_table PARTITION (d=2)")
+      }.getMessage()
+      assert(m4.contains("DESC PARTITION is not allowed on a datasource table"))
+
+      val m5 = intercept[AnalysisException] {
+        spark.range(10).select('id as 'a, 'id as 'b).createTempView("view1")
+        sql("DESC view1 PARTITION (c='Us', d=1)")
+      }.getMessage()
+      assert(m5.contains("DESC PARTITION is not allowed on a temporary view"))
+
+      withView("permanent_view") {
+        val m = intercept[AnalysisException] {
+          sql("CREATE VIEW permanent_view AS SELECT * FROM partitioned_table")
+          sql("DESC permanent_view PARTITION (c='Us', d=1)")
+        }.getMessage()
+        assert(m.contains("DESC PARTITION is not allowed on a view"))
+      }
+    }
+  }
+
   test("SPARK-5371: union with null and sum") {
     val df = Seq((1, 1)).toDF("c1", "c2")
     df.createOrReplaceTempView("table1")
