[SPARK-14362] [SPARK-14406] [SQL] DDL Native Support: Drop View and Drop Table #12146
Sure, will do it. |
|
Test build #55392 has finished for PR 12146 at commit
|
|
@yhuai Actually, when we drop the external table, we do delete the data. If we do not want to delete the data, we need to change the following value from
Update: Sorry, my case is wrong.
create external table $tabName(c1 int COMMENT 'abc', c2 string)
stored as parquet
location '$tmpDir'
as select 1, '3'
Although we use the
|
Test build #55424 has finished for PR 12146 at commit
|
|
Test build #55431 has finished for PR 12146 at commit
|
|
That I am not sure I understand what you mean by |
|
It sounds like Hive ignores and overwrites the type we provided when we attempt to create the table and converts it to You can see I did the table type verification in both test cases. |
  |stored as parquet
  |location '$tmpDir'
  |as select 1, '3'
""".stripMargin)
The table type is silently changed to MANAGED_TABLE at https://github.com/apache/hive/blob/master/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L1315-L1326.
For an external table, table type needs to be EXTERNAL_TABLE and its table properties should have an entry EXTERNAL set to TRUE.
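A minimal sketch of the rule stated in the comment above, modeling table types as plain strings and table properties as a `Map`. The object `TableTypeSketch` and the helper `effectiveTableType` are hypothetical names introduced for illustration; they are not part of Hive or Spark, and the case-insensitive comparison of the `EXTERNAL` value is an assumption.

```scala
object TableTypeSketch {
  // Hypothetical helper (not a Hive or Spark API).
  // Models the behavior discussed in this thread: a table declared as
  // EXTERNAL_TABLE is only kept external when its properties also carry
  // EXTERNAL=TRUE; otherwise it ends up as MANAGED_TABLE.
  def effectiveTableType(declaredType: String, properties: Map[String, String]): String = {
    // Assumption: the property value is compared case-insensitively.
    val flaggedExternal = properties.get("EXTERNAL").exists(_.equalsIgnoreCase("TRUE"))
    declaredType match {
      case "EXTERNAL_TABLE" if !flaggedExternal => "MANAGED_TABLE"
      case other => other
    }
  }
}
```

Under this model, a client that sets only the table type and omits the `EXTERNAL` property would see its table stored as managed, which matches the symptom reported in the test case.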
So, there is a bug in HiveClientImpl.toHiveTable. I will fix that.
btw, for this test case, I am not sure Hive allows you to provide a column list in a CTAS table. Also, I am not sure if Hive allows you to provide EXTERNAL keyword for a table generated by CTAS query. Can you double check?
Thank you for sharing this with me!
hive> create external table tab1(c1 int, c2 string) stored as parquet as select 1, 3;
FAILED: SemanticException [Error 10065]: CREATE TABLE AS SELECT command cannot specify the list of columns for the target table
hive> create external table tab1 stored as parquet as select 1, 3;
FAILED: SemanticException [Error 10070]: CREATE-TABLE-AS-SELECT cannot create external table
Obviously, Spark SQL is different from Hive. Should we change the behavior to match Hive SQL?
The SQL in the test case is processed by Spark SQL. It is not a native Hive SQL command.
I think we have been supporting it for a long time (maybe by accident). I have created https://issues.apache.org/jira/browse/SPARK-14507 for deciding if we should keep it or drop the support of it.
For this test, can we change to a real managed table (drop the external keyword from the statement)?
Sure, will do it. Thanks!
BTW, it does not work even without the external keyword. I will also remove the column list.
hive> create table tab11(c1 int, c2 string) stored as parquet as select 1, 3;
FAILED: SemanticException [Error 10065]: CREATE TABLE AS SELECT command cannot specify the list of columns for the target table
Test build #55432 has finished for PR 12146 at commit
|
…g Parsing
#### What changes were proposed in this pull request?
"Not good to slightly ignore all the un-supported options/clauses. We should either support it or throw an exception." A comment from yhuai in another PR #12146
- Can `Explain` be an exception? The `Formatted` clause is used in `HiveCompatibilitySuite`.
- Two unsupported clauses in `Drop Table` are handled in a separate PR: #12146
#### How was this patch tested?
Test cases are added to verify all the cases.
Author: gatorsmile <[email protected]>
Closes #12255 from gatorsmile/warningToException.
  catalog.dropTable(TableIdentifier("unknown_table", Some("db2")), ignoreIfNotExists = false)
}
// If the table does not exist, we do not issue an exception. Instead, we output an error
// message to console when ignoreIfNotExists is set to false. This is to make it consistent
Seems we only have a log message?
Yeah, this is not completely consistent. let me rewrite it. Thanks!
|
@gatorsmile I have finished my review. Can you update the PR to address the following comments? |
|
Sure, will do it soon. Really thank you for your review! |
|
Also update the description if you get a chance? |
|
Sure, will do it. Thanks! |
|
Test build #55451 has finished for PR 12146 at commit
|
|
LGTM. We can remove |
…iew and Drop Table
#### What changes were proposed in this pull request?
This PR is to address the comment: #12146 (diff). It removes the function `isViewSupported` from `SessionCatalog`. After the removal, we still can capture the user errors if users try to drop a table using `DROP VIEW`.
#### How was this patch tested?
Modified the existing test cases
Author: gatorsmile <[email protected]>
Closes #12284 from gatorsmile/followupDropTable.
What changes were proposed in this pull request?
This PR is to provide a native support for DDL `DROP VIEW` and `DROP TABLE`. The PR includes native parsing and native analysis.
Based on the HIVE DDL document for [DROP_VIEW_WEB_LINK](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-DropView), `DROP VIEW` is defined as,
Syntax:
- `HiveContext`. In `SQLContext`, we will get an exception.

This PR also handles `DROP TABLE`.
Syntax:
- `DROP TABLE` command only can drop Hive tables in `HiveContext`. Now, after this PR, this command also can drop temporary table, external table, external data source table in `SQLContext`.
- In `HiveContext`, we will not issue an exception if the to-be-dropped table does not exist and users did not specify `IF EXISTS`. Instead, we just log an error message. If `IF EXISTS` is specified, we will not issue any error message/exception.
- In `SQLContext`, we will issue an exception if the to-be-dropped table does not exist, unless `IF EXISTS` is specified.
- `external`, unless table type is `managed_table`.

How was this patch tested?
For verifying command parsing, added test cases in `spark/sql/hive/HiveDDLCommandSuite.scala`
For verifying command analysis, added test cases in `spark/sql/hive/execution/HiveDDLSuite.scala`
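
The missing-table behavior that the description gives for `DROP TABLE` can be summarized as a small decision function. This is only a sketch of the stated rules, not the code in this PR: `DropTableSketch`, `HiveCtx`, `SqlCtx`, the `Outcome` values, and the choice of `NoSuchElementException` are all hypothetical names introduced for illustration.

```scala
object DropTableSketch {
  // Hypothetical stand-ins for the two contexts named in the description.
  sealed trait Context
  case object HiveCtx extends Context
  case object SqlCtx extends Context

  // Hypothetical outcomes of a DROP TABLE request.
  sealed trait Outcome
  case object Dropped extends Outcome      // table existed and was dropped
  case object NoOp extends Outcome         // IF EXISTS given, nothing to drop, no message
  case object LoggedError extends Outcome  // error message logged, no exception

  def dropTable(ctx: Context, tableExists: Boolean, ifExists: Boolean): Outcome =
    (tableExists, ifExists, ctx) match {
      case (true, _, _)            => Dropped
      case (false, true, _)        => NoOp
      case (false, false, HiveCtx) => LoggedError
      // Assumption: the exception type is illustrative only.
      case (false, false, SqlCtx)  =>
        throw new NoSuchElementException("table to drop does not exist")
    }
}
```

For example, `DropTableSketch.dropTable(DropTableSketch.HiveCtx, tableExists = false, ifExists = false)` yields `LoggedError`, while the same call with `SqlCtx` throws.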