Skip to content

Conversation

@dilipbiswal
Copy link
Contributor

What changes were proposed in this pull request?

SHOW COLUMNS command validates the user supplied database
name with database name from qualified table name name to make
sure both of them are consistent. This comparison should respect
case sensitivity.

How was this patch tested?

Added tests in DDLSuite and existing tests were moved to use new sql based test infrastructure.

@SparkQA
Copy link

SparkQA commented Oct 11, 2016

Test build #66692 has finished for PR 15423 at commit 3acd08f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class ShowColumnsCommand(

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why ShowColumnsCommand is sorted?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@viirya I added the case here because of
https://github.com/dilipbiswal/spark/blob/3acd08f9431d6cdfe4d043aa342d09fc0ebfa497/sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala#L222

I didn't want the output of ShowColumnsCommand to be sorted before comparison.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will it affect comparison result? I think the result is generated, right?

Copy link
Contributor Author

@dilipbiswal dilipbiswal Oct 11, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@viirya So it seemed odd to have the generated output files to have column names sorted which didn't reflect the column order of create table. In the test case i had the table create like following.

CREATE TABLE showcolumn2 (price int, qty int) partitioned by (year int, month int)

It seemed odd to me to have the generated output file report the columns as month, price, qty and year as opposed to price, qty, year and month. Let me know if i am missing something here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I don't think it is odd because we just want to compare the results. Adding ShowColumnsCommand to sorted op looks more odd to me. cc @cloud-fan

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it break test if we don't mark it as sorted?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cloud-fan @viirya Actually it does not break the test if we don't mark it as sorted. What happens is that, when we generate the expected output file, the results appear sorted like following:

-- !query 7
SHOW COLUMNS IN showcolumn2 IN showdb
-- !query 7 schema
struct<col_name:string>
-- !query 7 output
month
price
qty
year

When i was going through the expected output file to make sure its correct, i noticed this as the above output would not be how it would be shown if i cut-paste the SQLs snippets from the test file and ran it in spark-sql shell.

If you guys think its okay to have the output in sorted form in the expected file, then i will change it back.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

marking ShowColumnsCommand as sorted is more weird, I'd like to leave the result sorted.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 as mentioned in previous comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cloud-fan @viirya Thanks :-) I will change it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to explicitly set the current database here? If the current database is not default, the test is not making sense anymore.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@viirya OK.. I agree. I will make the change

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, MySQL will treat SHOW COLUMNS FROM db1.tbl1 FROM db2 as SHOW COLUMNS FROM tbl1 FROM db2, i.e. if FROM database is specified, it will just ignore the database specified in table name, instead of reporting error.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should investigate more databases to see how they handle this case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, what @dilipbiswal does in this is following previous behavior, do we want to break it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @viirya for checking hive. Yeah, when we implemented the SHOW columns native command, wde went through the the above hive code. We decided to improve up on the above check and report an error only when the database names are different.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no strong option towards this. From my point, MySQL's way might be little confusing users if they don't notice the database name is different.

@SparkQA
Copy link

SparkQA commented Oct 17, 2016

Test build #67060 has finished for PR 15423 at commit cb0691c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dilipbiswal
Copy link
Contributor Author

@viirya @cloud-fan I have incorporated the review comments. Could we please look at this again ?

@viirya
Copy link
Member

viirya commented Oct 18, 2016

LGTM, see if @cloud-fan has more comments on this or not?

override def run(sparkSession: SparkSession): Seq[Row] = {
val catalog = sparkSession.sessionState.catalog
val table = catalog.getTempViewOrPermanentTableMetadata(tableName)
val caseSensitive = sparkSession.sessionState.conf.caseSensitiveAnalysis
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we can simplify it to

val resolver = sparkSession.sessionState.conf.resolver
...
case Some(db) if tableName.database.exists(!resolver(_, db))

USE showdb;

CREATE TABLE showcolumn1 (col1 int, `col 2` int);
CREATE TABLE showcolumn2 (price int, qty int) partitioned by (year int, month int);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we also test temp view?

@SparkQA
Copy link

SparkQA commented Oct 19, 2016

Test build #67177 has finished for PR 15423 at commit 586a6b4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dilipbiswal
Copy link
Contributor Author

@cloud-fan Hi wenchen, i have added the test cases for temp view. Could we please look at this again? Thanks !

withTable(tabName) {
sql(s"CREATE TABLE $tabName(col1 int, col2 string) USING parquet ")
val message = intercept[AnalysisException] {
sql(s"SHOW COLUMNS IN $db.showcolumn FROM ${db.toUpperCase}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: wrong ident here.

@cloud-fan
Copy link
Contributor

LGTM

@SparkQA
Copy link

SparkQA commented Oct 20, 2016

Test build #67237 has started for PR 15423 at commit 15c568f.

@viirya
Copy link
Member

viirya commented Oct 20, 2016

The tests are passed but the results are failed to post back to github...

@viirya
Copy link
Member

viirya commented Oct 20, 2016

@cloud-fan Need to run tests again?

@cloud-fan
Copy link
Contributor

it's fine, merging to master!

@asfgit asfgit closed this in e895bc2 Oct 20, 2016
@dilipbiswal
Copy link
Contributor Author

@cloud-fan @viirya Thank you very much !!

robert3005 pushed a commit to palantir/spark that referenced this pull request Nov 1, 2016
…ct case sensitivity configuration

## What changes were proposed in this pull request?
SHOW COLUMNS command validates the user supplied database
name with database name from qualified table name name to make
sure both of them are consistent. This comparison should respect
case sensitivity.

## How was this patch tested?
Added tests in DDLSuite and existing tests were moved to use new sql based test infrastructure.

Author: Dilip Biswal <[email protected]>

Closes apache#15423 from dilipbiswal/dkb_show_column_fix.
uzadude pushed a commit to uzadude/spark that referenced this pull request Jan 27, 2017
…ct case sensitivity configuration

## What changes were proposed in this pull request?
SHOW COLUMNS command validates the user supplied database
name with database name from qualified table name name to make
sure both of them are consistent. This comparison should respect
case sensitivity.

## How was this patch tested?
Added tests in DDLSuite and existing tests were moved to use new sql based test infrastructure.

Author: Dilip Biswal <[email protected]>

Closes apache#15423 from dilipbiswal/dkb_show_column_fix.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants