Skip to content

Conversation

@maropu
Copy link
Member

@maropu maropu commented Jul 4, 2018

What changes were proposed in this pull request?

This pr added code to check if nested column names do not include ',', ':', and ';' because Hive metastore can't handle these characters in nested column names;
ref: https://github.com/apache/hive/blob/release-1.2.1/serde/src/java/org/apache/hadoop/hive/serde2/typeinfo/TypeInfoUtils.java#L239

How was this patch tested?

Added tests in HiveDDLSuite.

@SparkQA
Copy link

SparkQA commented Jul 4, 2018

Test build #92594 has finished for PR 21711 at commit dbc300e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Copy link
Member Author

maropu commented Jul 4, 2018

@gatorsmile

@maropu
Copy link
Member Author

maropu commented Jul 11, 2018

@gatorsmile plz give me comments on this? Thanks.

}
}

test("SPARK-24681 checks if nested column names do not include ',', ':', and ';'") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move it to HiveDDLSuite?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

* data column names. Partition columns do not have such a restriction. Views do not have such
* a restriction.
* Checks the validity of data column names. Hive metastore disallows the table to use some
* special characters (',', ':', and ';') in data column names. Partition columns do not have
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in data column names.
->
in data column names, including nested column names.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

case st: StructType => verifyNestedColumnNames(st)
case _ if invalidChars.exists(f.name.contains) =>
throw new AnalysisException("Cannot create a table having a nested column whose name " +
s"contains invalid characters (${invalidChars.map(c => s"'$c'").mkString(", ")}) " +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

something wrong, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh..

@maropu maropu force-pushed the SPARK-24681 branch 6 times, most recently from 424ecba to 8a6465b Compare July 15, 2018 10:13
case st: StructType => verifyNestedColumnNames(st)
case _ if invalidChars.exists(f.name.contains) =>
val errMsg = "Cannot create a table having a nested column whose name contains " +
s"invalid characters (${invalidChars.map(c => s"'$c'").mkString(", ")}) " +
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a weird red highlight...the syntax seems to be correct to me (also, the test passed). Anything you know? @gatorsmile

Copy link
Member

@gatorsmile gatorsmile Jul 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normally, in this case, what we do is like:

val invalidCharsString = invalidChars.map(c => s"'$c'").mkString(", ")
val errMsg = "Cannot create a table having a nested column whose name contains " +
  s"invalid characters ($invalidCharsString) in Hive metastore. Table: $tableName; " +
  s"Column: ${f.name}"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aha, I'll fix, thanks!

@SparkQA
Copy link

SparkQA commented Jul 15, 2018

Test build #93019 has finished for PR 21711 at commit 9fabeef.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jul 15, 2018

Test build #93018 has finished for PR 21711 at commit fa0233e.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jul 15, 2018

Test build #93022 has finished for PR 21711 at commit 424ecba.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jul 15, 2018

Test build #93025 has finished for PR 21711 at commit b298522.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jul 15, 2018

Test build #93020 has finished for PR 21711 at commit 482a0c0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jul 15, 2018

Test build #93021 has finished for PR 21711 at commit 37c9ce3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jul 15, 2018

Test build #93024 has finished for PR 21711 at commit 8a6465b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jul 16, 2018

Test build #93079 has finished for PR 21711 at commit bcdae88.

  • This patch fails to generate documentation.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Copy link
Member Author

maropu commented Jul 16, 2018

It seems like Aveo errors, so I’ll trigger when it fixed.

@maropu
Copy link
Member Author

maropu commented Jul 16, 2018

retest this please

@SparkQA
Copy link

SparkQA commented Jul 16, 2018

Test build #93082 has finished for PR 21711 at commit bcdae88.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Copy link
Member Author

maropu commented Jul 16, 2018

retest this please

@SparkQA
Copy link

SparkQA commented Jul 16, 2018

Test build #93098 has finished for PR 21711 at commit bcdae88.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Copy link
Member

Thanks! Merged to master.

@asfgit asfgit closed this in 2a4dd6f Jul 17, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants