[SPARK-24681][SQL] Verify nested column names in Hive metastore #21711
Test build #92594 has finished for PR 21711 at commit
@gatorsmile could you please give me comments on this? Thanks.
  }
}

test("SPARK-24681 checks if nested column names do not include ',', ':', and ';'") {
Move it to HiveDDLSuite?
ok
 * data column names. Partition columns do not have such a restriction. Views do not have such
 * a restriction.
 * Checks the validity of data column names. Hive metastore disallows the table to use some
 * special characters (',', ':', and ';') in data column names. Partition columns do not have
in data column names.
->
in data column names, including nested column names.
ok
case st: StructType => verifyNestedColumnNames(st)
case _ if invalidChars.exists(f.name.contains) =>
  throw new AnalysisException("Cannot create a table having a nested column whose name " +
    s"contains invalid characters (${invalidChars.map(c => s"'$c'").mkString(", ")}) " +
something wrong, right?
oh..
424ecba to 8a6465b
case st: StructType => verifyNestedColumnNames(st)
case _ if invalidChars.exists(f.name.contains) =>
  val errMsg = "Cannot create a table having a nested column whose name contains " +
    s"invalid characters (${invalidChars.map(c => s"'$c'").mkString(", ")}) " +
This is a weird red highlight... the syntax seems correct to me (also, the test passed). Do you know anything about it? @gatorsmile
Normally, in this case, what we do is like:
val invalidCharsString = invalidChars.map(c => s"'$c'").mkString(", ")
val errMsg = "Cannot create a table having a nested column whose name contains " +
s"invalid characters ($invalidCharsString) in Hive metastore. Table: $tableName; " +
s"Column: ${f.name}"
aha, I'll fix, thanks!
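For readers following the suggestion above, here is a minimal, self-contained sketch of the same idea: pulling the nested interpolation out into its own value before building the message. The object name `ErrMsgSketch`, the helper `errMsg`, and the plain `String` parameters are hypothetical and only for illustration; treating `invalidChars` as a `Seq[String]` is also an assumption, since the diff fragments do not show its definition.

```scala
object ErrMsgSketch {
  // Assumed representation of the invalid characters; the PR's definition is not shown here.
  val invalidChars: Seq[String] = Seq(",", ":", ";")

  // Hypothetical helper: builds the message with the interpolation extracted first,
  // so no string interpolator is nested inside another one.
  def errMsg(tableName: String, columnName: String): String = {
    val invalidCharsString = invalidChars.map(c => s"'$c'").mkString(", ")
    "Cannot create a table having a nested column whose name contains " +
      s"invalid characters ($invalidCharsString) in Hive metastore. Table: $tableName; " +
      s"Column: $columnName"
  }
}
```

Keeping the quoted list in a separate value avoids double quotes inside `${...}`, which some syntax highlighters appear to mis-parse even though the compiler accepts it.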
Test build #93019 has finished for PR 21711 at commit
Test build #93018 has finished for PR 21711 at commit
Test build #93022 has finished for PR 21711 at commit
Test build #93025 has finished for PR 21711 at commit
Test build #93020 has finished for PR 21711 at commit
Test build #93021 has finished for PR 21711 at commit
Test build #93024 has finished for PR 21711 at commit
Test build #93079 has finished for PR 21711 at commit
It seems like Aveo errors, so I'll retrigger the tests once that is fixed.
retest this please
Test build #93082 has finished for PR 21711 at commit
retest this please
Test build #93098 has finished for PR 21711 at commit
Thanks! Merged to master.
What changes were proposed in this pull request?
This PR adds a check that nested column names do not include ',', ':', or ';', because the Hive metastore can't handle these characters in nested column names.
ref: https://github.com/apache/hive/blob/release-1.2.1/serde/src/java/org/apache/hadoop/hive/serde2/typeinfo/TypeInfoUtils.java#L239
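As a rough illustration of the check described above, the following sketch walks a nested schema recursively and collects field names that contain any of the three characters. It does not use Spark: `DataType`, `Field`, `Struct`, and `NestedNameCheck` are hypothetical stand-ins defined here so that the example runs on its own, and it returns the offending names instead of throwing an `AnalysisException` as the diff fragments in the review do.

```scala
// Toy stand-ins for schema types (hypothetical, not Spark's classes).
sealed trait DataType
case object StringType extends DataType
case object IntType extends DataType
final case class Field(name: String, dataType: DataType)
final case class Struct(fields: Seq[Field]) extends DataType

object NestedNameCheck {
  // Characters the PR description says Hive metastore cannot handle in nested names.
  val invalidChars: Seq[String] = Seq(",", ":", ";")

  // Collects offending field names depth-first. Mirroring the order of cases in the
  // review fragments, a struct-typed field is recursed into and its own name is not checked.
  def invalidNames(schema: Struct): Seq[String] =
    schema.fields.flatMap { f =>
      f.dataType match {
        case st: Struct => invalidNames(st)
        case _ if invalidChars.exists(c => f.name.contains(c)) => Seq(f.name)
        case _ => Seq.empty[String]
      }
    }
}
```

For example, a schema with a struct column `s` containing a field named `x:y` yields `Seq("x:y")`, while a schema whose nested names are plain identifiers yields an empty result.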
How was this patch tested?
Added tests in HiveDDLSuite.