[SPARK-14402][SQL] initcap UDF doesn't match Hive/Oracle behavior in lowercasing rest of string #12175
…tters in lowercase
Test build #54976 has finished for PR 12175 at commit
… stringExpression.
Hi, @srowen . I minimized the change on master.
override def nullSafeEval(string: Any): Any = {
- string.asInstanceOf[UTF8String].toTitleCase
+ string.asInstanceOf[UTF8String].toLowerCase.toTitleCase
}
override def genCode(ctx: CodegenContext, ev: ExprCode): String = {
- defineCodeGen(ctx, ev, str => s"$str.toTitleCase()")
+ defineCodeGen(ctx, ev, str => s"$str.toLowerCase().toTitleCase()")
}
I think it's enough for
I think that's pretty reasonable as a minimally invasive fix. CC @marmbrus for visibility, as it's technically a behavior change.
Thank you, @srowen!
It does seem reasonable to match Hive, since that was probably the original intention. I've tagged the JIRA for inclusion in the release notes. A few comments:
Thank you, @marmbrus. I will update the scaladoc and add a description annotation for InitCap.
Test build #55000 has finished for PR 12175 at commit
Test build #55006 has finished for PR 12175 at commit
Test build #55009 has finished for PR 12175 at commit
Thanks, merging to master.
What changes were proposed in this pull request?
Currently, SparkSQL `initCap` uses the `toTitleCase` function. However, the `UTF8String.toTitleCase` implementation changes only the first letter of each word and copies the other letters unchanged: e.g. sParK --> SParK. This is the correct implementation of `toTitleCase`.
This PR updates the implementation of `initcap` to use `toLowerCase` followed by `toTitleCase`.
How was this patch tested?
Pass the Jenkins tests (including a new test case).
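The behavior change described in this PR can be illustrated with a minimal sketch on plain `String`s. This is not Spark's `UTF8String` code: the object `InitCapSketch` and its method names are hypothetical, and the sketch assumes that words are separated by a single space character.

```scala
import java.util.Locale

// Hypothetical stand-in for the before/after behavior of initcap.
// It works on java.lang.String rather than Spark's UTF8String.
object InitCapSketch {
  // Uppercases the first character of each space-separated word
  // and copies every other character unchanged.
  def toTitleCase(s: String): String = {
    val sb = new StringBuilder(s.length)
    var startOfWord = true
    for (c <- s) {
      sb.append(if (startOfWord) c.toUpper else c)
      startOfWord = c == ' '
    }
    sb.toString
  }

  // Before the change: title-case only, so existing uppercase letters survive.
  def initcapOld(s: String): String = toTitleCase(s)

  // After the change: lowercase everything first, then title-case.
  def initcapNew(s: String): String = toTitleCase(s.toLowerCase(Locale.ROOT))
}
```

Under these assumptions, `InitCapSketch.initcapOld("sParK")` yields `SParK`, while `InitCapSketch.initcapNew("sParK")` yields `Spark`. Lowercasing before title-casing keeps the fix small, because the title-casing step itself does not need to change.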