[SPARK-8264][SQL]add substring_index function #7533
Test build #37824 has finished for PR 7533 at commit
Test build #37941 has finished for PR 7533 at commit
Test build #38006 has finished for PR 7533 at commit
Need to override the nullable method, as the default value is false.
I guess it's abstract in Expression, which Substring_index inherits from?
Sorry, my bad. I actually meant the foldable method, not nullable.
@chenghao-intel I just refactored the code to avoid invoking numChars too often, but I still think we should add numChars as a field to UTF8String, since it's quite a useful function.
@rxin @chenghao-intel should I make these functions (ordinalIndexOf, lastOrdinalIndexOf and subStringIndex) public in UTF8String? I guess they would be useful, but no one uses them except the substring_index UDF.
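For readers unfamiliar with the function under review, here is a minimal sketch of the usual substring_index semantics as found in MySQL and Hive: a positive count keeps everything to the left of the count-th delimiter, and a negative count keeps everything to the right of the count-th delimiter counted from the end. It uses java.lang.String rather than UTF8String, and the class and method names are hypothetical illustrations, not the code in this PR.

```java
// Illustrative sketch only: operates on java.lang.String, not Spark's UTF8String.
// The names SubstringIndexSketch and substringIndex are hypothetical.
class SubstringIndexSketch {

    static String substringIndex(String str, String delim, int count) {
        if (str == null || delim == null) {
            return null;
        }
        if (delim.isEmpty() || count == 0) {
            return "";
        }
        if (count > 0) {
            // Walk forward to the count-th occurrence of delim.
            // Assumption: the search resumes one character after the previous
            // match, so overlapping occurrences are counted.
            int idx = -1;
            for (int i = 0; i < count; i++) {
                idx = str.indexOf(delim, idx + 1);
                if (idx < 0) {
                    return str; // fewer than count delimiters: keep the whole string
                }
            }
            return str.substring(0, idx);
        }
        // Negative count: walk backward from the end of the string.
        int idx = str.length();
        for (int i = count; i < 0; i++) {
            idx = str.lastIndexOf(delim, idx - 1);
            if (idx < 0) {
                return str; // fewer than |count| delimiters: keep the whole string
            }
        }
        return str.substring(idx + delim.length());
    }

    public static void main(String[] args) {
        System.out.println(substringIndex("www.apache.org", ".", 2));  // www.apache
        System.out.println(substringIndex("www.apache.org", ".", -2)); // apache.org
    }
}
```

The two search loops correspond roughly to what helpers named ordinalIndexOf and lastOrdinalIndexOf would do, which is why they have little use outside this one function.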
Test build #38058 has finished for PR 7533 at commit
retest this please.
Test build #55 has finished for PR 7533 at commit
Test build #38063 has finished for PR 7533 at commit
Let's remove this version of the API. @rxin did some cleanup and removed the string (column name) versions of the API.
Test build #38186 has finished for PR 7533 at commit
We'd better not make it a static function. Like indexOf, it means we are locating the substring within the CURRENT string.
Give more verbose info, like this.toString()?
Test build #38294 has finished for PR 7533 at commit
Test build #38298 has finished for PR 7533 at commit
We don't need to check for null here, as that's done on the expression side.
Test build #38317 has finished for PR 7533 at commit
@rxin it looks a bit better now. Could you take a look at this?
Nit: throw a more concrete exception, like IllegalArgumentException or IllegalCharacterException, etc.
Could you use TernaryExpression?
@zhichao-li Do you mind if I take over this one?
No description provided.