-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-8186][SPARK-8187][SPARK-8194][SPARK-8198][SQL] functions: date_add, date_sub, add_months, months_between, time-interval calculation #7589
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we want to support:
timestamp + interval returns timestamp
date + interval returns date
date + int returns date ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, will update today. BTW, How do we tell the difference if we are doing
string + interval/int,
should we just take strings as date?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
string can get implicitly cast to date or timestamp, can't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but then we have to decide it is date or timestamp, and that need to be done at run time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we can just put timestamp first, and as a result implicitly cast string to timestamp.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think date_add('2015-07-22', 1) return '2015-07-23 00:00:00' is not so natural...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just checked with hive 1.2.
hive> select date_add("2015-01-01 00:11:22", 2);
2015-01-03
hive> select date_add(cast("2015-01-01 00:11:22" as timestamp), 2);
2015-01-03
shall we use the same pattern?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
date_add is used for adding days, where does hive support timestamp + interval? in + or another special operator?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hive and oracle use + sign
select birthday + interval 3 days from xxx;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK that makes sense. Let's use date for string then.
|
Test build #38051 has finished for PR 7589 at commit
|
|
Test build #38078 has finished for PR 7589 at commit
|
|
@rxin On my second thought, I think we should keep date_add and date_sub as simple as it should be. When it comes to Datetime IntervalType computation, we need another node specially for this. Here are my considerations:
And I have several other TODOS for the simple POC shown here, as follows:
2-1. the left operand of 2-2. Then we translate the calculation to a TimeAdd or a TimeSub. I think the whole stuff deserves another PR. |
|
Test build #1163 has finished for PR 7589 at commit
|
|
Let's cast string to "date". not "timestamp" then. We should probably also add an expression for date/timestamp + interval. In that case, I don't think we want to do implicit type cast. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rxin I did implement the expression for add date/timestamp and interval
|
Test build #38374 has finished for PR 7589 at commit
|
|
Test build #38376 has finished for PR 7589 at commit
|
|
Test build #38380 has finished for PR 7589 at commit
|
|
retest this please. |
|
Test build #38391 has finished for PR 7589 at commit
|
|
Test build #92 has finished for PR 7589 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we make argument names consistent with comment?
|
For months_between:
|
|
@yjshen I have checked with hive 1.2 but if we just use |
|
Test build #38513 has finished for PR 7589 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update the doc
|
@adrian-wang Thanks for working on this. Since we are getting close to 1.5 cut, there are several PRs on your side, do you mind to let me take over this one? (BTW, you will still take the credit.) |
|
@davies You can go ahead. I left several comments here, please take a look before you take over this one, thanks. |
This subsumes #6782