[MINOR][SQL][DOCS][2.4] Fix the timestamp pattern in the example for to_timestamp
#27438
|
@dongjoon-hyun Please, take a look at this. |
|
Test build #117752 has finished for PR 27438 at commit
|
|
It's too late for RC2, @MaxGekk . Let's consider this PR after RC2 vote. |
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala
|
In general, this seems to be a follow-up of SPARK-23792 (at 2.4.0). Since that JIRA is too old, we had better use another JIRA or use |
|
@MaxGekk . Every committer's suggestion reflects his/her own criteria for acceptance. I can merge only what I agree with. As you know, other committers have different opinions, and they will merge this if they agree more with the AS-IS status. That is how it always works. In addition, I don't complain about another committer's decision when I understand it is a borderline case. Since this PR is yours, it is always up to you~ @MaxGekk . |
|
Anyway, thank you always for your active contribution. Apache Spark community really needs that. 😄 |
HyukjinKwon
left a comment
LGTM. Since @dongjoon-hyun is preparing the release, I will leave it to him though.
|
Test build #117767 has finished for PR 27438 at commit
|
 * See [[java.text.SimpleDateFormat]] for valid date and time format patterns
 *
 * @param s A date, timestamp or string. If a string, the data must be in a format that can be
 *          cast to a timestamp, such as `yyyy-MM-dd` or `yyyy-MM-dd HH:mm:ss.SSSS`
what's wrong with ".SSSS"?
- The result is incorrect, look at the PR which fixes that [SPARK-29904][SQL][2.4] Parse timestamps in microsecond precision by JSON/CSV datasources #26507, and here is an example:
val sdf = new java.text.SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSSS")
val res = sdf.parse("1970-01-01 00:00:00.1234")
println(sdf.format(res))
1970-01-01 00:00:01.0234
- And the result of `to_timestamp` is truncated to seconds.
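A minimal sketch of why the `SSSS` example above misbehaves, assuming a UTC time zone so that the epoch values are deterministic: `SimpleDateFormat` reads any run of `S` letters as a count of milliseconds, so `.1234` is taken as 1234 ms rather than 0.1234 s. The names `FractionDemo` and `parseMillis` are hypothetical and exist only for this illustration.

```scala
import java.text.SimpleDateFormat
import java.util.{Locale, TimeZone}

object FractionDemo {
  // Build a formatter pinned to UTC so the result does not depend on the local zone.
  private def utcFormat(pattern: String): SimpleDateFormat = {
    val sdf = new SimpleDateFormat(pattern, Locale.US)
    sdf.setTimeZone(TimeZone.getTimeZone("UTC"))
    sdf
  }

  // Parse `text` with `pattern` and return milliseconds since the epoch.
  def parseMillis(pattern: String, text: String): Long =
    utcFormat(pattern).parse(text).getTime

  def main(args: Array[String]): Unit = {
    // Three digits are read as 123 milliseconds, which is the intended value.
    println(parseMillis("yyyy-MM-dd HH:mm:ss.SSS", "1970-01-01 00:00:00.123"))   // 123
    // Four digits are read as 1234 milliseconds, i.e. 1.234 seconds.
    println(parseMillis("yyyy-MM-dd HH:mm:ss.SSSS", "1970-01-01 00:00:00.1234")) // 1234
  }
}
```

The default lenient mode is what lets 1234 ms roll over into the seconds field instead of failing.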
for 1), is it fixed in 2.4 or not?
for 2), have we documented it?
SPARK-29904 is merged at 2.4.5. In the worst case, we can revert @MaxGekk 's #26507 because of this.
> for 1), is it fixed in 2.4 or not?
@cloud-fan It is fixed in the JSON and CSV datasources only, not in to_timestamp() and other functions.
> for 2), have we documented it?
I have tried to document it here in the PR
> In the worst case, we can revert @MaxGekk 's #26507 because of this.
@dongjoon-hyun Why? What's the worst case? I think we should apply the fix in other places rather than revert it.
@cloud-fan I tried to document the restriction of to_timestamp() in the PR, but @dongjoon-hyun convinced me not to do that here, see #27438 (comment): "I believe that one thing we need is that yyyy-MM-dd HH:mm:ss.SSSS -> yyyy-MM-dd HH:mm:ss change. I recommend to focus on the above change only."
@MaxGekk . To be clear, as a release manager I want to make sure we don't have any regressions in 2.4.5. You raised this issue and we are investigating the relevant PRs. That's all. We are considering the scope and effect. So far, we haven't made any decision.
/**
 * Converts time string with the given pattern to timestamp.
 *
 * See [[java.text.SimpleDateFormat]] for valid date and time format patterns
If you don't mind, I would like to say here that the only currently supported pattern for second fractions is SSS. The second fractions can be parsed but will be ignored when casting to TimestampType.
dongjoon-hyun
left a comment
+1, LGTM (AS-IS). Thanks, @MaxGekk and @HyukjinKwon .
Let's postpone this PR during RC2 vote period.
|
maybe we can just fix the issue in |
@cloud-fan I have tried to port all functions on
def newDateFormat(formatString: String, timeZone: TimeZone): TimestampParser = {
  new TimestampParser(FastDateFormat.getInstance(formatString, timeZone))
}
from
def newDateFormat(formatString: String, timeZone: TimeZone): DateFormat = {
  val sdf = new SimpleDateFormat(formatString, Locale.US)
  sdf.setTimeZone(timeZone)
  // Enable strict parsing, if the input date/format is invalid, it will throw an exception.
  // e.g. to parse invalid date '2016-13-12', or '2016-01-12' with invalid format 'yyyy-aa-dd',
  // an exception will be throwed.
  sdf.setLenient(false)
  sdf
}
but I got a few test failures due to strict mode set via |
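To illustrate the strict mode mentioned in the code comment above, here is a small sketch of how `setLenient(false)` changes `SimpleDateFormat` behavior for the invalid date `2016-13-12`. It uses only `java.text.SimpleDateFormat`; the names `StrictParsingDemo` and `parses` are hypothetical, and the sketch does not reproduce the Spark test failures discussed in this thread.

```scala
import java.text.SimpleDateFormat
import java.util.Locale
import scala.util.Try

object StrictParsingDemo {
  // Returns true when `text` parses as yyyy-MM-dd under the given leniency setting.
  def parses(text: String, lenient: Boolean): Boolean = {
    val sdf = new SimpleDateFormat("yyyy-MM-dd", Locale.US)
    sdf.setLenient(lenient)
    // parse throws java.text.ParseException on failure; Try turns that into a Failure.
    Try(sdf.parse(text)).isSuccess
  }

  def main(args: Array[String]): Unit = {
    // Strict mode rejects month 13.
    println(parses("2016-13-12", lenient = false)) // false
    // Lenient mode rolls month 13 over into January of the next year.
    println(parses("2016-13-12", lenient = true))  // true
  }
}
```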
|
OK then this doc change LGTM |
|
Thank you for the investigation, @MaxGekk . And, thank you for the conclusion, @cloud-fan . |
|
@dongjoon-hyun A kind reminder about this minor fix. |
|
Sure, @MaxGekk . Sorry for making you wait. Merged to |
…`to_timestamp`

### What changes were proposed in this pull request?
In the PR, I propose to change the description of the `to_timestamp()` function, and change the pattern in the example.

### Why are the changes needed?
To inform users about valid patterns for `to_timestamp` function.

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
N/A

Closes #27438 from MaxGekk/to_timestamp-z-2.4.

Authored-by: Maxim Gekk <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
What changes were proposed in this pull request?
In the PR, I propose to change the description of the `to_timestamp()` function, and change the pattern in the example.
Why are the changes needed?
To inform users about valid patterns for `to_timestamp` function.
Does this PR introduce any user-facing change?
No
How was this patch tested?
N/A