Skip to content
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -2967,7 +2967,7 @@ object functions {
* See [[java.text.SimpleDateFormat]] for valid date and time format patterns
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you don't mind, I would like to say here that currently supported pattern for seconds fractions is SSS only. The second s fractions can be parsed but will be ignored while casting to TimestampType.

*
* @param s A date, timestamp or string. If a string, the data must be in a format that can be
* cast to a timestamp, such as `yyyy-MM-dd` or `yyyy-MM-dd HH:mm:ss.SSSS`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's wrong with ".SSSS"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. The result is incorrect, look at the PR which fixes that [SPARK-29904][SQL][2.4] Parse timestamps in microsecond precision by JSON/CSV datasources #26507, and here is an example:
    val sdf = new java.text.SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSSS")
    val res = sdf.parse("1970-01-01 00:00:00.1234")
    println(sdf.format(res))

1970-01-01 00:00:01.0234
  1. And the result of to_timestamp is truncated to seconds.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for 1), is it fixed in 2.4 or not?

for 2), have we documented it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SPARK-29904 is merged at 2.4.5. In the worst case, we can revert @MaxGekk 's #26507 because of this.

Copy link
Member Author

@MaxGekk MaxGekk Feb 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for 1), is it fixed in 2.4 or not?

@cloud-fan It is fixed in JSON and CSV datasources only but not for to_timestamp() and other functions.

for 2), have we documented it?

I have tried to document it here in the PR

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the worst case, we can revert @MaxGekk 's #26507 because of this.

@dongjoon-hyun Why? What's the worst case? I do think we should apply the fix in other places but not reverting it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cloud-fan I tried to document the restriction of to_timestmp() in the PR but @dongjoon-hyun convinced me to do not do that here, see #27438 (comment) " I believe that one thing we need is that yyyy-MM-dd HH:mm:ss.SSSS -> yyyy-MM-dd HH:mm:ss change. I recommend to focus on the above change only."

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MaxGekk . To be clear, I want to make it sure we don't have any regression at 2.4.5 as a release manager. You raised this issue and we are investigating the relevant PRs. That's all. We are considering the scope and effect. So far, we didn't make any decision.

* cast to a timestamp, such as `yyyy-MM-dd` or `yyyy-MM-dd HH:mm:ss`
* @param fmt A date time pattern detailing the format of `s` when `s` is a string
* @return A timestamp, or null if `s` was a string that could not be cast to a timestamp or
* `fmt` was an invalid format
Expand Down