
Conversation

@httfighter
Contributor

What changes were proposed in this pull request?

Keep the millisecond part of the time field.

How was this patch tested?

Add new test cases and update existing test cases


@srowen srowen left a comment
Member

I think this has problems in many ways. You're treating dates as doubles and hard-coding some locale-specific concerns. But mostly, this is the wrong answer, because the UNIX timestamp is in whole seconds.

@AmplabJenkins

Can one of the admins verify this patch?

@httfighter
Contributor Author

In an RDBMS, the unix_timestamp method can keep the milliseconds. For example, executing the following command:
select unix_timestamp("2017-10-10 10:10:20.111") from test;
returns the result: 1490667020.111
But Spark's native unix_timestamp method loses the milliseconds; we want to keep them.
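
For illustration, a minimal sketch of the truncation in the Spark shell (assuming Spark 2.x, where spark is the SparkSession; the explicit format string is an assumption, not from this PR):

spark.sql("SELECT unix_timestamp('2017-10-10 10:10:20.111', 'yyyy-MM-dd HH:mm:ss.SSS')").show()
// unix_timestamp returns LongType, so the result is in whole seconds and the .111 is dropped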

Seq(TypeCollection(StringType, DateType, TimestampType), StringType)

-  override def dataType: DataType = LongType
+  override def dataType: DataType = DoubleType
@HyukjinKwon
Member

BTW, I think we can't just change this data type directly. This could break backward compatibility.

@ouyangxiaochen

Since RDBMSs keep the milliseconds, we should follow them. This proposal LGTM. CC @gatorsmile

@viirya
Member

viirya commented Sep 29, 2017

We also have FromUnixTime, and the data type of unix time seems to be defined as LongType across those unix time expressions. We shouldn't change just one expression and leave an inconsistency.

As @HyukjinKwon said, we also need to avoid breaking backward compatibility.

By the way, for RDBMS support, I only found that MySQL has direct unix_timestamp support like this. It doesn't sound like a good idea to break backward compatibility just to follow one (or a few) RDBMSs.

@srowen
Member

srowen commented Sep 29, 2017

This would break compatibility with Spark and other engines like Hive. This should be closed.

@ouyangxiaochen

In fact, there are many scenarios that need to be accurate to milliseconds. Should we try to solve this problem together?

@gatorsmile
Member

The workaround is to let users write a UDF to handle these cases.
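
For instance, a minimal sketch of such a UDF in the Scala shell; the name unix_timestamp_millis and the fixed input format are illustrative assumptions, not part of this PR:

spark.udf.register("unix_timestamp_millis", (s: String) => {
  // SimpleDateFormat is not thread-safe, so build one per invocation;
  // it parses in the JVM's default time zone.
  val fmt = new java.text.SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS")
  fmt.parse(s).getTime / 1000.0  // epoch seconds with the millisecond fraction
})
spark.sql("SELECT unix_timestamp_millis('2017-10-10 10:10:20.111')").show(false)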

@httfighter
Contributor Author

I understand everyone's worries, but I have a few thoughts.
First, the native unix_timestamp itself accepts dates in the "yyyy-MM-dd HH:mm:ss.SSS" format, but the milliseconds are lost from the result when I use it. Obviously, it's a bug: it gives users wrong results.
I think this should be fixed.
Second, unix_timestamp, from_unixtime, and to_unix_timestamp all have the same issue, and these three are the only related methods. I think the unix time data type of the three methods should be defined as DoubleType, not LongType. Returning milliseconds instead would bring more problems.
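
As a point of comparison (a sketch, not part of this PR): casting a TimestampType column to DoubleType in Spark already yields epoch seconds with a fractional part, which is roughly the DoubleType semantics argued for above:

import spark.implicits._
val df = Seq("2017-10-10 10:10:20.111").toDF("s")
df.select($"s".cast("timestamp").cast("double").as("unix_ts")).show(false)
// the resulting double keeps the .111 fraction of a second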

@gatorsmile
Member

Currently, we are following Hive for these built-in functions. See https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF

Maybe we can wait and see whether more users have the same request? Then we can decide whether we should introduce new functions or a SQLConf.

@srowen
Member

srowen commented Sep 30, 2017

This itself is certainly not a bug. The type is intentional, and the answer is certainly correct given the type. You are arguing for a new function called something else, but you can also do this with a UDF.

@HyukjinKwon
Member

HyukjinKwon commented Oct 8, 2017

I'd close this for now; optionally, we could raise this case and discuss it on the mailing list if it is important.

@srowen srowen mentioned this pull request Nov 6, 2017
@asfgit asfgit closed this in ed1478c Nov 7, 2017