SPARK-16636 Add CalendarIntervalType to documentation #16747
Conversation

Can one of the admins verify this patch?

@HyukjinKwon is this OK by you?

I am OK, but I remember there were some discussions about whether this type should be exposed or not, and I could not track down the conclusion. I remember seeing @rxin there, IIRC.

CC @cloud-fan for #13008 (comment) and @yhuai for #8597 (comment), as those might be what you're referring to?

Actually, mine was in the JIRA (the comment was something like "do we really want to support this as an external ..." IIRC; sorry, I can't find the JIRA). It seems there are several discussions here and there. Maybe #15751 (comment) is related too, because it is about supporting reading/writing that type, where it might imply that we can explicitly give the schema with that type. cc @marmbrus too.

(FWIW, I am OK, but just worried that it might be supposed to be an internal type, maybe in the future.)

@srowen As I understood ...

^ I want to be very sure whether we are going to expose this or not. Could any SQL committer or PMC member confirm this? This means a lot of things; for example,

```scala
scala> import org.apache.spark.sql.types.CalendarIntervalType
import org.apache.spark.sql.types.CalendarIntervalType
```

and so on.
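
To make the implication concrete, here is a minimal sketch (an illustration, not from the thread; the helper name is hypothetical) of user code that only becomes possible once `CalendarIntervalType` is public:

```scala
import org.apache.spark.sql.types.{CalendarIntervalType, DataType}

// Hypothetical helper: with the type exposed, user code can branch on it.
// CalendarIntervalType is a case object, so it pattern-matches directly.
def isCalendarInterval(dt: DataType): Boolean = dt match {
  case CalendarIntervalType => true
  case _                    => false
}
```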

Actually, I'm OK to add documentation for it.

Then, it looks okay to me as describing the current state. I just checked it after building the docs with this, and we can already use the type as below:

```scala
scala> sql("SELECT interval 1 second").schema(0).dataType.getClass
res0: Class[_ <: org.apache.spark.sql.types.DataType] = class org.apache.spark.sql.types.CalendarIntervalType$

scala> sql("SELECT interval 1 second").collect()(0).get(0).getClass
res1: Class[_] = class org.apache.spark.unsafe.types.CalendarInterval

scala> val rdd = spark.sparkContext.parallelize(Seq(Row(new CalendarInterval(0, 0))))
rdd: org.apache.spark.rdd.RDD[org.apache.spark.sql.Row] = ParallelCollectionRDD[0] at parallelize at <console>:32

scala> spark.createDataFrame(rdd, StructType(StructField("a", CalendarIntervalType) :: Nil))
res1: org.apache.spark.sql.DataFrame = [a: calendarinterval]
```

Another meta concern: maybe we should just describe it as a SQL-dedicated type, or as not supported for now, with some ...
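
For reference, a self-contained sketch of the same experiment (my own reconstruction, assuming Spark 2.x, where `CalendarInterval` takes `(months, microseconds)` and a `SparkSession` named `spark` is in scope):

```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{CalendarIntervalType, StructField, StructType}
import org.apache.spark.unsafe.types.CalendarInterval

// Build a one-row DataFrame whose single column is a calendar interval.
val schema = StructType(StructField("a", CalendarIntervalType) :: Nil)
val rdd = spark.sparkContext.parallelize(Seq(Row(new CalendarInterval(1, 1000000L))))
val df = spark.createDataFrame(rdd, schema)
df.printSchema() // the column should print as "a: calendarinterval"
```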

CC @rxin, if we are going to expose ...

@terma To avoid confusing Spark SQL users, we might prefer not to document it. How about closing this PR now? Thanks!

What changes were proposed in this pull request?

Add `CalendarIntervalType` to the SQL data types section of the documentation.
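
As a minimal illustration of where the documented type surfaces (my own example, assuming a running `SparkSession` named `spark` and Spark 2.x-era behavior):

```scala
// An interval literal yields a column of CalendarIntervalType; one common
// use is adding it to a timestamp.
spark.sql("SELECT interval 1 second AS i").printSchema() // i: calendarinterval
spark.sql("SELECT current_timestamp() + interval 1 day AS tomorrow").show()
```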
How was this patch tested?

Unit tests.

@marmbrus please review