[SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json support converting MapType to json for PySpark and SparkR #19223
|
ok to test |
|
Test build #81739 has finished for PR 19223 at commit
|
R/pkg/R/functions.R
Outdated
| #' \code{to_json}: Converts a column containing a \code{structType} or array of \code{structType} | ||
| #' into a Column of JSON string. Resolving the Column can fail if an unsupported type is encountered. | ||
| #' \code{to_json}: Converts a column containing a \code{structType}, array of \code{structType}, | ||
| # a \code{mapType} or array of \code{mapType} into a Column of JSON string. |
Looks like the ' is missing after the first #.
python/pyspark/sql/functions.py
Outdated
| JSON string. Throws an exception, in the case of an unsupported type. | ||
| Converts a column containing a [[StructType]], [[ArrayType]] of [[StructType]]s, | ||
| a [[MapType]] or [[ArrayType]] of [[MapType]] into a JSON string. | ||
| Throws an exception, in the case of an unsupported type. |
While we are here, let's change [[StructType]] to :class:`StructType` (and the other instances too) so the Python API documentation renders nicely.
ok Thanks.
R/pkg/R/functions.R
Outdated
| #' | ||
| #' # Converts an array of maps into a JSON array | ||
| #' df2 <- sql("SELECT array(map('name', 'Bob'), map('name', 'Alice')) as people") | ||
| #' df2 <- mutate(df2, people_json = to_json(df2$people)) |
And.. missing closing parentheses for \dontrun{.
... meaning }
ok Thanks for careful review :)
| > SELECT to_json(array(named_struct('a', 1, 'b', 2)); | ||
| [{"a":1,"b":2}] | ||
| > SELECT to_json(map('a',named_struct('b',1))); | ||
| > SELECT to_json(map('a', named_struct('b', 1))); |
I think you committed an unrelated change?
Or did you forget to commit json-functions.sql?
Umm, I modified the ExpressionDescription of StructsToJson following @HyukjinKwon's suggestions, which weren't merged in the last PR. There is a test that runs describe function extended to_json, so I needed to regenerate the golden file for it. So this change isn't from json-functions.sql.
Oh. I see.
…goldmedal/spark into SPARK-21513-fp-PySaprkAndSparkR
python/pyspark/sql/functions.py
Outdated
| """ | ||
| Converts a column containing a [[StructType]] or [[ArrayType]] of [[StructType]]s into a | ||
| JSON string. Throws an exception, in the case of an unsupported type. | ||
| Converts a column containing a :class:`StructType, :class:`ArrayType` of :class:`StructType`s, |
Missing a ` after StructType.
|
LGTM except for one comment left. |
|
Test build #81766 has finished for PR 19223 at commit
|
|
Test build #81761 has finished for PR 19223 at commit
|
|
retest this please. |
|
Test build #81769 has finished for PR 19223 at commit
|
|
@HyukjinKwon @felixcheung @viirya |
HyukjinKwon
left a comment
LGTM. @felixcheung, do you have any more comments?
python/pyspark/sql/functions.py
Outdated
| """ | ||
| Converts a column containing a [[StructType]] or [[ArrayType]] of [[StructType]]s into a | ||
| JSON string. Throws an exception, in the case of an unsupported type. | ||
| Converts a column containing a :class:`StructType`, :class:`ArrayType` of :class:`StructType`s, |
OK, I'll modify it. I'm not really familiar with Python, so thanks for your suggestions. :)
That's fine :).
python/pyspark/sql/functions.py
Outdated
| Converts a column containing a [[StructType]] or [[ArrayType]] of [[StructType]]s into a | ||
| JSON string. Throws an exception, in the case of an unsupported type. | ||
| Converts a column containing a :class:`StructType`, :class:`ArrayType` of :class:`StructType`s, | ||
| a :class:`MapType` or :class:`ArrayType` of :class:`MapType` into a JSON string. |
:class:`MapType` -> :class:`MapType`\\s for consistency.
|
Test build #81780 has finished for PR 19223 at commit
|
|
Test build #81781 has finished for PR 19223 at commit
|
|
LGTM |
|
AppVeyor didn't run on this?
|
|
D'oh, yes. I wonder why it was not triggered. I manually triggered it via my account: Build started: [SparkR] |
|
@HyukjinKwon Thanks for triggering AppVeyor. In the normal case, is AppVeyor triggered automatically? |
|
Yes, when there are some changes in: Lines 29 to 35 in 828fab0
It should run the R tests on Windows via AppVeyor. |
|
ok. I got it. Thanks :) |
|
Looks like it passed fine. Let me merge this one. |
|
Thanks @felixcheung @HyukjinKwon |
|
Merged to master. |
|
Thanks @HyukjinKwon @felixcheung @viirya |

What changes were proposed in this pull request?
In the previous work, SPARK-21513, we allowed `MapType` and `ArrayType` of `MapType`s to be converted to a JSON string, but only in the Scala API. In this follow-up PR, we make Spark SQL support it for PySpark and SparkR as well. We also fix some small bugs and comments left over from the previous work.
For PySpark
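A minimal sketch of how the PySpark side might be exercised once `to_json` accepts map columns. The helper names `expected_json` and `run_with_spark`, the sample rows, and the column names are illustrative assumptions, not taken from this PR's tests. `expected_json` only uses Python's `json` module to show the compact shape seen in the SQL examples in this thread; it is not Spark's implementation.

```python
import json


def expected_json(value):
    """Compact JSON text for a dict or list of dicts.

    Assumption: this mirrors the compact form shown in the SQL examples
    in this thread, e.g. [{"a":1,"b":2}]. It does not call Spark.
    """
    return json.dumps(value, separators=(",", ":"))


def run_with_spark():
    """Apply to_json to a map column and to an array-of-maps column.

    Requires a local PySpark installation; the imports are kept inside
    the function so expected_json can be used without Spark.
    """
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import to_json

    spark = (
        SparkSession.builder.master("local[1]")
        .appName("to_json-map-sketch")  # hypothetical app name
        .getOrCreate()
    )
    try:
        # A Python dict is inferred as a MapType column.
        df = spark.createDataFrame([(1, {"name": "Alice"})], ("key", "value"))
        map_json = df.select(to_json(df.value).alias("json")).first()["json"]

        # A list of dicts is inferred as an ArrayType of MapType column.
        df2 = spark.createDataFrame(
            [(1, [{"name": "Alice"}, {"name": "Bob"}])], ("key", "value")
        )
        array_json = df2.select(to_json(df2.value).alias("json")).first()["json"]
        return map_json, array_json
    finally:
        spark.stop()
```

Calling `run_with_spark()` on a machine with Spark available should return two JSON strings, which can then be compared against `expected_json({"name": "Alice"})` and `expected_json([{"name": "Alice"}, {"name": "Bob"}])`.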
For SparkR
How was this patch tested?
Add unit test cases.
cc @viirya @HyukjinKwon