Commit ae47ba7

yinxusen authored and jkbradley committed
[SPARK-12834] Change ser/de of JavaArray and JavaList
https://issues.apache.org/jira/browse/SPARK-12834

We use `SerDe.dumps()` to serialize `JavaArray` and `JavaList` in `PythonMLLibAPI`, then deserialize them with `PickleSerializer` on the Python side. However, there is no need to transform them in such an inefficient way. Instead, we can convert them by type conversion, e.g. `list(JavaArray)` or `list(JavaList)`. In addition, there is an issue with the ser/de of Scala Array, as described in https://issues.apache.org/jira/browse/SPARK-12780

Author: Xusen Yin <[email protected]>

Closes #10772 from yinxusen/SPARK-12834.
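The two conversion paths described in the message can be sketched in plain Python. This is only an illustration under stated assumptions: `FakeJavaList` is a hypothetical stand-in for a Py4J `JavaList`/`JavaArray` proxy, so the example runs without Spark or a JVM. `pickle.dumps` stands in for the bytes that `SerDe.dumps()` would produce on the JVM side. The helper names `via_pickle` and `via_type_conversion` are made up for this sketch.

```python
import pickle


class FakeJavaList:
    """Hypothetical stand-in for a Py4J JavaList/JavaArray proxy.

    A real proxy forwards iteration to the JVM; this one wraps a Python
    list so the sketch is runnable on its own.
    """

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        return iter(self._items)


def via_pickle(payload: bytes) -> list:
    # Serialize-then-deserialize path: the collection arrives as pickled
    # bytes and is unpickled on the Python side.
    return pickle.loads(payload)


def via_type_conversion(proxy) -> list:
    # Type-conversion path: iterate the proxy directly, with no
    # intermediate byte payload.
    return list(proxy)


values = [1.0, 2.0, 3.0]
old = via_pickle(pickle.dumps(values))  # bytes stand in for SerDe.dumps() output
new = via_type_conversion(FakeJavaList(values))
```

Both paths yield an ordinary Python list with the same elements; the second simply skips the pickle round trip.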
1 parent b66afde commit ae47ba7

1 file changed: +5 -1 lines changed

mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala

Lines changed: 5 additions & 1 deletion
@@ -1490,7 +1490,11 @@ private[spark] object SerDe extends Serializable {
   initialize()
 
   def dumps(obj: AnyRef): Array[Byte] = {
-    new Pickler().dumps(obj)
+    obj match {
+      // Pickler in Python side cannot deserialize Scala Array normally. See SPARK-12834.
+      case array: Array[_] => new Pickler().dumps(array.toSeq.asJava)
+      case _ => new Pickler().dumps(obj)
+    }
   }
 
   def loads(bytes: Array[Byte]): AnyRef = {
