Commit 17d1071

yinxusen authored and jkbradley committed
[SPARK-12834][ML][PYTHON][BACKPORT] Change ser/de of JavaArray and JavaList
Backport of SPARK-12834 for branch-1.6

Original PR: #10772

Original commit message:
We use `SerDe.dumps()` to serialize `JavaArray` and `JavaList` in `PythonMLLibAPI`, then deserialize them with `PickleSerializer` on the Python side. However, there is no need to transform them in such an inefficient way. Instead, we can use type conversion to convert them, e.g. `list(JavaArray)` or `list(JavaList)`. What's more, there is an issue with ser/de of Scala Array, as I said in https://issues.apache.org/jira/browse/SPARK-12780

Author: Xusen Yin <[email protected]>

Closes #10941 from jkbradley/yinxusen-SPARK-12834-1.6.
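The contrast the commit message draws, a serialize/deserialize round trip versus plain type conversion, can be sketched in Python without Spark or Py4J. This is only an illustration under stated assumptions: `FakeJavaArray` is a hypothetical stand-in for a Py4J `JavaArray` (modeled here as anything sized and indexable), and the standard-library `pickle` module stands in for the `SerDe.dumps()` / `PickleSerializer` pair. Neither helper function is part of Spark.

```python
import pickle


class FakeJavaArray:
    """Hypothetical stand-in for a Py4J JavaArray: sized and indexable."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        # Raises IndexError past the end, which is what lets list() iterate it.
        return self._items[index]


def via_pickle_round_trip(values):
    # Old approach in spirit: serialize to bytes, then deserialize again.
    payload = pickle.dumps([values[i] for i in range(len(values))])
    return pickle.loads(payload)


def via_type_conversion(values):
    # New approach in spirit: iterate the sequence straight into a list.
    return list(values)


arr = FakeJavaArray([1.0, 2.0, 3.0])
round_tripped = via_pickle_round_trip(arr)
converted = via_type_conversion(arr)
```

Both paths yield an equal Python list. The second skips the intermediate byte buffer, which is the inefficiency the commit message refers to.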
1 parent 85518ed commit 17d1071

1 file changed, +5 -1 lines changed

mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala

Lines changed: 5 additions & 1 deletion
@@ -1464,7 +1464,11 @@ private[spark] object SerDe extends Serializable {
   initialize()
 
   def dumps(obj: AnyRef): Array[Byte] = {
-    new Pickler().dumps(obj)
+    obj match {
+      // Pickler in Python side cannot deserialize Scala Array normally. See SPARK-12834.
+      case array: Array[_] => new Pickler().dumps(array.toSeq.asJava)
+      case _ => new Pickler().dumps(obj)
+    }
   }
 
   def loads(bytes: Array[Byte]): AnyRef = {
