SPARK-1877: ClassNotFoundException when loading RDD with serialized obje... #821
|
Can one of the admins verify this patch? |
|
Jenkins, this is ok to test |
|
Merged build triggered. |
|
Merged build started. |
|
Merged build finished. |
|
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15086/ |
|
Is it possible to add a unit test that highlights this problem? Since it works locally, you can try using "local-cluster" mode (see DistributedSuite if you don't know what it is) to reproduce and test it. |
Could you change this to Utils.getContextOrSparkClassLoader? It's not always the case that the context classloader is set, and the utility function deals with that correctly.
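The fallback this comment asks for can be sketched in plain Java. This is only an illustration of the idea, not Spark's actual `Utils` code: the class `ClassLoaders` and the method `contextOrFallback` are hypothetical names, and using the loader of an anchor class as the fallback is an assumption.

```java
/** Hypothetical helper illustrating "context classloader, else a known fallback". */
final class ClassLoaders {
    private ClassLoaders() {}

    /**
     * Returns the current thread's context classloader if one is set,
     * otherwise the loader that loaded the given anchor class.
     */
    static ClassLoader contextOrFallback(Class<?> anchor) {
        ClassLoader ctx = Thread.currentThread().getContextClassLoader();
        // The context classloader may legitimately be null on some threads.
        return ctx != null ? ctx : anchor.getClassLoader();
    }
}
```

Calling `ClassLoaders.contextOrFallback(SomeClass.class)` on a thread without a context classloader then yields the loader of `SomeClass` instead of `null`.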
|
LGTM, pending a small change (we have a utility function to get the classloader). |
…objects Updated version of #821 Author: Tathagata Das <[email protected]> Author: Ghidireac <[email protected]> Closes #835 from tdas/SPARK-1877 and squashes the following commits: f346f71 [Tathagata Das] Addressed Patrick's comments. fee0c5d [Ghidireac] SPARK-1877: ClassNotFoundException when loading RDD with serialized objects (cherry picked from commit 52eb54d) Signed-off-by: Tathagata Das <[email protected]>
…with serialized objects
|
Merged build triggered. |
|
Merged build started. |
|
I replaced Thread.getContextClassLoader with Utils.getContextOrSparkClassLoader. |
|
Merged build finished. All automated tests passed. |
|
All automated tests passed. |
When I load an RDD that contains custom serialized objects, Spark throws a ClassNotFoundException. This happens only when Spark is deployed as a standalone cluster; it works fine when Spark runs locally.
I debugged the issue and noticed that ObjectInputStream.resolveClass does not use the ExecutorURLClassLoader set by SparkSubmit. The classloader has to be set explicitly on the ObjectInputStream that SparkContext.objectFile uses when deserializing objects.
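A minimal sketch of the general technique, for illustration only: subclass `ObjectInputStream` and override `resolveClass` so that classes are looked up through a caller-supplied classloader. This is not the code from the patch; the class `LoaderAwareObjectInputStream` and its `serialize`/`deserialize` helpers are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;

/** Hypothetical ObjectInputStream that resolves classes through a given loader. */
class LoaderAwareObjectInputStream extends ObjectInputStream {
    private final ClassLoader loader;

    LoaderAwareObjectInputStream(InputStream in, ClassLoader loader) throws IOException {
        super(in);
        this.loader = loader;
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
        try {
            // Look the class up in the supplied loader instead of the default one.
            return Class.forName(desc.getName(), false, loader);
        } catch (ClassNotFoundException e) {
            // Fall back to the default resolution (also covers primitive types).
            return super.resolveClass(desc);
        }
    }

    /** Serializes an object to a byte array with standard Java serialization. */
    static byte[] serialize(Object value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        }
        return buffer.toByteArray();
    }

    /** Deserializes an object, resolving its classes through the given loader. */
    static Object deserialize(byte[] bytes, ClassLoader loader)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new LoaderAwareObjectInputStream(new ByteArrayInputStream(bytes), loader)) {
            return in.readObject();
        }
    }
}
```

In a setting like the one described, the loader passed to `deserialize` would be the one that can see the user's classes, for example the thread's context classloader when it is set.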