@ghidi

@ghidi ghidi commented May 19, 2014

When I load an RDD that contains custom serialized objects, Spark throws a ClassNotFoundException. This happens only when Spark is deployed as a standalone cluster; it works fine when Spark runs locally.

I debugged the issue and noticed that ObjectInputStream.resolveClass does not use the ExecutorURLClassLoader set by SparkSubmit. The classloader has to be set explicitly on the ObjectInputStream that SparkContext.objectFile uses when deserializing objects.
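
For readers unfamiliar with the mechanism, here is a minimal sketch in plain Java of what "explicitly setting the classloader" on an ObjectInputStream can look like. It is not the code from this patch: the class name `LoaderAwareObjectInputStream` and the `roundTrip` helper are hypothetical, and the sketch only assumes the standard `java.io` API, where `resolveClass` can be overridden to look classes up in a loader of your choice.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;

// Hypothetical helper, not Spark's code: an ObjectInputStream that resolves
// classes through an explicitly supplied classloader.
class LoaderAwareObjectInputStream extends ObjectInputStream {
    private final ClassLoader loader;

    LoaderAwareObjectInputStream(InputStream in, ClassLoader loader) throws IOException {
        super(in);
        this.loader = loader;
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
        try {
            // Look the class up in the supplied loader instead of the default one.
            return Class.forName(desc.getName(), false, loader);
        } catch (ClassNotFoundException e) {
            // Fall back to the default lookup, which also handles primitive types.
            return super.resolveClass(desc);
        }
    }

    // Serializes a value and reads it back through the given loader.
    static Object roundTrip(Object value, ClassLoader loader) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in = new LoaderAwareObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()), loader)) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }
}
```

In a cluster setting the loader passed in would be one that can see the user's jars; in this sketch any loader that can see the serialized class will do.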

@AmplabJenkins

Can one of the admins verify this patch?

@mateiz
Contributor

mateiz commented May 19, 2014

Jenkins, this is ok to test

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished.

@AmplabJenkins

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15086/

@tdas
Contributor

tdas commented May 19, 2014

Is it possible to provide a unit test that highlights this problem? Since it works locally, you can try using "local-cluster" mode (see DistributedSuite if you don't know what it is) to reproduce and test it.

Contributor

Could you change this to Utils.getContextOrSparkClassLoader? It's not always the case that the context classloader is set, and the utility function deals with that correctly.
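
The concern here is that `Thread.currentThread().getContextClassLoader()` may return null. A minimal sketch of the fallback idea in plain Java follows. It is an illustration under stated assumptions, not Spark's implementation of `Utils.getContextOrSparkClassLoader`: the class `ClassLoaders` and its method names are hypothetical.

```java
// Hypothetical helper, not Spark's Utils: prefer the thread's context
// classloader, and fall back to another loader when none is set.
final class ClassLoaders {
    private ClassLoaders() {}

    // Returns the context classloader if one is set, otherwise the fallback.
    static ClassLoader contextOrDefault(ClassLoader fallback) {
        ClassLoader context = Thread.currentThread().getContextClassLoader();
        return context != null ? context : fallback;
    }

    // Convenience variant: fall back to the loader that loaded this class.
    static ClassLoader contextOrOwn() {
        return contextOrDefault(ClassLoaders.class.getClassLoader());
    }
}
```

Callers then never have to null-check the context classloader themselves, which is the point of routing the lookup through a single utility function.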

@pwendell
Contributor

LGTM - pending a small change (we have a utility function to get the classloader).

asfgit pushed a commit that referenced this pull request May 20, 2014
…objects

Updated version of #821

Author: Tathagata Das <[email protected]>
Author: Ghidireac <[email protected]>

Closes #835 from tdas/SPARK-1877 and squashes the following commits:

f346f71 [Tathagata Das] Addressed Patrick's comments.
fee0c5d [Ghidireac] SPARK-1877: ClassNotFoundException when loading RDD with serialized objects

(cherry picked from commit 52eb54d)
Signed-off-by: Tathagata Das <[email protected]>
@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@ghidi
Author

ghidi commented May 20, 2014

I replaced Thread.getContextClassLoader with Utils.getContextOrSparkClassLoader.

@tdas
Contributor

tdas commented May 20, 2014

Hey @ghidi

Sorry, I should have mentioned this. To speed up the process (so that I can cut another RC for Spark 1.0), I cloned your branch, made the fix myself, and merged it. That was PR #835, and it includes your commit.

If you don't mind, could you close this PR?

@ghidi ghidi closed this May 20, 2014
@AmplabJenkins

Merged build finished. All automated tests passed.

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15105/

pdeyhim pushed a commit to pdeyhim/spark-1 that referenced this pull request Jun 25, 2014
turboFei pushed a commit to turboFei/spark that referenced this pull request Nov 6, 2025