-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-10986][Mesos] Set the context class loader in the Mesos executor backend. #9282
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This fixes the `ClassNotFoundException` for Spark classes in the serializer.
|
👍 Tested on a local Mesos cluster (fine-grained mode). |
|
Test build #44364 has finished for PR 9282 at commit
|
|
@srowen who could have a look at this one? It's a complete blocker for Mesos, fine-grained mode is the default deployment option right now. |
|
Hm, I don't know much about this part. Does this mirror similar approaches in other code? that would be good evidence that it's a good idea. How is it handled in other similar code? Usually it's better to explicitly handle classloaders where it matters rather than set the context classloader, but it could be the right approach. |
|
There's no clear answer, unfortunately. The serializer has a "default class loader" that is set explicitly only when using the REPL. In the other cases it's done through the context class loader, for example in SparkSubmit. In the coarse-grained mode there seems to be some explicit class loader handling, but in the debugger I noticed it's due to another context class loader (seems to be set by a Hadoop class via |
|
OK, I'm tentatively OK with merging this if it clearly fixes a problem, doesn't cause others (at least tests pass), you both think it's the right thing, and it resembles a similar approach elsewhere. |
|
Just curious why this suddenly becomes a problem, do you have any idea what caused this? |
|
And also +1 to merge this to fix users problems as well |
|
The ticket links to the PR that added the Netty-based RPC protocol. It sounds plausible, since the exception is thrown while initializing the Netty RPC environment, but I didn't try to revert it and see that it's really this PR. |
|
retest this please |
|
Looks like this is a regression from 1.5.1 so we should definitely fix it. Even though this change is only one line it could change a lot of things so I'd prefer to err on the conservative side. Can we verify that it doesn't cause any new regressions? @dragos can you explain to us the root cause of the issue? |
|
The serializer is delegating to the context class loader for instantiating classes it receives on the wire. When this class loader is missing ( |
|
Test build #44591 has finished for PR 9282 at commit
|
|
Hm, I'm inclined to go ahead and merge this for master, though I'm also wary of the implications. The problem seems clear, the explanation seems clear and the change is targeted. Tests pass. We may not know more until interested parties here can try this out on Mesos. |
|
+1 |
|
merged to master |
|
received ClassNotFound error in Yarn-Cluster mode(spark 1.6.2),doesn't reproduce the problem |
See SPARK-10986 for details.
This fixes the
ClassNotFoundExceptionfor Spark classes in the serializer.I am not sure this is the right way to handle the class loader, but I couldn't find any documentation on how the context class loader is used and who relies on it. It seems at least the serializer uses it to instantiate classes during deserialization.
I am open to suggestions (I tried this fix on a real Mesos cluster and it does fix the issue).
@tnachen @andrewor14