[SPARK-10997] [core] Add "client mode" to netty rpc env. #9210
Conversation
"Client mode" means the RPC env will not listen for incoming connections. This allows certain processes in the Spark stack (such as Executors or tha YARN client-mode AM) to act as pure clients when using the netty-based RPC backend, reducing the number of sockets needed by the app and also the number of open ports. Client connections are also preferred when endpoints that actually have a listening socket are involved; so, for example, if a Worker connects to a Master and the Master needs to send a message to a Worker endpoint, that client connection will be used, even though the Worker is also listening for incoming connections. With this change, the workaround for SPARK-10987 isn't necessary anymore, and is removed. The AM connects to the driver in "client mode", and that connection is used for all driver <-> AM communication, and so the AM is properly notified when the connection goes down. This change also removes the workaround for SPARK-10987.
Note: this is unrelated, but sbt started complaining about this dependency while testing this change.
Test build #44110 has finished for PR 9210 at commit
retest this please
Test build #44127 has finished for PR 9210 at commit
nit: _address and _name can be val.
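A hypothetical, self-contained illustration of the nit (the surrounding class is invented; only the field names come from the comment): a field that is assigned exactly once can be declared `val` instead of `var`, which documents the intent and lets the compiler enforce it.

```scala
// Invented example class; not the actual netty RPC code.
class EndpointRefSketch(addr: String, endpointName: String) {
  // Hypothetically were: private var _address ... / private var _name ...
  // Assigned once at construction, so val is sufficient and safer.
  private val _address: String = addr
  private val _name: String = endpointName

  def address: String = _address
  def name: String = _name
}

object EndpointRefSketchDemo extends App {
  val ref = new EndpointRefSketch("spark://master:7077", "worker")
  println(s"${ref.name} @ ${ref.address}")
}
```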
Just a nit. Otherwise LGTM
Test build #44185 has finished for PR 9210 at commit
retest this please
Test build #44187 has finished for PR 9210 at commit
Can you add a comment explaining what this is checking and when it can happen?
Conflicts: core/src/main/scala/org/apache/spark/rpc/netty/NettyRpcEnv.scala
Test build #44200 has finished for PR 9210 at commit
Test build #44199 has finished for PR 9210 at commit
Test build #44298 has finished for PR 9210 at commit
hi there, any remaining feedback here?
retest this please
Test build #44778 has finished for PR 9210 at commit
hmm, an Akka test... it has failed for me on unrelated changes too. retest this please.
Test build #44787 has finished for PR 9210 at commit
Merging this to master.
Just took another look. LGTM
"Client mode" means the RPC env will not listen for incoming connections.
This allows certain processes in the Spark stack (such as Executors or
tha YARN client-mode AM) to act as pure clients when using the netty-based
RPC backend, reducing the number of sockets needed by the app and also the
number of open ports.
Client connections are also preferred when endpoints that actually have
a listening socket are involved; so, for example, if a Worker connects
to a Master and the Master needs to send a message to a Worker endpoint,
that client connection will be used, even though the Worker is also
listening for incoming connections.
With this change, the workaround for SPARK-10987 isn't necessary anymore, and
is removed. The AM connects to the driver in "client mode", and that connection
is used for all driver <-> AM communication, and so the AM is properly notified
when the connection goes down.