Skip to content

Conversation

@vanzin
Copy link
Contributor

@vanzin vanzin commented Oct 22, 2015

"Client mode" means the RPC env will not listen for incoming connections.
This allows certain processes in the Spark stack (such as Executors or
tha YARN client-mode AM) to act as pure clients when using the netty-based
RPC backend, reducing the number of sockets needed by the app and also the
number of open ports.

Client connections are also preferred when endpoints that actually have
a listening socket are involved; so, for example, if a Worker connects
to a Master and the Master needs to send a message to a Worker endpoint,
that client connection will be used, even though the Worker is also
listening for incoming connections.

With this change, the workaround for SPARK-10987 isn't necessary anymore, and
is removed. The AM connects to the driver in "client mode", and that connection
is used for all driver <-> AM communication, and so the AM is properly notified
when the connection goes down.

"Client mode" means the RPC env will not listen for incoming connections.
This allows certain processes in the Spark stack (such as Executors or
tha YARN client-mode AM) to act as pure clients when using the netty-based
RPC backend, reducing the number of sockets needed by the app and also the
number of open ports.

Client connections are also preferred when endpoints that actually have
a listening socket are involved; so, for example, if a Worker connects
to a Master and the Master needs to send a message to a Worker endpoint,
that client connection will be used, even though the Worker is also
listening for incoming connections.

With this change, the workaround for SPARK-10987 isn't necessary anymore, and
is removed. The AM connects to the driver in "client mode", and that connection
is used for all driver <-> AM communication, and so the AM is properly notified
when the connection goes down.

This change also removes the workaround for SPARK-10987.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: this is unrelated but sbt started complaining about this dependency while testing this change.

@vanzin
Copy link
Contributor Author

vanzin commented Oct 22, 2015

/cc @rxin @zsxwing

I tested this on a real YARN cluster with dynamic allocation (= also used the external shuffle service), everything looks fine.

@SparkQA
Copy link

SparkQA commented Oct 22, 2015

Test build #44110 has finished for PR 9210 at commit 48f3c55.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):\n * case class RegisteredExecutor(hostname: String) extends CoarseGrainedClusterMessage\n

@vanzin
Copy link
Contributor Author

vanzin commented Oct 22, 2015

retest this please

@SparkQA
Copy link

SparkQA commented Oct 22, 2015

Test build #44127 has finished for PR 9210 at commit 48f3c55.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):\n * case class RegisteredExecutor(hostname: String) extends CoarseGrainedClusterMessage\n

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: _address and _name can be val.

@zsxwing
Copy link
Member

zsxwing commented Oct 22, 2015

Just a nit. Otherwise LGTM

@SparkQA
Copy link

SparkQA commented Oct 23, 2015

Test build #44185 has finished for PR 9210 at commit da4d824.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):\n * case class RegisteredExecutor(hostname: String) extends CoarseGrainedClusterMessage\n

@vanzin
Copy link
Contributor Author

vanzin commented Oct 23, 2015

retest this please

@SparkQA
Copy link

SparkQA commented Oct 23, 2015

Test build #44187 has finished for PR 9210 at commit da4d824.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):\n * case class RegisteredExecutor(hostname: String) extends CoarseGrainedClusterMessage\n

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add some comment on what this is checking and when this can happen

Marcelo Vanzin added 2 commits October 22, 2015 21:44
Conflicts:
	core/src/main/scala/org/apache/spark/rpc/netty/NettyRpcEnv.scala
@SparkQA
Copy link

SparkQA commented Oct 23, 2015

Test build #44200 has finished for PR 9210 at commit 418b472.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):\n * case class RegisteredExecutor(hostname: String) extends CoarseGrainedClusterMessage\n

@SparkQA
Copy link

SparkQA commented Oct 23, 2015

Test build #44199 has finished for PR 9210 at commit 11d9092.

  • This patch passes all tests.
  • This patch does not merge cleanly.
  • This patch adds the following public classes (experimental):\n * case class RegisteredExecutor(hostname: String) extends CoarseGrainedClusterMessage\n

@SparkQA
Copy link

SparkQA commented Oct 24, 2015

Test build #44298 has finished for PR 9210 at commit acce31f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):\n * case class RegisteredExecutor(hostname: String) extends CoarseGrainedClusterMessage\n

@vanzin
Copy link
Contributor Author

vanzin commented Oct 28, 2015

hi there, any remaining feedback here?

@vanzin
Copy link
Contributor Author

vanzin commented Nov 1, 2015

retest this please

@SparkQA
Copy link

SparkQA commented Nov 2, 2015

Test build #44778 has finished for PR 9210 at commit acce31f.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):\n * case class RegisteredExecutor(hostname: String) extends CoarseGrainedClusterMessage\n

@vanzin
Copy link
Contributor Author

vanzin commented Nov 2, 2015

hmm, akka test... it's failed for me in unrelated changes, also. retest this please.

@SparkQA
Copy link

SparkQA commented Nov 2, 2015

Test build #44787 has finished for PR 9210 at commit acce31f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):\n * case class RegisteredExecutor(hostname: String) extends CoarseGrainedClusterMessage\n

@vanzin
Copy link
Contributor Author

vanzin commented Nov 2, 2015

Merging this to master.

@asfgit asfgit closed this in 71d1c90 Nov 2, 2015
@zsxwing
Copy link
Member

zsxwing commented Nov 2, 2015

Just took another look. LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants