This repository was archived by the owner on Jan 9, 2020. It is now read-only.

mccheah commented Jul 18, 2017

No description provided.

for (i <- 1 to MAX_SERVER_START_ATTEMPTS) {
  serverPort = new ServerSocket(0).getLocalPort
  try {
    server = new ResourceStagingServer(serverPort, serviceImpl, sslOptionsProvider)

We need to break out of the loop if this succeeds; otherwise the remaining attempts still run.
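
One way to get that behaviour is to let the loop condition itself stop on the first success. The sketch below is an assumption about the shape of such a fix, not the code that was merged; `RetryUntilStarted` and `firstSuccess` are hypothetical names, and `start` stands in for constructing and starting the server.

```scala
import scala.util.Try

object RetryUntilStarted {
  // Hypothetical helper: calls `start` with the attempt number (1-based) up to
  // `maxAttempts` times and returns the first successful result. No further
  // attempts run once one has succeeded.
  def firstSuccess[A](maxAttempts: Int)(start: Int => A): Option[A] = {
    var result: Option[A] = None
    var attempt = 1
    while (result.isEmpty && attempt <= maxAttempts) {
      // Try only captures non-fatal exceptions, e.g. java.net.BindException.
      result = Try(start(attempt)).toOption
      attempt += 1
    }
    result
  }
}
```

A `while` loop with an explicit condition avoids needing `scala.util.control.Breaks`, which signals `break` by throwing and can be swallowed by a broad `catch` inside the loop body.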

ash211 commented Jul 18, 2017

Saw this exception when running tests:

sbt.ForkMain$ForkError: java.net.BindException: Address already in use
	at sun.nio.ch.Net.bind0(Native Method)
	at sun.nio.ch.Net.bind(Net.java:433)
	at sun.nio.ch.Net.bind(Net.java:425)
	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
	at org.eclipse.jetty.server.ServerConnector.open(ServerConnector.java:298)
	at org.eclipse.jetty.server.AbstractNetworkConnector.doStart(AbstractNetworkConnector.java:80)
	at org.eclipse.jetty.server.ServerConnector.doStart(ServerConnector.java:236)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
	at org.eclipse.jetty.server.Server.doStart(Server.java:431)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
	at org.apache.spark.deploy.rest.kubernetes.ResourceStagingServer.start(ResourceStagingServer.scala:83)
	at org.apache.spark.deploy.rest.kubernetes.ResourceStagingServerSuite$$anonfun$3.apply$mcV$sp(ResourceStagingServerSuite.scala:65)
	at org.apache.spark.deploy.rest.kubernetes.ResourceStagingServerSuite$$anonfun$3.apply(ResourceStagingServerSuite.scala:64)
	at org.apache.spark.deploy.rest.kubernetes.ResourceStagingServerSuite$$anonfun$3.apply(ResourceStagingServerSuite.scala:64)
	at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
	at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
	at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
	at org.scalatest.Transformer.apply(Transformer.scala:22)
	at org.scalatest.Transformer.apply(Transformer.scala:20)
	at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166)
	at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:68)
	at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163)
	at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
	at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
	at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
	at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175)
	at org.apache.spark.deploy.rest.kubernetes.ResourceStagingServerSuite.org$scalatest$BeforeAndAfter$$super$runTest(ResourceStagingServerSuite.scala:43)
	at org.scalatest.BeforeAndAfter$class.runTest(BeforeAndAfter.scala:200)
	at org.apache.spark.deploy.rest.kubernetes.ResourceStagingServerSuite.runTest(ResourceStagingServerSuite.scala:43)
	at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
	at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
	at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413)
	at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401)
	at scala.collection.immutable.List.foreach(List.scala:381)
	at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
	at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:396)
	at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:483)
	at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:208)
	at org.scalatest.FunSuite.runTests(FunSuite.scala:1555)
	at org.scalatest.Suite$class.run(Suite.scala:1424)
	at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1555)
	at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212)
	at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212)
	at org.scalatest.SuperEngine.runImpl(Engine.scala:545)
	at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:212)
	at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterAll$$super$run(SparkFunSuite.scala:31)
	at org.scalatest.BeforeAndAfterAll$class.liftedTree1$1(BeforeAndAfterAll.scala:257)
	at org.scalatest.BeforeAndAfterAll$class.run(BeforeAndAfterAll.scala:256)
	at org.apache.spark.deploy.rest.kubernetes.ResourceStagingServerSuite.org$scalatest$BeforeAndAfter$$super$run(ResourceStagingServerSuite.scala:43)
	at org.scalatest.BeforeAndAfter$class.run(BeforeAndAfter.scala:241)
	at org.apache.spark.deploy.rest.kubernetes.ResourceStagingServerSuite.run(ResourceStagingServerSuite.scala:43)
	at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:357)
	at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:502)
	at sbt.ForkMain$Run$2.call(ForkMain.java:296)
	at sbt.ForkMain$Run$2.call(ForkMain.java:286)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

ash211 commented Jul 18, 2017

@mccheah unit tests failed with:

- Accept file and jar uploads and downloads *** FAILED ***
  java.net.BindException: Address already in use
  at sun.nio.ch.Net.bind0(Native Method)
  at sun.nio.ch.Net.bind(Net.java:433)
  at sun.nio.ch.Net.bind(Net.java:425)
  at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
  at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
  at org.eclipse.jetty.server.ServerConnector.open(ServerConnector.java:321)
  at org.eclipse.jetty.server.AbstractNetworkConnector.doStart(AbstractNetworkConnector.java:80)
  at org.eclipse.jetty.server.ServerConnector.doStart(ServerConnector.java:236)
  at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
  at org.eclipse.jetty.server.Server.doStart(Server.java:366)
  ...
- Enable SSL on the server *** FAILED ***
  java.net.BindException: Address already in use
  at sun.nio.ch.Net.bind0(Native Method)
  at sun.nio.ch.Net.bind(Net.java:433)
  at sun.nio.ch.Net.bind(Net.java:425)
  at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
  at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
  at org.eclipse.jetty.server.ServerConnector.open(ServerConnector.java:321)
  at org.eclipse.jetty.server.AbstractNetworkConnector.doStart(AbstractNetworkConnector.java:80)
  at org.eclipse.jetty.server.ServerConnector.doStart(ServerConnector.java:236)
  at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
  at org.eclipse.jetty.server.Server.doStart(Server.java:366)

http://spark-k8s-jenkins.pepperdata.org:8080/job/PR-spark-k8s-unit-tests/724/consoleFull#13762536604b09c7b-0d94-4ce5-8a08-8f343248b3d8

mccheah commented Jul 19, 2017

We have to start the server in the try block before concluding that the port is good to use.
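
To illustrate the point: a port number obtained by probing can already be in use by the time the real server binds, so only a successful bind shows that the port is usable. The sketch below uses a plain `java.net.ServerSocket` as a stand-in for the staging server; `BindInsideTry` and `tryBind` are hypothetical names, and binding to the loopback address is an assumption made for the example.

```scala
import java.net.{BindException, InetAddress, ServerSocket}

object BindInsideTry {
  // The bind itself is the availability check. Right(socket) means the port is
  // held by us; Left(e) means something else was already bound to it.
  def tryBind(port: Int): Either[BindException, ServerSocket] =
    try {
      Right(new ServerSocket(port, 50, InetAddress.getLoopbackAddress))
    } catch {
      case e: BindException => Left(e)
    }
}
```

A caller would move on to the next candidate port on `Left` and keep the returned socket (or started server) on `Right`.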


if (Utils.isBindCollision(e)) {
  currentAttempt += 1
  latestServerPort = latestServerPort + 1

We could use a randomization strategy that is more effective than += 1, but this should already significantly reduce the chance of a collision.
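
A randomized variant might look like the sketch below. It is an illustration only, not something proposed in this PR: `RandomPorts` and `candidates` are hypothetical names, and drawing from the IANA dynamic port range (49152 to 65535) is an assumption.

```scala
import scala.util.Random

object RandomPorts {
  // Assumed range: the IANA dynamic/private ports.
  val MinPort = 49152
  val MaxPort = 65535

  // Returns `count` distinct candidate ports in random order, so two test
  // runs on the same host are unlikely to walk the same sequence of ports.
  def candidates(count: Int, rng: Random = new Random()): Seq[Int] =
    rng.shuffle((MinPort to MaxPort).toVector).take(count)
}
```

Each candidate would still need to be confirmed by actually binding to it, since a randomly chosen port can be occupied as well.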

ash211 merged commit 3ec9410 into branch-2.1-kubernetes Jul 19, 2017
ash211 deleted the retry-port-test branch July 19, 2017 06:16
foxish pushed a commit that referenced this pull request Jul 24, 2017
…st. (#378)

* Retry binding server to random port in the resource staging server test.

* Break if successful start

* Start server in try block.

* Fix scalastyle

* More rigorous cleanup logic. Increment port numbers.

* Move around more exception logic.

* More exception refactoring.

* Remove whitespace

* Fix test

* Rename variable
puneetloya pushed a commit to puneetloya/spark that referenced this pull request Mar 11, 2019
…st. (apache-spark-on-k8s#378)
