This repository was archived by the owner on Jan 9, 2020. It is now read-only.
forked from apache/spark
Better error message for failure to connect to Nodes #91
Open
Description
I was running on a cluster where the firewall did not allow connections from the submission client to the nodes.
I would have expected an error message saying that the files could not be uploaded.
Instead, the client keeps reporting that the pod is running and then fails unexpectedly:
2017-02-07 22:34:49 INFO LoggingPodStatusWatcher:54 - Application status for foxish-1486535647246 (phase: Running)
2017-02-07 22:34:50 INFO LoggingPodStatusWatcher:54 - Application status for foxish-1486535647246 (phase: Running)
2017-02-07 22:34:51 INFO LoggingPodStatusWatcher:54 - Application status for foxish-1486535647246 (phase: Running)
2017-02-07 22:34:52 INFO LoggingPodStatusWatcher:54 - Application status for foxish-1486535647246 (phase: Running)
2017-02-07 22:34:53 INFO LoggingPodStatusWatcher:54 - Application status for foxish-1486535647246 (phase: Running)
2017-02-07 22:34:54 INFO LoggingPodStatusWatcher:54 - Application status for foxish-1486535647246 (phase: Running)
2017-02-07 22:34:55 INFO LoggingPodStatusWatcher:54 - Application status for foxish-1486535647246 (phase: Running)
2017-02-07 22:34:56 INFO LoggingPodStatusWatcher:54 - Application status for foxish-1486535647246 (phase: Running)
2017-02-07 22:34:57 INFO LoggingPodStatusWatcher:54 - Application status for foxish-1486535647246 (phase: Running)
2017-02-07 22:34:58 INFO LoggingPodStatusWatcher:54 - Application status for foxish-1486535647246 (phase: Running)
2017-02-07 22:34:59 INFO LoggingPodStatusWatcher:54 - Application status for foxish-1486535647246 (phase: Running)
2017-02-07 22:35:00 INFO LoggingPodStatusWatcher:54 - Application status for foxish-1486535647246 (phase: Running)
2017-02-07 22:35:01 INFO LoggingPodStatusWatcher:54 - Application status for foxish-1486535647246 (phase: Running)
2017-02-07 22:35:02 INFO LoggingPodStatusWatcher:54 - Application status for foxish-1486535647246 (phase: Running)
2017-02-07 22:35:03 INFO LoggingPodStatusWatcher:54 - Application status for foxish-1486535647246 (phase: Running)
2017-02-07 22:35:04 INFO LoggingPodStatusWatcher:54 - Application status for foxish-1486535647246 (phase: Running)
2017-02-07 22:35:05 INFO LoggingPodStatusWatcher:54 - Application status for foxish-1486535647246 (phase: Running)
2017-02-07 22:35:06 INFO LoggingPodStatusWatcher:54 - Application status for foxish-1486535647246 (phase: Running)
2017-02-07 22:35:07 INFO LoggingPodStatusWatcher:54 - Application status for foxish-1486535647246 (phase: Running)
2017-02-07 22:35:08 INFO LoggingPodStatusWatcher:54 - Application status for foxish-1486535647246 (phase: Running)
2017-02-07 22:35:09 INFO LoggingPodStatusWatcher:54 - Application status for foxish-1486535647246 (phase: Running)
2017-02-07 22:35:10 ERROR Client:91 - The driver pod with name foxish-1486535647246 in namespace default was not ready in 60 seconds.
Latest phase from the pod is: Running
The pod had no final message.
Driver container last state: Running
Driver container started at: 2017-02-08T06:34:10Z
java.util.concurrent.TimeoutException: Timeout waiting for task.
at org.spark_project.guava.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:276)
at org.spark_project.guava.util.concurrent.AbstractFuture.get(AbstractFuture.java:96)
at org.apache.spark.deploy.kubernetes.Client$$anonfun$run$6$$anonfun$apply$5$$anonfun$apply$7.apply(Client.scala:189)
at org.apache.spark.deploy.kubernetes.Client$$anonfun$run$6$$anonfun$apply$5$$anonfun$apply$7.apply(Client.scala:148)
at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2530)
at org.apache.spark.deploy.kubernetes.Client$$anonfun$run$6$$anonfun$apply$5.apply(Client.scala:148)
at org.apache.spark.deploy.kubernetes.Client$$anonfun$run$6$$anonfun$apply$5.apply(Client.scala:133)
at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2530)
at org.apache.spark.deploy.kubernetes.Client$$anonfun$run$6.apply(Client.scala:133)
at org.apache.spark.deploy.kubernetes.Client$$anonfun$run$6.apply(Client.scala:105)
at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2530)
at org.apache.spark.deploy.kubernetes.Client.run(Client.scala:105)
at org.apache.spark.deploy.kubernetes.Client$.main(Client.scala:682)
at org.apache.spark.deploy.kubernetes.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:750)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:178)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:203)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:117)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
2017-02-07 22:35:10 INFO LoggingPodStatusWatcher:54 - Application status for foxish-1486535647246 (phase: Running)
Exception in thread "OkHttp https://104.154.43.148/api/v1/namespaces/default/pods?labelSelector=spark-app-id%3Dfoxish-1486535647246,spark-app-name%3Dfoxish,spark-driver%3Dfoxish-1486535647246&resourceVersion=3440763&watch=true WebSocket" java.lang.NullPointerException
at org.spark_project.guava.base.Preconditions.checkNotNull(Preconditions.java:191)
at org.spark_project.guava.util.concurrent.AbstractFuture.setException(AbstractFuture.java:201)
at org.spark_project.guava.util.concurrent.SettableFuture.setException(SettableFuture.java:68)
at org.apache.spark.deploy.kubernetes.Client$DriverPodWatcher.onClose(Client.scala:459)
at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$1.onClose(WatchConnectionManager.java:259)
at okhttp3.internal.ws.RealWebSocket.peerClose(RealWebSocket.java:197)
at okhttp3.internal.ws.RealWebSocket.access$200(RealWebSocket.java:38)
at okhttp3.internal.ws.RealWebSocket$1$2.execute(RealWebSocket.java:84)
at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Exception in thread "main" org.apache.spark.SparkException: The driver pod with name foxish-1486535647246 in namespace default was not ready in 60 seconds.
Latest phase from the pod is: Running
The pod had no final message.
Driver container last state: Running
Driver container started at: 2017-02-08T06:34:10Z
at org.apache.spark.deploy.kubernetes.Client$$anonfun$run$6$$anonfun$apply$5$$anonfun$apply$7.apply(Client.scala:196)
at org.apache.spark.deploy.kubernetes.Client$$anonfun$run$6$$anonfun$apply$5$$anonfun$apply$7.apply(Client.scala:148)
at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2530)
at org.apache.spark.deploy.kubernetes.Client$$anonfun$run$6$$anonfun$apply$5.apply(Client.scala:148)
at org.apache.spark.deploy.kubernetes.Client$$anonfun$run$6$$anonfun$apply$5.apply(Client.scala:133)
at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2530)
at org.apache.spark.deploy.kubernetes.Client$$anonfun$run$6.apply(Client.scala:133)
at org.apache.spark.deploy.kubernetes.Client$$anonfun$run$6.apply(Client.scala:105)
at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2530)
at org.apache.spark.deploy.kubernetes.Client.run(Client.scala:105)
at org.apache.spark.deploy.kubernetes.Client$.main(Client.scala:682)
at org.apache.spark.deploy.kubernetes.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:750)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:178)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:203)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:117)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.util.concurrent.TimeoutException: Timeout waiting for task.
at org.spark_project.guava.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:276)
at org.spark_project.guava.util.concurrent.AbstractFuture.get(AbstractFuture.java:96)
at org.apache.spark.deploy.kubernetes.Client$$anonfun$run$6$$anonfun$apply$5$$anonfun$apply$7.apply(Client.scala:189)
... 20 more
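One way to surface the real problem earlier would be a preflight connectivity check before the file upload, so the client fails fast with an actionable message instead of polling until the 60-second timeout. The sketch below is hypothetical, not the project's actual code: the class name `NodeConnectivityCheck` and the wording of the error are assumptions for illustration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical preflight check: probe the node's staging port before
// uploading application files, and raise a clear, firewall-pointing error
// instead of letting the status watcher spin until the timeout.
public class NodeConnectivityCheck {
    public static void assertReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
        } catch (IOException e) {
            throw new RuntimeException(
                "Could not connect to node " + host + ":" + port
                + " to upload application files. Check that your firewall"
                + " allows connections from the submission client to the"
                + " cluster nodes. (" + e.getMessage() + ")", e);
        }
    }

    public static void main(String[] args) {
        // Port 1 on localhost is almost never listening, so this
        // demonstrates the fail-fast error message.
        try {
            assertReachable("127.0.0.1", 1, 500);
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a check like this, a blocked firewall would show up as a single explicit error at submission time rather than twenty-odd `phase: Running` lines followed by a generic `TimeoutException`.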