This repository was archived by the owner on Jan 9, 2020. It is now read-only.
forked from apache/spark
Run example failed with: Executor cannot find driver pod #202
Closed
Description
I ran the example by following the instructions here:
# spark-submit \
> --deploy-mode cluster \
> --class org.apache.spark.examples.SparkPi \
> --master k8s://http://localhost:8080 \
> --kubernetes-namespace dbyin \
> --conf spark.executor.instances=5 \
> --conf spark.app.name=spark-pi \
> --conf spark.kubernetes.driver.docker.image=kubespark/spark-driver:v2.1.0-kubernetes-0.1.0-rc1 \
> --conf spark.kubernetes.executor.docker.image=kubespark/spark-executor:v2.1.0-kubernetes-0.1.0-rc1 \
> examples/jars/spark-examples_2.11-2.1.0-k8s-0.1.0-SNAPSHOT.jar
...
2017-03-24 11:32:42 INFO LoggingPodStatusWatcher:54 - Application status for spark-pi-1490326341602 (phase: Running)
2017-03-24 11:32:43 INFO LoggingPodStatusWatcher:54 - State changed, new state:
pod name: spark-pi-1490326341602
namespace: dbyin
labels: spark-app-id -> spark-pi-1490326341602, spark-app-name -> spark-pi, spark-driver -> spark-pi-1490326341602
pod uid: 7e15865a-1042-11e7-9180-6c0b84be1f24
creation time: 2017-03-24T03:32:22Z
service account name: default
volumes: spark-submission-secret-volume, default-token-rd4n0
node name: 100.x.x.x
start time: 2017-03-24T03:32:22Z
container images: kubespark/spark-driver:v2.1.0-kubernetes-0.1.0-rc1
phase: Running
status: [ContainerStatus(containerID=docker://3dace9a96700be04a49c7f011050494de35bdb6d5b2f6c3d2f09177149cc5849, image=kubespark/spark-driver:v2.1.0-kubernetes-0.1.0-rc1, imageID=docker://sha256:b5dbf02827567eed33fac7f94ca99fd408076db872f49280c2615837068775df, lastState=ContainerState(running=null, terminated=null, waiting=null, additionalProperties={}), name=spark-kubernetes-driver, ready=true, restartCount=0, state=ContainerState(running=ContainerStateRunning(startedAt=2017-03-24T03:32:37Z, additionalProperties={}), terminated=null, waiting=null, additionalProperties={}), additionalProperties={})]
2017-03-24 11:32:43 INFO Client:54 - Driver pod successfully created in Kubernetes cluster.
2017-03-24 11:32:43 INFO Client:54 - Driver service created successfully in Kubernetes.
2017-03-24 11:32:43 INFO Client:54 - Driver endpoints ready to receive application submission
2017-03-24 11:32:43 WARN NodePortUrisDriverServiceManager:66 - Submitting application details, application secret, Kubernetes credentials, and local jars to the cluster over an insecure connection. You should configure SSL to secure this step.
2017-03-24 11:32:43 INFO Client:54 - Submitting local resources to driver pod for application spark-pi-1490326341602 ...
2017-03-24 11:32:43 INFO Client:54 - Successfully submitted local resources and driver configuration to driver pod.
2017-03-24 11:32:43 INFO Client:54 - Finished submitting application to Kubernetes.
2017-03-24 11:32:43 INFO KubernetesResourceCleaner:54 - Deleting 1 registered Kubernetes resources...
2017-03-24 11:32:43 INFO KubernetesResourceCleaner:54 - Deleted 1 registered Kubernetes resources.
2017-03-24 11:32:43 INFO Client:54 - Waiting for application spark-pi-1490326341602 to finish...
2017-03-24 11:32:43 INFO LoggingPodStatusWatcher:54 - Application status for spark-pi-1490326341602 (phase: Running)
2017-03-24 11:32:44 INFO LoggingPodStatusWatcher:54 - Application status for spark-pi-1490326341602 (phase: Running)
2017-03-24 11:32:45 INFO LoggingPodStatusWatcher:54 - Application status for spark-pi-1490326341602 (phase: Running)
2017-03-24 11:32:46 INFO LoggingPodStatusWatcher:54 - Application status for spark-pi-1490326341602 (phase: Running)
2017-03-24 11:32:47 INFO LoggingPodStatusWatcher:54 - State changed, new state:
pod name: spark-pi-1490326341602
namespace: dbyin
labels: spark-app-id -> spark-pi-1490326341602, spark-app-name -> spark-pi, spark-driver -> spark-pi-1490326341602
pod uid: 7e15865a-1042-11e7-9180-6c0b84be1f24
creation time: 2017-03-24T03:32:22Z
service account name: default
volumes: spark-submission-secret-volume, default-token-rd4n0
node name: 100.x.x.x
start time: 2017-03-24T03:32:22Z
container images: kubespark/spark-driver:v2.1.0-kubernetes-0.1.0-rc1
phase: Failed
status: [ContainerStatus(containerID=docker://3dace9a96700be04a49c7f011050494de35bdb6d5b2f6c3d2f09177149cc5849, image=kubespark/spark-driver:v2.1.0-kubernetes-0.1.0-rc1, imageID=docker://sha256:b5dbf02827567eed33fac7f94ca99fd408076db872f49280c2615837068775df, lastState=ContainerState(running=null, terminated=null, waiting=null, additionalProperties={}), name=spark-kubernetes-driver, ready=false, restartCount=0, state=ContainerState(running=null, terminated=ContainerStateTerminated(containerID=docker://3dace9a96700be04a49c7f011050494de35bdb6d5b2f6c3d2f09177149cc5849, exitCode=1, finishedAt=2017-03-24T03:32:46Z, message=null, reason=Error, signal=null, startedAt=2017-03-24T03:32:37Z, additionalProperties={}), waiting=null, additionalProperties={}), additionalProperties={})]
2017-03-24 11:32:47 INFO Client:54 - Application spark-pi-1490326341602 finished.
2017-03-24 11:32:47 INFO ShutdownHookManager:54 - Shutdown hook called
Kube pod's error logs:
...
2017-03-24 03:32:45 INFO ServerConnector:266 - Started ServerConnector@4409e975{HTTP/1.1}{0.0.0.0:4040}
2017-03-24 03:32:45 INFO Server:379 - Started @1500ms
2017-03-24 03:32:45 INFO Utils:54 - Successfully started service 'SparkUI' on port 4040.
2017-03-24 03:32:45 INFO SparkUI:54 - Bound SparkUI to 0.0.0.0, and started at http://192.168.16.30:4040
2017-03-24 03:32:45 INFO SparkContext:54 - Added JAR /tmp/spark-10597f49-5dc5-4b6b-a8c3-49c7ae111ee7/spark-examples_2.11-2.1.0-k8s-0.1.0-SNAPSHOT.jar at spark://192.168.16.30:7078/jars/spark-examples_2.11-2.1.0-k8s-0.1.0-SNAPSHOT.jar with timestamp 1490326365327
2017-03-24 03:32:45 ERROR KubernetesClusterSchedulerBackend:91 - Executor cannot find driver pod.
io.fabric8.kubernetes.client.KubernetesClientException: An error has occurred.
at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:61)
at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:52)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:199)
at org.apache.spark.scheduler.cluster.kubernetes.KubernetesClusterSchedulerBackend.liftedTree1$1(KubernetesClusterSchedulerBackend.scala:84)
at org.apache.spark.scheduler.cluster.kubernetes.KubernetesClusterSchedulerBackend.<init>(KubernetesClusterSchedulerBackend.scala:82)
at org.apache.spark.scheduler.cluster.kubernetes.KubernetesClusterManager.createSchedulerBackend(KubernetesClusterManager.scala:34)
at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2554)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:501)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2313)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:868)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:860)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:860)
at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:31)
at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
Caused by: javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1949)
at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:302)
at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:296)
at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1514)
at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:216)
at sun.security.ssl.Handshaker.processLoop(Handshaker.java:1026)
at sun.security.ssl.Handshaker.process_record(Handshaker.java:961)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1062)
at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1375)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1403)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1387)
at okhttp3.internal.connection.RealConnection.connectTls(RealConnection.java:267)
at okhttp3.internal.connection.RealConnection.establishProtocol(RealConnection.java:237)
at okhttp3.internal.connection.RealConnection.connect(RealConnection.java:148)
at okhttp3.internal.connection.StreamAllocation.findConnection(StreamAllocation.java:186)
at okhttp3.internal.connection.StreamAllocation.findHealthyConnection(StreamAllocation.java:121)
at okhttp3.internal.connection.StreamAllocation.newStream(StreamAllocation.java:100)
at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:42)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:120)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
at io.fabric8.kubernetes.client.utils.HttpClientUtils$2.intercept(HttpClientUtils.java:93)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:179)
at okhttp3.RealCall.execute(RealCall.java:63)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:237)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:232)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:228)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:711)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:192)
... 12 more
Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:387)
at sun.security.validator.PKIXValidator.engineValidate(PKIXValidator.java:292)
at sun.security.validator.Validator.validate(Validator.java:260)
at sun.security.ssl.X509TrustManagerImpl.validate(X509TrustManagerImpl.java:324)
at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:229)
at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:124)
at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1496)
... 46 more
Caused by: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at sun.security.provider.certpath.SunCertPathBuilder.build(SunCertPathBuilder.java:141)
at sun.security.provider.certpath.SunCertPathBuilder.engineBuild(SunCertPathBuilder.java:126)
at java.security.cert.CertPathBuilder.build(CertPathBuilder.java:280)
at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:382)
... 52 more
2017-03-24 03:32:45 ERROR SparkContext:91 - Error initializing SparkContext.
org.apache.spark.SparkException: Executor cannot find driver pod
at org.apache.spark.scheduler.cluster.kubernetes.KubernetesClusterSchedulerBackend.liftedTree1$1(KubernetesClusterSchedulerBackend.scala:88)
at org.apache.spark.scheduler.cluster.kubernetes.KubernetesClusterSchedulerBackend.<init>(KubernetesClusterSchedulerBackend.scala:82)
at org.apache.spark.scheduler.cluster.kubernetes.KubernetesClusterManager.createSchedulerBackend(KubernetesClusterManager.scala:34)
at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2554)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:501)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2313)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:868)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:860)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:860)
at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:31)
at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: An error has occurred.
at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:61)
at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:52)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:199)
at org.apache.spark.scheduler.cluster.kubernetes.KubernetesClusterSchedulerBackend.liftedTree1$1(KubernetesClusterSchedulerBackend.scala:84)
... 11 more
Caused by: javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1949)
at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:302)
at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:296)
at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1514)
at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:216)
at sun.security.ssl.Handshaker.processLoop(Handshaker.java:1026)
at sun.security.ssl.Handshaker.process_record(Handshaker.java:961)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1062)
at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1375)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1403)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1387)
at okhttp3.internal.connection.RealConnection.connectTls(RealConnection.java:267)
at okhttp3.internal.connection.RealConnection.establishProtocol(RealConnection.java:237)
at okhttp3.internal.connection.RealConnection.connect(RealConnection.java:148)
at okhttp3.internal.connection.StreamAllocation.findConnection(StreamAllocation.java:186)
at okhttp3.internal.connection.StreamAllocation.findHealthyConnection(StreamAllocation.java:121)
at okhttp3.internal.connection.StreamAllocation.newStream(StreamAllocation.java:100)
at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:42)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:120)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
at io.fabric8.kubernetes.client.utils.HttpClientUtils$2.intercept(HttpClientUtils.java:93)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:179)
at okhttp3.RealCall.execute(RealCall.java:63)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:237)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:232)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:228)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:711)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:192)
... 12 more
Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:387)
at sun.security.validator.PKIXValidator.engineValidate(PKIXValidator.java:292)
at sun.security.validator.Validator.validate(Validator.java:260)
at sun.security.ssl.X509TrustManagerImpl.validate(X509TrustManagerImpl.java:324)
at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:229)
at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:124)
at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1496)
... 46 more
Caused by: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at sun.security.provider.certpath.SunCertPathBuilder.build(SunCertPathBuilder.java:141)
at sun.security.provider.certpath.SunCertPathBuilder.engineBuild(SunCertPathBuilder.java:126)
at java.security.cert.CertPathBuilder.build(CertPathBuilder.java:280)
at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:382)
... 52 more