Description
Hi everybody,
As @ryanlovett asked, I am opening this issue here. It is related to jupyterhub/zero-to-jupyterhub-k8s#1030.
The problem is as follows:
After starting PySpark, I am not able to access the Spark UI. The request results in a JupyterHub 404 error.
Here is (hopefully) all the relevant information:
- I create a new user image from the jupyter/pyspark-notebook image.
- The Dockerfile for this image contains:
FROM jupyter/pyspark-notebook:5b2160dfd919
RUN pip install nbserverproxy
RUN jupyter serverextension enable --py nbserverproxy
USER root
RUN echo "$NB_USER ALL=(ALL) NOPASSWD:ALL" > /etc/sudoers.d/notebook
USER $NB_USER
- I create the SparkContext() in the pod that was created from this image, which gives me the link to the UI.
- The SparkContext() is created with the following config:
import os
from pyspark import SparkConf
conf = SparkConf()
conf.setMaster('k8s://https://'+ os.environ['KUBERNETES_SERVICE_HOST'] +':443')
conf.set('spark.kubernetes.container.image', 'idalab/spark-py:spark')
conf.set('spark.submit.deployMode', 'client')
conf.set('spark.executor.instances', '2')
conf.setAppName('pyspark-shell')
conf.set('spark.driver.host', '10.16.205.42 ')
os.environ['PYSPARK_PYTHON'] = 'python3'
os.environ['PYSPARK_DRIVER_PYTHON'] = 'python3'
- The link created by Spark is obviously not accessible on the hub, as it points to <POD_IP>:4040.
- I try to access the UI via .../username/proxy/4040 and .../username/proxy/4040/. Both fail and lead to a JupyterHub 404.
- Other ports are accessible via this method, so I assume nbserverproxy is working correctly.
- This is the output of netstat -pl:
Proto Recv-Q Send-Q Local Address        Foreign Address  State   PID/Program name
tcp        0      0 localhost:54695      0.0.0.0:*        LISTEN  23/python
tcp        0      0 localhost:33896      0.0.0.0:*        LISTEN  23/python
tcp        0      0 localhost:34577      0.0.0.0:*        LISTEN  23/python
tcp        0      0 localhost:52211      0.0.0.0:*        LISTEN  23/python
tcp        0      0 0.0.0.0:8888         0.0.0.0:*        LISTEN  7/python
tcp        0      0 localhost:39388      0.0.0.0:*        LISTEN  23/python
tcp        0      0 localhost:39971      0.0.0.0:*        LISTEN  23/python
tcp        0      0 localhost:32867      0.0.0.0:*        LISTEN  23/python
tcp6       0      0 jupyter-hagen:43878  [::]:*           LISTEN  45/java
tcp6       0      0 [::]:4040            [::]:*           LISTEN  45/java
tcp6       0      0 localhost:32816      [::]:*           LISTEN  45/java
tcp6       0      0 jupyter-hagen:41793  [::]:*           LISTEN  45/java
One can see that the java processes are listed in a different format because they listen on tcp6.
- To check whether this is the cause, I set the environment variable '_JAVA_OPTIONS' to "-Djava.net.preferIPv4Stack=true".
- This results in the following output but does not resolve the problem:
Proto Recv-Q Send-Q Local Address        Foreign Address  State   PID/Program name
tcp        0      0 localhost:54695      0.0.0.0:*        LISTEN  456/python
tcp        0      0 0.0.0.0:4040         0.0.0.0:*        LISTEN  475/java
tcp        0      0 localhost:33896      0.0.0.0:*        LISTEN  456/python
tcp        0      0 localhost:34990      0.0.0.0:*        LISTEN  475/java
tcp        0      0 localhost:36079      0.0.0.0:*        LISTEN  456/python
tcp        0      0 jupyter-hagen:35119  0.0.0.0:*        LISTEN  475/java
tcp        0      0 localhost:34577      0.0.0.0:*        LISTEN  456/python
tcp        0      0 jupyter-hagen:42195  0.0.0.0:*        LISTEN  475/java
tcp        0      0 localhost:34836      0.0.0.0:*        LISTEN  456/python
tcp        0      0 0.0.0.0:8888         0.0.0.0:*        LISTEN  7/python
tcp        0      0 localhost:39971      0.0.0.0:*        LISTEN  456/python
tcp        0      0 localhost:32867      0.0.0.0:*        LISTEN  456/python
- I checked whether the UI is generally accessible by running a local version of the user image on my PC and forwarding the port. That works fine!
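A similar reachability check could also be run from a notebook cell inside the pod, to see whether port 4040 accepts TCP connections on the address the proxy would use. The following is only a minimal sketch: the helper name `can_connect` is hypothetical, and the assumption that the proxy connects to `localhost` or `127.0.0.1` has not been verified against nbserverproxy.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout.

    Hypothetical helper for debugging; it only tests that something accepts
    the connection, not that the Spark UI answers HTTP requests correctly.
    """
    try:
        # create_connection resolves the host and tries each returned address
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures
        return False
```

For example, calling `can_connect("127.0.0.1", 4040)` and `can_connect("localhost", 4040)` in the pod would show whether the Spark UI port is reachable over the loopback interface at all.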
My user image is available on Docker Hub at idalab/spark-user:1.0.2, so it should be easy to use for debugging if necessary.
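For anyone debugging with that image, comparing listeners before and after a config change may be easier with a small parser for the netstat output shown above. This is a rough sketch that assumes the column layout of `netstat -pl` as pasted in this issue; the names `Listener`, `parse_listeners` and `listeners_on` are made up for illustration.

```python
from typing import List, NamedTuple


class Listener(NamedTuple):
    proto: str      # e.g. "tcp" or "tcp6"
    address: str    # local bind address, e.g. "0.0.0.0" or "[::]"
    port: int       # local port
    program: str    # "PID/Program name" column, e.g. "45/java"


def parse_listeners(netstat_output: str) -> List[Listener]:
    """Extract LISTEN sockets from text in the format pasted in this issue."""
    listeners = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected fields: Proto Recv-Q Send-Q Local Foreign State PID/Program
        if len(fields) < 7 or fields[5] != "LISTEN":
            continue  # skips the header row and anything that is not listening
        # Split on the last colon so "[::]:4040" yields ("[::]", "4040")
        address, _, port = fields[3].rpartition(":")
        if not port.isdigit():
            continue  # netstat may print service names instead of numbers
        listeners.append(Listener(fields[0], address, int(port), fields[6]))
    return listeners


def listeners_on(port: int, netstat_output: str) -> List[Listener]:
    """Return only the listeners bound to the given port."""
    return [entry for entry in parse_listeners(netstat_output) if entry.port == port]
```

Passing the two outputs above to `listeners_on(4040, ...)` would make it easy to see that the Spark UI moves from a tcp6 bind on `[::]` to a tcp bind on `0.0.0.0` once `_JAVA_OPTIONS` is set.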
Thank you for your help.