This repository was archived by the owner on Jan 9, 2020. It is now read-only.

Conversation

@mccheah commented Feb 15, 2017

Closes #70.

@mccheah (Author) left a comment

@foxish The SparkUI changes are not intuitive. There might be a better way to go about fixing the UI.

Normally, when using any other deploy mode, the Spark UI is hosted on the driver with an empty base path (that is, /). However, when using an Ingress, the full path sent to the proxy may also be what is requested of the driver service. This means the driver service will always be contacted at a base path of http://<service-ip>:<service-port>/<spark-kubernetes-id>/<spark-ui>/. Unfortunately, the prior implementation of the Spark UI expects to be contacted only at http://<service-ip>:<service-port>.
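For illustration, the path composition described above could be sketched as follows (the helper names and values here are hypothetical, not the PR's actual code):

```scala
// Sketch of how the proxied UI address is composed when an Ingress fronts
// the driver service. All names are illustrative assumptions.
object UiPathSketch {
  def uiBasePath(kubernetesAppId: String, uiPathComponent: String): String =
    s"/$kubernetesAppId/$uiPathComponent"

  // The Ingress forwards the full request path to the backend as-is, so the
  // driver service is effectively contacted at this URL.
  def proxiedUiUrl(serviceIp: String, servicePort: Int, basePath: String): String =
    s"http://$serviceIp:$servicePort$basePath/"
}
```

For example, `proxiedUiUrl("10.0.0.5", 4040, uiBasePath("spark-1234", "ui"))` yields `http://10.0.0.5:4040/spark-1234/ui/`, rather than the empty base path the UI previously assumed.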

There are also open questions about how to handle the various ways an Ingress controller may be set up in production (SSL termination, etc.).

  (null, SSLContext.getDefault)
}
val maxRetriesPerServer = if (useIngress) {
  SUBMISSION_CLIENT_RETRIES_INGRESS
Author

@foxish is there perhaps a better way to do this? It turns out that the Ingress resource takes longer to become ready than the NodePort endpoints. Furthermore, the Watch on the Ingress doesn't reliably report when the Ingress has added the endpoint.
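The fallback to polling described above could be sketched generically like this (a simplified helper under assumed names; the real code would poll the Ingress's load-balancer status via the Kubernetes API client rather than a plain predicate):

```scala
object ReadinessPollSketch {
  // Repeatedly evaluate a readiness predicate, sleeping between attempts,
  // instead of relying solely on a Watch event that may never arrive.
  @annotation.tailrec
  def pollUntilReady(attemptsLeft: Int, intervalMillis: Long)(isReady: () => Boolean): Boolean =
    if (isReady()) true
    else if (attemptsLeft <= 1) false
    else {
      Thread.sleep(intervalMillis)
      pollUntilReady(attemptsLeft - 1, intervalMillis)(isReady)
    }
}
```

A Watch can still be used as a fast path, with a poll like this as the bound on how long submission waits for the Ingress endpoint to appear.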

@mccheah (Author) commented Feb 15, 2017

@ash211

val defaultErrorDecoder = new ErrorDecoder.Default
val alwaysRetryErrorDecoder = new ErrorDecoder {
  override def decode(methodKey: String, response: Response): Exception = {
    defaultErrorDecoder.decode(methodKey, response) match {
Author

The Ingress code path may return 404 errors while the load balancer (Nginx, in the case of Minikube) has not yet set up the proxy. These kinds of errors aren't considered retryable by default.

<td><code>spark.kubernetes.driver.exposeIngress</code></td>
<td>false</td>
<td>
When initially contacting the driver, use an Ingress when the submitting client passes application dependencies to
Member

typo 'UI.me'

"changeit")
keyStoreFile = keyStore
trustStoreFile = trustStore
minikubeKubernetesClient.services().inNamespace("kube-system").createOrReplaceWithNew()
Member

We don't generally create non-system components in the kube-system namespace but I guess it is okay in the test.

Author

Hm, is there a way for a service to proxy to pods that are not in its namespace?

  .endMetadata()
  .withNewSpec()
-   .withType("NodePort")
+   .withType(if (useIngress) "ClusterIP" else "NodePort")
Member

This works with some but not all ingress controllers. See https://github.com/kubernetes/ingress/blob/master/docs/faq/gce.md#i-created-an-ingress-and-nothing-happens-now-what and kubernetes/kubernetes#26508.

I think we'd want the user to get a cluster-ip type here, with the default still being NodePort.

Author

So to clarify, we should use NodePort to catch all of the use cases?

Member

Yes, NodePort would definitely capture some use cases. But folks may have on-prem ingress controllers which use ClusterIP. I'm thinking we should support both of these.

Author

Huh... how would we support both of these? Would we need to make this setting configurable?

@foxish (Member) Feb 21, 2017

Making the service type configurable on its own seems like a good idea, considering #129 also. That way, we could isolate the concerns of having/not having an ingress, and the service type being used. It would be up to the user to set a service type appropriately (nodeport/clusterIP) if they are using an ingress. Do you have ideas of how we can manage the complexity from the proliferation of commandline options configuring services? I'm not so sure that there is one setting that fits a majority of the use cases.
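Making the service type configurable as suggested above might look something like this sketch (the conf key, the validation, and the NodePort default are assumptions for illustration, not part of this PR):

```scala
object ServiceTypeSketch {
  // Resolve the driver service type from an (assumed) user-facing setting
  // such as spark.kubernetes.driver.serviceType, defaulting to NodePort so
  // existing deployments keep working. Users running an ingress controller
  // that requires ClusterIP backends would set it explicitly.
  private val supportedTypes = Set("NodePort", "ClusterIP")

  def resolveServiceType(configured: Option[String]): String =
    configured.map(_.trim).filter(_.nonEmpty) match {
      case Some(t) if supportedTypes.contains(t) => t
      case Some(other) =>
        throw new IllegalArgumentException(s"Unsupported driver service type: $other")
      case None => "NodePort"
    }
}
```

This isolates the service-type decision from the ingress on/off decision, which is the separation of concerns the comment above describes.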

    .booleanConf
    .createWithDefault(false)

  private[spark] val INGRESS_BASE_PATH =
@foxish (Member) Feb 16, 2017

Is this supposed to be the external hostname/IP address?

Author

I believe so. I was wondering if we could rely on the LoadBalancerStatus we get back from querying the API server for the Ingress status.

Member

I think it should be possible to do that, though I don't know how much we can count on different ingress controllers behaving the same way. I think ingress support is very useful to have, though. I'm swamped today; I'll get back to this item tomorrow.
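Deriving the address from LoadBalancerStatus as discussed above could be sketched like this. Plain Options stand in for the Kubernetes client's model classes, which expose an ip and a hostname per load-balancer ingress entry; the case class and helper are assumptions for illustration:

```scala
object LoadBalancerStatusSketch {
  // Stand-in for the ingress entries published in a Kubernetes
  // LoadBalancerStatus; real code would read these from the API client.
  case class LoadBalancerEntry(ip: Option[String], hostname: Option[String])

  // Prefer the IP of the first entry that has one, falling back to the
  // hostname; None means the controller hasn't published an address yet.
  def externalAddress(entries: Seq[LoadBalancerEntry]): Option[String] =
    entries.flatMap(e => e.ip.orElse(e.hostname)).headOption
}
```

The caveat in the comment above still applies: different ingress controllers populate this status at different times (or not at all), so a user-supplied base path may still be needed as a fallback.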

Author

Let's leave the base path explicitly configured (rather than derived) then, at least for now?

<tr>
<td><code>spark.kubernetes.driver.ingressBasePath</code></td>
<td>(none)</td>
<td>

Shall we give an example here? Otherwise the name ingressBasePath would lead people to think it's only a path, and not something like [host]:[port].

sparkConf.setIfMissing("spark.driver.port", DEFAULT_DRIVER_PORT.toString)
sparkConf.setIfMissing("spark.blockmanager.port",
  DEFAULT_BLOCKMANAGER_PORT.toString)
sparkConf.set("spark.ui.basePath", s"/$kubernetesAppId/$UI_PATH_COMPONENT")

I guess this should only be set when exposeIngress is true; otherwise it would affect the current non-ingress deploy.

I just ran the integration tests, and they failed at the first case, which doesn't have exposeIngress set; the test log shows the Spark API returns 404.
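The guard suggested in this comment could be sketched as below. A mutable Map stands in for SparkConf, and the helper name is hypothetical:

```scala
import scala.collection.mutable

object BasePathGuardSketch {
  // Only set the UI base path when ingress exposure is enabled, so the
  // non-ingress deploy keeps the default "/" base path.
  def applyUiBasePath(
      conf: mutable.Map[String, String],
      exposeIngress: Boolean,
      appId: String,
      uiPath: String): Unit =
    if (exposeIngress) {
      conf("spark.ui.basePath") = s"/$appId/$uiPath"
    }
}
```

(As the follow-up comments note, the PR author instead leaned toward setting the base path unconditionally so behavior stays consistent across modes; this sketch just illustrates the alternative proposed here.)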

Author

I'm surprised the test failed - I changed KubernetesSuite#getSparkMetricsService in anticipation of the change in the base path. The tests pass locally for me.

Author

The more I think about it, the more I think the base path should be consistent between Ingress mode and non-Ingress mode. It would be nice to avoid a base path entirely, but by default the Ingress takes the path as-is and forwards that exact path to the backend. Some ingress controllers support path rewriting, but not all of them do, so we have to go with the lowest common denominator, unfortunately.


> I'm surprised the test failed - I changed KubernetesSuite#getSparkMetricsService in anticipation of the change in the base path. The tests pass locally for me.

OK, I'll take another test run later.

def validate(): KubernetesSparkRestServerArguments = {
  require(host.isDefined, "Hostname not set via --hostname.")
  require(port.isDefined, "Port not set via --port")
  require(basePath.isDefined, "Base path not defined via --base-path")

--base-path should also be optional, depending on whether ingress is used.

Author

I think we want the base path to be consistent for both this and the Spark UI, regardless of whether the Ingress is used. That reduces the number of end results that could come up depending on the configuration. For example, someone may migrate their application to submit through Ingress instead of NodePort, and they shouldn't have to rewrite the base path for their Spark UI.

@mccheah (Author) commented Feb 21, 2017

@foxish thoughts on the change overall, and some of my follow-up remarks?

@foxish (Member) commented Feb 21, 2017

I can't get this to work on GKE or GCE yet, for reasons I'm still investigating. I think the expectation that spark-on-k8s will seamlessly work with ingress may lead to a lot of inconsistent experiences, mainly because ingress is handled a little differently in different clusters. I don't think the user should be providing the "base path"; we should derive it from the resource.

A different issue: Setting the owner references also seems to not be happening, although the code does attempt it.

@foxish (Member) commented Feb 22, 2017

Another thing to note: the kubernetes.io/ingress.class annotation needs to be set to target a particular ingress controller when there are multiple ingress controllers running.

@mccheah (Author) commented Feb 23, 2017

I'm closing this based on discussion of an alternate approach from #70.

@mccheah mccheah closed this Feb 23, 2017
@ash211 ash211 deleted the ingress-support branch February 23, 2017 21:20
ifilonenko pushed a commit to ifilonenko/spark that referenced this pull request Feb 25, 2019
4 participants