src/jekyll/running-on-kubernetes.md
13 additions & 3 deletions
@@ -294,7 +294,11 @@ the command may then look like the following:
#### Driver
-The Spark driver pod uses a Kubernetes service account to access the Kubernetes API server to create and watch executor pods. The service account used by the driver pod must have the appropriate permission for the driver to be able to do its work. Specifically, at minimum, the service account must have the [`edit`](https://kubernetes.io/docs/admin/authorization/rbac/#user-facing-roles) [`Role` or `ClusterRole`](https://kubernetes.io/docs/admin/authorization/rbac/#role-and-clusterrole) granted. By default, the driver pod is automatically assigned the `default` service account in the same namespace if no service account is specified when the pod gets created. Depending on the version and setup of Kubernetes deployed, this `default` service account may or may not have the `edit` role granted under the default Kubernetes [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/) policies. Sometimes users may need to specify a custom service account that has the right role granted. Spark on Kubernetes supports specifying a custom service account to be used by the driver pod through the configuration property `spark.kubernetes.authenticate.driver.serviceAccountName=<service account name>`. For example to make the driver pod to use the `spark` service account, a user simply adds the following option to the `spark-submit` command:
+The Spark driver pod uses a Kubernetes service account to access the Kubernetes API server to create and watch executor pods. The service account used by the driver pod must have the appropriate permission for the driver to be able to do its work.
+
+Specifically, at minimum, the service account must have the [`edit`](https://kubernetes.io/docs/admin/authorization/rbac/#user-facing-roles) [`Role` or `ClusterRole`](https://kubernetes.io/docs/admin/authorization/rbac/#role-and-clusterrole) granted. By default, the driver pod is automatically assigned the `default` service account in the namespace specified by `--kubernetes-namespace`, if no service account is specified when the pod gets created.
+
+Depending on the version and setup of Kubernetes deployed, this `default` service account may or may not have the `edit` role granted under the default Kubernetes [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/) policies. Sometimes users may need to specify a custom service account that has the right role granted. Spark on Kubernetes supports specifying a custom service account to be used by the driver pod through the configuration property `spark.kubernetes.authenticate.driver.serviceAccountName=<service account name>`. For example, to make the driver pod use the `spark` service account, a user simply adds the following option to the `spark-submit` command:
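The hunk ends before the option that sentence introduces. As an illustration of the role requirement stated earlier in the hunk, the following is a minimal sketch of manifests that create a `spark` service account and grant it the built-in `edit` `ClusterRole` within one namespace. The `default` namespace and the binding name `spark-edit` are assumptions chosen for illustration, and the `rbac.authorization.k8s.io/v1` API version assumes a cluster new enough to serve it (older clusters may need `v1beta1`).

```yaml
# Sketch only: namespace and binding name are hypothetical.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: spark            # service account name used in the text
  namespace: default     # assumed namespace
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: spark-edit       # hypothetical binding name
  namespace: default
subjects:
  - kind: ServiceAccount
    name: spark
    namespace: default
roleRef:
  kind: ClusterRole      # built-in user-facing role
  name: edit
  apiGroup: rbac.authorization.k8s.io
```

Because a `RoleBinding` is namespaced, this grants `edit` only inside that namespace, even though it refers to a `ClusterRole`.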
@@ -316,13 +320,19 @@ Note that a `Role` can only be used to grant access to resources (like pods) wit
#### Resource Staging Server
-The Resource Staging Server (RSS) watches Spark driver pods to detect completed Spark applications so it knows when to safely delete resource bundles of the applications. When running as a pod in the same Kubernetes cluster as the Spark applications, by default (`spark.kubernetes.authenticate.resourceStagingServer.useServiceAccountCredentials` defaults to `true`), the RSS uses the default Kubernetes service account token located at `/var/run/secrets/kubernetes.io/serviceaccount/token` and the CA certificate located at `/var/run/secrets/kubernetes.io/serviceaccount/ca.crt`. When running outside the Kubernetes cluster or when `spark.kubernetes.authenticate.resourceStagingServer.useServiceAccountCredentials` is set to `false`, the credentials for authenticating with the Kubernetes API server can be specified using other configuration properties as documented in [Spark Properties](#spark-properties). Regardless of which credential is used, the credential must allow the RSS to view pods in any namespace.
+The Resource Staging Server (RSS) watches Spark driver pods to detect completed Spark applications so it knows when to safely delete resource bundles of the applications. When running as a pod in the same Kubernetes cluster as the Spark applications, by default (`spark.kubernetes.authenticate.resourceStagingServer.useServiceAccountCredentials` defaults to `true`), the RSS uses the default Kubernetes service account token located at `/var/run/secrets/kubernetes.io/serviceaccount/token` and the CA certificate located at `/var/run/secrets/kubernetes.io/serviceaccount/ca.crt`.
+
+When running outside the Kubernetes cluster or when `spark.kubernetes.authenticate.resourceStagingServer.useServiceAccountCredentials` is set to `false`, the credentials for authenticating with the Kubernetes API server can be specified using other configuration properties as documented in [Spark Properties](#spark-properties). Regardless of which credential is used, the credential must allow the RSS to view pods in any namespace.
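The requirement that the RSS credential can view pods in any namespace could be met with a cluster-wide read-only role, sketched below. This is a minimal illustration, not the project's shipped configuration: the names `spark-pod-viewer`, `spark-rss-pod-viewer`, and `spark-rss`, and the `default` namespace, are all hypothetical, and the `rbac.authorization.k8s.io/v1` API version assumes a cluster that serves it.

```yaml
# Sketch only: every name below is hypothetical.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: spark-pod-viewer
rules:
  - apiGroups: [""]                  # core API group, where pods live
    resources: ["pods"]
    verbs: ["get", "list", "watch"]  # read-only access, including watches
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding             # cluster-wide, so it covers all namespaces
metadata:
  name: spark-rss-pod-viewer
subjects:
  - kind: ServiceAccount
    name: spark-rss                  # assumed service account of the RSS pod
    namespace: default
roleRef:
  kind: ClusterRole
  name: spark-pod-viewer
  apiGroup: rbac.authorization.k8s.io
```

A `ClusterRoleBinding` is used here rather than a `RoleBinding` because a namespaced binding would only permit viewing pods in a single namespace.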
#### Shuffle Service
The shuffle service runs as a Kubernetes `DaemonSet`. Each pod of the shuffle service watches Spark driver pods so at minimum it needs a role that allows it to view pods. Additionally, the shuffle service uses a [`hostPath`](https://kubernetes.io/docs/concepts/storage/volumes/#hostpath) volume for shuffle data. Writing to a `hostPath` volume requires either that the shuffle service process runs as root in a [privileged](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/) container or that the user is able to modify the file permissions on the host to be able to write to a `hostPath` volume. Even in the first case, a pod may or may not be able to use a `hostPath` volume, depending on the types of volumes usable in the pod, which are controlled by `PodSecurityPolicy`.
-In Kubernetes 1.5 and newer, one can use `PodSecurityPolicy` to control access to privileged containers based on user role and groups. To enable `hostPath` volume using a `PodSecurityPolicy`, a user needs to create a new or use an existing `PodSecurityPolicy` that has `hostPath` listed in the `.spec.volumes` field as this [example](https://github.com/kubernetes/examples/blob/master/staging/podsecuritypolicy/rbac/README.md#creating-the-policies-roles-and-bindings) shows. Then the user needs to create a `Role` (or a `ClusterRole` if necessary) that is allowed to `use` the `PodSecurityPolicy`. Finally, the user needs a `RoleBinding` (or `ClusterRoleBinding` in case of a `ClusterRole`) to grant the `Role` (or `ClusterRole`) to the service account used by the shuffle service pods. For more details on how to use `PodSecurityPolicy` and RBAC to control access to `PodSecurityPolicy`, please refer to this [doc](https://github.com/kubernetes/examples/blob/master/staging/podsecuritypolicy/rbac/README.md). To specify a custom service account for the shuffle service pods, add the following to the pod template in the shuffle service `DaemonSet` defined in `conf/kubernetes-shuffle-service.yaml`:
+In Kubernetes 1.5 and newer, one can use `PodSecurityPolicy` to control access to privileged containers based on user role and groups. To enable `hostPath` volume using a `PodSecurityPolicy`, a user needs to create a new or use an existing `PodSecurityPolicy` that has `hostPath` listed in the `.spec.volumes` field as this [example](https://github.com/kubernetes/examples/blob/master/staging/podsecuritypolicy/rbac/README.md#creating-the-policies-roles-and-bindings) shows.
+
+Then the user needs to create a `Role` (or a `ClusterRole` if necessary) that is allowed to `use` the `PodSecurityPolicy`. Finally, the user needs a `RoleBinding` (or `ClusterRoleBinding` in case of a `ClusterRole`) to grant the `Role` (or `ClusterRole`) to the service account used by the shuffle service pods. For more details on how to use `PodSecurityPolicy` and RBAC to control access to `PodSecurityPolicy`, please refer to this [doc](https://github.com/kubernetes/examples/blob/master/staging/podsecuritypolicy/rbac/README.md).
+
+To specify a custom service account for the shuffle service pods, add the following to the pod template in the shuffle service `DaemonSet` defined in `conf/kubernetes-shuffle-service.yaml`:
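The pod template snippet that sentence introduces falls outside this hunk. For the three steps described in the added paragraphs (policy, role, binding), a minimal sketch might look like the following. The names `spark-shuffle-hostpath`, `spark-shuffle-psp-user`, and `spark-shuffle`, and the `default` namespace, are hypothetical. The `policy/v1beta1` API group is an assumption that fits clusters from roughly Kubernetes 1.10 onward; older clusters served `PodSecurityPolicy` under `extensions/v1beta1`, and the resource was removed entirely in Kubernetes 1.25.

```yaml
# Sketch only: every name below is hypothetical.
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: spark-shuffle-hostpath
spec:
  privileged: true          # allow privileged containers
  volumes:
    - hostPath              # the volume type the shuffle service needs
    - secret                # service account token mount
  seLinux:
    rule: RunAsAny
  runAsUser:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: spark-shuffle-psp-user
  namespace: default
rules:
  - apiGroups: ["policy"]
    resources: ["podsecuritypolicies"]
    resourceNames: ["spark-shuffle-hostpath"]
    verbs: ["use"]          # permission to run pods under this policy
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: spark-shuffle-psp-user
  namespace: default
subjects:
  - kind: ServiceAccount
    name: spark-shuffle     # assumed service account of the shuffle pods
    namespace: default
roleRef:
  kind: Role
  name: spark-shuffle-psp-user
  apiGroup: rbac.authorization.k8s.io
```

Restricting the `Role` with `resourceNames` keeps the service account from using any other policy defined in the cluster.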