Commit 455aa88

committed
Addressed more comments
1 parent 029ef05 commit 455aa88

1 file changed, +6 -4 lines changed

src/jekyll/running-on-kubernetes.md

Lines changed: 6 additions & 4 deletions
Original file line number | Diff line number | Diff line change
@@ -292,13 +292,15 @@ the command may then look like the following:
292292

293293
### Configuring Kubernetes Roles and Service Accounts
294294

295+
In Kubernetes clusters with [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/) enabled, users can configure Kubernetes RBAC roles and service accounts used by the various Spark on Kubernetes components to access the Kubernetes API server.
296+
295297
#### Driver
296298

297299
The Spark driver pod uses a Kubernetes service account to access the Kubernetes API server to create and watch executor pods. The service account used by the driver pod must have the appropriate permission for the driver to be able to do its work.
298300

299-
Specifically, at minimum, the service account must have the [`edit`](https://kubernetes.io/docs/admin/authorization/rbac/#user-facing-roles) [`Role` or `ClusterRole`](https://kubernetes.io/docs/admin/authorization/rbac/#role-and-clusterrole) granted. By default, the driver pod is automatically assigned the `default` service account in the namespace specified by `--kubernetes-namespace`, if no service account is specified when the pod gets created.
301+
At minimum, the service account must be granted a [`Role` or `ClusterRole`](https://kubernetes.io/docs/admin/authorization/rbac/#role-and-clusterrole) that allows driver pods to create pods and services. If no service account is specified when the pod is created, the driver pod is automatically assigned the `default` service account in the namespace specified by `--kubernetes-namespace`.
300302

301-
Depending on the version and setup of Kubernetes deployed, this `default` service account may or may not have the `edit` role granted under the default Kubernetes [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/) policies. Sometimes users may need to specify a custom service account that has the right role granted. Spark on Kubernetes supports specifying a custom service account to be used by the driver pod through the configuration property `spark.kubernetes.authenticate.driver.serviceAccountName=<service account name>`. For example to make the driver pod to use the `spark` service account, a user simply adds the following option to the `spark-submit` command:
303+
Depending on the version and setup of Kubernetes deployed, this `default` service account may or may not have a role that allows driver pods to create pods and services under the default Kubernetes [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/) policies. If it does not, users need to specify a custom service account that has the right role granted. Spark on Kubernetes supports specifying a custom service account for the driver pod through the configuration property `spark.kubernetes.authenticate.driver.serviceAccountName=<service account name>`. For example, to make the driver pod use the `spark` service account, add the following option to the `spark-submit` command:
302304

303305
```
304306
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark
@@ -316,11 +318,11 @@ To grant a service account a `Role` or `ClusterRole`, a `RoleBinding` or `Cluste
316318
kubectl create clusterrolebinding spark-role --clusterrole=edit --serviceaccount=default:spark --namespace=default
317319
```
318320

319-
Note that a `Role` can only be used to grant access to resources (like pods) within a single namespace, whereas a `ClusterRole` can be used to grant access to cluster-scoped resources (like nodes) as well as namespaced resources (like pods) across all namespaces. For Spark on Kubernetes, since the driver always creates executor pods in the same namespace, a `Role` is sufficient, although users may use a `ClusterRole` instead. For more information on RBAC authorization and how to configure Kubernetes service accounts for pods, please refer to [here](https://kubernetes.io/docs/admin/authorization/rbac/) and [here](https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/).
321+
Note that a `Role` can only be used to grant access to resources (like pods) within a single namespace, whereas a `ClusterRole` can be used to grant access to cluster-scoped resources (like nodes) as well as namespaced resources (like pods) across all namespaces. For Spark on Kubernetes, since the driver always creates executor pods in the same namespace, a `Role` is sufficient, although users may use a `ClusterRole` instead. For more information on RBAC authorization and how to configure Kubernetes service accounts for pods, please refer to [Using RBAC Authorization](https://kubernetes.io/docs/admin/authorization/rbac/) and [Configure Service Accounts for Pods](https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/).
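Since a namespaced `Role` is described above as sufficient for the driver, the following is a minimal sketch of granting one with `kubectl create role` and `kubectl create rolebinding`. The role name `spark-driver-role`, the helper function `grant_driver_role`, and the exact verb list are assumptions made for illustration; the documentation only states that the driver must be able to create pods and services and watch executor pods.

```shell
# Sketch only: role name, function name, and verb list are assumptions,
# not values taken from the Spark on Kubernetes documentation.
KUBECTL="${KUBECTL:-kubectl}"

# Usage: grant_driver_role NAMESPACE SERVICE_ACCOUNT
# Creates a namespaced Role and binds it to the given service account.
grant_driver_role() {
  ns="$1"
  sa="$2"
  role="spark-driver-role"   # hypothetical role name

  # Role limited to pods and services in a single namespace.
  "$KUBECTL" create role "$role" \
    --verb=create,get,list,watch,delete \
    --resource=pods,services \
    --namespace="$ns" &&
  # Bind the Role to the service account in the same namespace.
  "$KUBECTL" create rolebinding "${role}-binding" \
    --role="$role" \
    --serviceaccount="${ns}:${sa}" \
    --namespace="$ns"
}
```

Calling `grant_driver_role default spark` would then pair with `--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark`. Setting `KUBECTL=echo` before the call prints the commands instead of running them, which may be useful for reviewing them first.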
320322

321323
#### Resource Staging Server
322324

323-
The Resource Staging Server (RSS) watches Spark driver pods to detect completed Spark applications so it knows when to safely delete resource bundles of the applications. When running as a pod in the same Kubernetes cluster as the Spark applications, by default (`spark.kubernetes.authenticate.resourceStagingServer.useServiceAccountCredentials` defaults to `true`), the RSS uses the default Kubernetes service account token located at `/var/run/secrets/kubernetes.io/serviceaccount/token` and the CA certificate located at `/var/run/secrets/kubernetes.io/serviceaccount/ca.crt`.
325+
The Resource Staging Server (RSS) watches Spark driver pods to detect completed Spark applications so it knows when to safely delete resource bundles of the applications. When running as a pod in the same Kubernetes cluster as the Spark applications, by default (`spark.kubernetes.authenticate.resourceStagingServer.useServiceAccountCredentials` defaults to `true`), the RSS uses the default Kubernetes service account token located at `/var/run/secrets/kubernetes.io/serviceaccount/token` and the CA certificate located at `/var/run/secrets/kubernetes.io/serviceaccount/ca.crt`. Both paths are inside the RSS pod and are the locations where Kubernetes mounts service account credentials by default.
324326

325327
When running outside the Kubernetes cluster or when `spark.kubernetes.authenticate.resourceStagingServer.useServiceAccountCredentials` is set to `false`, the credentials for authenticating with the Kubernetes API server can be specified using other configuration properties as documented in [Spark Properties](#spark-properties). Regardless of which credential is used, the credential must allow the RSS to view pods in any namespace.
326328

0 commit comments
