Closed
16 changes: 16 additions & 0 deletions docs/configuration.md
@@ -1303,6 +1303,22 @@ Apart from these, the following properties are also available, and may be useful
not running on YARN and authentication is enabled.
</td>
</tr>
<tr>
<td><code>spark.authenticate.enableSaslEncryption</code></td>
<td>false</td>
<td>
Enable encrypted communication when authentication is enabled. This option is currently
only supported by the block transfer service.
</td>
</tr>
<tr>
<td><code>spark.network.sasl.serverAlwaysEncrypt</code></td>
<td>false</td>
<td>
Disable unencrypted connections for services that support SASL authentication. This is
currently supported by the external shuffle service.
</td>
</tr>
<tr>
<td><code>spark.core.connection.ack.wait.timeout</code></td>
<td>60s</td>
22 changes: 20 additions & 2 deletions docs/security.md
@@ -23,9 +23,16 @@ If your applications are using event logging, the directory where the event logs

## Encryption

Spark supports SSL for Akka and HTTP (for broadcast and file server) protocols. However SSL is not supported yet for WebUI and block transfer service.
Spark supports SSL for Akka and HTTP (for broadcast and file server) protocols. SASL encryption is
supported for the block transfer service. Encryption is not yet supported for the WebUI.

Contributor:
Should we also explicitly say that encryption is not yet supported for shuffle data at rest (e.g. in spill files, shuffle files, data cached in memory or on disk, etc.)?

Contributor Author:
Ok, I'll add something.


Connection encryption (SSL) configuration is organized hierarchically. The user can configure the default SSL settings which will be used for all the supported communication protocols unless they are overwritten by protocol-specific settings. This way the user can easily provide the common settings for all the protocols without disabling the ability to configure each one individually. The common SSL settings are at `spark.ssl` namespace in Spark configuration, while Akka SSL configuration is at `spark.ssl.akka` and HTTP for broadcast and file server SSL configuration is at `spark.ssl.fs`. The full breakdown can be found on the [configuration page](configuration.html).
Encryption is not yet supported for data stored by Spark in temporary local storage, such as shuffle
files, cached data, and other application files. If encrypting this data is desired, a workaround is
to configure your cluster manager to store application data on encrypted disks.

### SSL Configuration

Configuration for SSL is organized hierarchically. The user can configure default SSL settings, which are used for all supported communication protocols unless they are overridden by protocol-specific settings. This lets the user provide common settings for all protocols while keeping the ability to configure each one individually. The common SSL settings are in the `spark.ssl` namespace of the Spark configuration, while Akka SSL configuration is at `spark.ssl.akka` and SSL configuration for HTTP (broadcast and file server) is at `spark.ssl.fs`. The full breakdown can be found on the [configuration page](configuration.html).
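
The lookup order described above can be illustrated with a minimal Python sketch. It only models the documented fallback from a protocol namespace (`spark.ssl.akka`, `spark.ssl.fs`) to the common `spark.ssl` namespace; it is not Spark's own resolution code. The function name `resolve_ssl_setting` and the keystore path are hypothetical.

```python
def resolve_ssl_setting(conf, protocol, key):
    """Return the protocol-specific value if present, else the spark.ssl default.

    This mirrors the hierarchy described in the docs; it is an illustrative
    model, not Spark's actual implementation.
    """
    specific = f"spark.ssl.{protocol}.{key}"
    default = f"spark.ssl.{key}"
    if specific in conf:
        return conf[specific]
    return conf.get(default)


# Example settings; the keystore path is a placeholder, not a real location.
conf = {
    "spark.ssl.enabled": "true",
    "spark.ssl.keyStore": "/path/to/keystore",
    "spark.ssl.fs.enabled": "false",  # protocol-specific override
}

akka_enabled = resolve_ssl_setting(conf, "akka", "enabled")  # falls back to spark.ssl
fs_enabled = resolve_ssl_setting(conf, "fs", "enabled")      # uses the fs override
fs_keystore = resolve_ssl_setting(conf, "fs", "keyStore")    # falls back to spark.ssl
```

Under this model, Akka inherits the common `enabled` value, while the file server uses its own override but still shares the common keystore.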

SSL must be configured on each node and configured for each component involved in communication using the particular protocol.

@@ -47,6 +54,17 @@ follows:
* Import all exported public keys into a single trust-store
* Distribute the trust-store over the nodes

### Configuring SASL Encryption

SASL encryption is currently supported for the block transfer service when authentication
(`spark.authenticate`) is enabled. To enable SASL encryption for an application, set
`spark.authenticate.enableSaslEncryption` to `true` in the application's configuration.

When using an external shuffle service, it's possible to disable unencrypted connections by setting
`spark.network.sasl.serverAlwaysEncrypt` to `true` in the shuffle service's configuration. If that
option is enabled, applications that are not set up to use SASL encryption will fail to connect to
the shuffle service.
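
As a rough illustration of how the two options interact, the following Python sketch models the behavior stated above: an application counts as using SASL encryption only when both `spark.authenticate` and `spark.authenticate.enableSaslEncryption` are `true`, and a shuffle service with `spark.network.sasl.serverAlwaysEncrypt` set to `true` rejects any application that does not. The helper names `flag` and `can_connect` are hypothetical, and the function is a simplified model of the documented rule rather than Spark's connection logic.

```python
def flag(conf, key):
    """Read a boolean setting; all three options default to false."""
    return conf.get(key, "false").lower() == "true"


def can_connect(app_conf, shuffle_service_conf):
    """Model whether an application may connect to the external shuffle service."""
    # SASL encryption requires authentication to be enabled as well.
    app_encrypts = (
        flag(app_conf, "spark.authenticate")
        and flag(app_conf, "spark.authenticate.enableSaslEncryption")
    )
    server_requires = flag(shuffle_service_conf, "spark.network.sasl.serverAlwaysEncrypt")
    return app_encrypts or not server_requires


encrypted_app = {
    "spark.authenticate": "true",
    "spark.authenticate.enableSaslEncryption": "true",
}
plain_app = {"spark.authenticate": "true"}
strict_service = {"spark.network.sasl.serverAlwaysEncrypt": "true"}
lenient_service = {}

encrypted_to_strict = can_connect(encrypted_app, strict_service)
plain_to_strict = can_connect(plain_app, strict_service)
plain_to_lenient = can_connect(plain_app, lenient_service)
```

In this model only the unencrypted application talking to the strict shuffle service is refused, which matches the failure case the section warns about.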

## Configuring Ports for Network Security

Spark makes heavy use of the network, and some environments have strict requirements for using tight