[SPARK-5682][Core] Add encrypted shuffle in spark #8880
ok to test
Test build #42941 has finished for PR 8880 at commit
Can you please add some high-level documentation about this change?
Force-pushed from 4a95c73 to 67b391e.
@rxin, the design document is available at https://issues.apache.org/jira/secure/attachment/12730704/Design%20Document%20of%20Encrypted%20Spark%20Shuffle_20150506.docx, which is attached to SPARK-5682. This PR adds the basic shuffle encryption workflow and JCE key provider support. As a next step, we will add an OpenSSL crypto codec to improve encryption performance (~17x expected with AES-NI enabled).
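For readers unfamiliar with the JCE side of that workflow, here is a minimal sketch of wrapping a byte stream in an AES cipher using only the JDK's `javax.crypto` classes. It is not the code from this PR: the object name `EncryptedStreamSketch`, the `AES/CTR/NoPadding` transformation, the 128-bit key size, and the helper names are all assumptions chosen for illustration.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
import java.security.SecureRandom
import javax.crypto.{Cipher, CipherInputStream, CipherOutputStream, KeyGenerator, SecretKey}
import javax.crypto.spec.IvParameterSpec

// Hypothetical helper, not part of Spark: shows the general shape of
// "generate a key, wrap the output stream, wrap the input stream".
object EncryptedStreamSketch {
  // Assumed transformation; the PR's actual choice is not shown in this thread.
  private val Transformation = "AES/CTR/NoPadding"

  /** Generates a random AES key of the given size using the default JCE provider. */
  def newKey(keySizeBits: Int = 128): SecretKey = {
    val gen = KeyGenerator.getInstance("AES")
    gen.init(keySizeBits)
    gen.generateKey()
  }

  /** Encrypts `plain` with a fresh random IV; returns (iv, ciphertext). */
  def encrypt(key: SecretKey, plain: Array[Byte]): (Array[Byte], Array[Byte]) = {
    val iv = new Array[Byte](16) // AES block size in bytes
    new SecureRandom().nextBytes(iv)
    val cipher = Cipher.getInstance(Transformation)
    cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv))
    val sink = new ByteArrayOutputStream()
    val out = new CipherOutputStream(sink, cipher)
    try out.write(plain) finally out.close()
    (iv, sink.toByteArray)
  }

  /** Decrypts `encrypted` using the same key and the IV produced by `encrypt`. */
  def decrypt(key: SecretKey, iv: Array[Byte], encrypted: Array[Byte]): Array[Byte] = {
    val cipher = Cipher.getInstance(Transformation)
    cipher.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(iv))
    val in = new CipherInputStream(new ByteArrayInputStream(encrypted), cipher)
    val buf = new ByteArrayOutputStream()
    val chunk = new Array[Byte](4096)
    try {
      var n = in.read(chunk)
      while (n != -1) {
        buf.write(chunk, 0, n)
        n = in.read(chunk)
      }
    } finally in.close()
    buf.toByteArray
  }
}
```

In this sketch the IV is returned alongside the ciphertext so the reader can be initialized with it; how the real patch transports the IV and distributes the key is not covered here.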
Test build #42953 timed out for PR 8880 at commit
no need to import Byte
Test build #64408 has finished for PR 8880 at commit
    if (master == "yarn" && deployMode == "client") System.setProperty("SPARK_YARN_MODE", "true")
    if (_conf.get(IO_ENCRYPTION_ENABLED) && !SparkHadoopUtil.get.isYarnMode()) {
      throw new SparkException("IO encryption is only supported in YARN mode, please disable it " +
        "by setting spark.io.encryption.enabled to false")
nit: use ${IO_ENCRYPTION_ENABLED.key} instead of the hardcoded key name.
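A rough sketch of what that suggestion amounts to: interpolate the entry's key into the message so the text cannot drift from the constant. The `ConfigEntry` case class, the `YarnOnlyCheckSketch` object, and the use of `IllegalStateException` below are hypothetical stand-ins so the snippet compiles on its own; they are not Spark's actual classes.

```scala
// Hypothetical stand-in for a typed config entry; Spark's real class differs.
final case class ConfigEntry(key: String, default: Boolean)

object YarnOnlyCheckSketch {
  val IO_ENCRYPTION_ENABLED: ConfigEntry =
    ConfigEntry("spark.io.encryption.enabled", default = false)

  /** Throws if encryption is requested outside YARN mode. */
  def validate(enabled: Boolean, isYarnMode: Boolean): Unit = {
    if (enabled && !isYarnMode) {
      // The key name comes from the entry itself rather than a string literal.
      throw new IllegalStateException(
        "IO encryption is only supported in YARN mode, please disable it " +
          s"by setting ${IO_ENCRYPTION_ENABLED.key} to false")
    }
  }
}
```

If the key were ever renamed, the error message would follow automatically, which is the point of the nit.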
    def toCryptoConf(
        conf: SparkConf,
        sparkPrefix: String,
        cryptoPrefix: String): Properties = {
nit: you don't need sparkPrefix and cryptoPrefix any more.
Looks pretty good overall. Just a high-level question: why do we need to generate a new key for IO encryption? Can we just use
They could be different sizes (different config options control that). We could change it so both use the same configs / same code to generate the keys, but in general if they're used for different things I prefer to have different configs, at least.
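To make the "different sizes, different configs" point concrete, here is a small sketch in which two keys share one generation routine while their sizes come from separate settings. The setting names (`io.keySizeBits`, `other.keySizeBits`), the sizes, and the `SeparateKeysSketch` object are invented for illustration and do not correspond to real Spark options.

```scala
import javax.crypto.KeyGenerator

object SeparateKeysSketch {
  // Hypothetical settings; real option names and defaults are not shown in this thread.
  val settings: Map[String, Int] = Map(
    "io.keySizeBits" -> 128,
    "other.keySizeBits" -> 256
  )

  /** Shared key-generation code: returns raw AES key bytes of the requested size. */
  def generateKey(keySizeBits: Int): Array[Byte] = {
    val gen = KeyGenerator.getInstance("AES")
    gen.init(keySizeBits)
    gen.generateKey().getEncoded
  }

  // Same routine, separately configured sizes.
  def ioKey(): Array[Byte] = generateKey(settings("io.keySizeBits"))
  def otherKey(): Array[Byte] = generateKey(settings("other.keySizeBits"))
}
```

This keeps one code path for key generation while letting each use case be tuned independently, which is one way to read the trade-off discussed above.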
docs/configuration.md
Outdated
    <td><code>spark.io.encryption.enabled</code></td>
    <td>false</td>
    <td>
      Enable IO encryption.
Please say it only supports YARN mode here.
Sounds good to me
Test build #64451 has finished for PR 8880 at commit
docs/configuration.md
Outdated
    <td><code>spark.io.encryption.enabled</code></td>
    <td>false</td>
    <td>
      Enable IO encryption. It only supports YARN mode.
"Only supported in YARN mode."
    val encryptedBytes = outputStream.toByteArray
    val encryptedStr = new String(encryptedBytes)
    assert (plainStr !== encryptedStr)
nit: no space before (
LGTM. Any remaining comments, @zsxwing?
Test build #64582 has finished for PR 8880 at commit
LGTM
Test build #64614 has finished for PR 8880 at commit
Merging to master.
This patch uses the Apache Commons Crypto library to enable shuffle encryption support.
Author: Ferdinand Xu <[email protected]>
Author: kellyzly <[email protected]>
Closes apache#8880 from winningsix/SPARK-10771.