Skip to content

Conversation

@winningsix
Copy link

@winningsix winningsix commented Sep 23, 2015

This patch is using Apache Commons Crypto library to enable shuffle encryption support.

@vanzin
Copy link
Contributor

vanzin commented Sep 23, 2015

ok to test

@SparkQA
Copy link

SparkQA commented Sep 23, 2015

Test build #42941 has finished for PR 8880 at commit 4a95c73.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • abstract class AesCtrCryptoCodec extends CryptoCodec
    • case class CipherSuite(name: String, algoBlockSize: Int)
    • abstract case class CryptoCodec()
    • class CryptoInputStream(in: InputStream, codecVal: CryptoCodec,
    • class CryptoOutputStream(out: OutputStream, codecVal: CryptoCodec, bufferSizeVal: Int,
    • trait Decryptor
    • trait Encryptor
    • class JceAesCtrCryptoCodec(conf: SparkConf) extends AesCtrCryptoCodec with Logging
    • class JceAesCtrCipher(mode: Int, provider: String) extends Encryptor with Decryptor

@rxin
Copy link
Contributor

rxin commented Sep 24, 2015

Can you please add some high level documentation about this change?

@winningsix
Copy link
Author

@rxin, the design document is available in https://issues.apache.org/jira/secure/attachment/12730704/Design%20Document%20of%20Encrypted%20Spark%20Shuffle_20150506.docx which is an attachment of SPARK-5682. This PR is trying to add the basic shuffle encryption workflow and JCE key provider support. Next step we will add openssl crypto codec to improve the encryption performance (~17x expected with AES-NI enabled).

@SparkQA
Copy link

SparkQA commented Sep 24, 2015

Test build #42953 timed out for PR 8880 at commit 67b391e after a configured wait of 250m.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to import Byte

@SparkQA
Copy link

SparkQA commented Aug 25, 2016

Test build #64408 has finished for PR 8880 at commit 9f958a4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

if (master == "yarn" && deployMode == "client") System.setProperty("SPARK_YARN_MODE", "true")
if (_conf.get(IO_ENCRYPTION_ENABLED) && !SparkHadoopUtil.get.isYarnMode()) {
throw new SparkException("IO encryption is only supported in YARN mode, please disable it " +
"by setting spark.io.encryption.enabled to false")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: use ${IO_ENCRYPTION_ENABLED.key} instead of the hardcoded key name.

def toCryptoConf(
conf: SparkConf,
sparkPrefix: String,
cryptoPrefix: String): Properties = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you don't need sparkPrefix and cryptoPrefix any more.

@zsxwing
Copy link
Member

zsxwing commented Aug 25, 2016

Looks pretty overall. Just a high level question: Why needs to generate a new key for IO encryption? Can we just use SecurityManager.getSecretKey?

@vanzin
Copy link
Contributor

vanzin commented Aug 25, 2016

Why needs to generate a new key for IO encryption?

They could be different sizes (different config options control that). We could change it so both use the same configs / same code to generate the keys, but in general if they're used for different things I prefer to have different configs, at least.

<td><code>spark.io.encryption.enabled</code></td>
<td>false</td>
<td>
Enable IO encryption.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please say it only supports Yarn mode here.

@zsxwing
Copy link
Member

zsxwing commented Aug 25, 2016

They could be different sizes (different config options control that). We could change it so both use the same configs / same code to generate the keys, but in general if they're used for different things I prefer to have different configs, at least.

Sounds good to me

@SparkQA
Copy link

SparkQA commented Aug 26, 2016

Test build #64451 has finished for PR 8880 at commit a9a05c5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

<td><code>spark.io.encryption.enabled</code></td>
<td>false</td>
<td>
Enable IO encryption. It only supports YARN mode.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Only supported in YARN mode."


val encryptedBytes = outputStream.toByteArray
val encryptedStr = new String(encryptedBytes)
assert (plainStr !== encryptedStr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: no space before (

@vanzin
Copy link
Contributor

vanzin commented Aug 29, 2016

LGTM. Any remaining comments @zsxwing ?

@SparkQA
Copy link

SparkQA commented Aug 29, 2016

Test build #64582 has finished for PR 8880 at commit a811bf4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zsxwing
Copy link
Member

zsxwing commented Aug 30, 2016

LGTM

@SparkQA
Copy link

SparkQA commented Aug 30, 2016

Test build #64614 has finished for PR 8880 at commit 928a59b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Copy link
Contributor

vanzin commented Aug 30, 2016

Merging to master.

@asfgit asfgit closed this in 4b4e329 Aug 30, 2016
winningsix pushed a commit to winningsix/spark that referenced this pull request Sep 9, 2016
This patch is using Apache Commons Crypto library to enable shuffle encryption support.

Author: Ferdinand Xu <[email protected]>
Author: kellyzly <[email protected]>

Closes apache#8880 from winningsix/SPARK-10771.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants