Commit 44f8766

[SPARK-48930][CORE] Redact awsAccessKeyId by including accesskey pattern
### What changes were proposed in this pull request?

This PR aims to redact `awsAccessKeyId` by including the `accesskey` pattern.

- **Apache Spark 4.0.0-preview1**

  There is no point in redacting `fs.s3a.access.key` alone, because the same value is exposed via `fs.s3.awsAccessKeyId`, as shown below. We need to redact all of them.

  ```
  $ AWS_ACCESS_KEY_ID=A AWS_SECRET_ACCESS_KEY=B bin/spark-shell
  ```

  ![Screenshot 2024-07-17 at 12 45 44](https://github.com/user-attachments/assets/e3040c5d-3eb9-4944-a6d6-5179b7647426)

### Why are the changes needed?

Since Apache Spark 1.1.0, `AWS_ACCESS_KEY_ID` has been propagated as shown in the code linked below. However, Apache Spark does not redact all of the resulting configurations consistently.

- #450

https://github.com/apache/spark/blob/5d16c3134c442a5546251fd7c42b1da9fdf3969e/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala#L481-L486

### Does this PR introduce _any_ user-facing change?

Users may see more redactions on configurations whose names contain `accesskey`, matched case-insensitively. However, those configurations are highly likely to be related to credentials.

### How was this patch tested?

Pass the CIs with the newly added test cases.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #47392 from dongjoon-hyun/SPARK-48930.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 1e17c39)
Signed-off-by: Dongjoon Hyun <[email protected]>

1 parent 443825a commit 44f8766

2 files changed: +2 -1 lines changed

core/src/main/scala/org/apache/spark/internal/config/package.scala

Lines changed: 1 addition & 1 deletion
@@ -1155,7 +1155,7 @@ package object config {
         "like YARN and event logs.")
       .version("2.1.2")
       .regexConf
-      .createWithDefault("(?i)secret|password|token|access[.]key".r)
+      .createWithDefault("(?i)secret|password|token|access[.]?key".r)
 
   private[spark] val STRING_REDACTION_PATTERN =
     ConfigBuilder("spark.redaction.string.regex")
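
The effect of making the dot optional can be checked in isolation. The following is a minimal sketch, not Spark code: the two pattern strings are copied from the diff above, while the object name `RedactionPatternDemo` and the helper `matchesKey` are hypothetical names introduced only for illustration.

```scala
import scala.util.matching.Regex

object RedactionPatternDemo {
  // Default pattern before this commit (string copied from the diff).
  val oldPattern: Regex = "(?i)secret|password|token|access[.]key".r

  // Default pattern after this commit (string copied from the diff).
  val newPattern: Regex = "(?i)secret|password|token|access[.]?key".r

  // Hypothetical helper: true if the pattern matches anywhere in the key.
  def matchesKey(pattern: Regex, key: String): Boolean =
    pattern.findFirstIn(key).isDefined

  def main(args: Array[String]): Unit = {
    val keys = Seq(
      "spark.hadoop.fs.s3.awsAccessKeyId",
      "spark.hadoop.fs.s3a.access.key",
      "spark.app.name")
    // Print, for each key, whether the old and new patterns match it.
    keys.foreach { k =>
      println(s"$k old=${matchesKey(oldPattern, k)} new=${matchesKey(newPattern, k)}")
    }
  }
}
```

With `[.]?`, the separator between `access` and `key` becomes optional, so `awsAccessKeyId` matches in addition to `access.key`, while keys that contain neither form are unaffected.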

core/src/test/scala/org/apache/spark/util/UtilsSuite.scala

Lines changed: 1 addition & 0 deletions
@@ -1093,6 +1093,7 @@ class UtilsSuite extends SparkFunSuite with ResetSystemProperties {
     // Set some secret keys
     val secretKeys = Seq(
       "spark.executorEnv.HADOOP_CREDSTORE_PASSWORD",
+      "spark.hadoop.fs.s3.awsAccessKeyId",
       "spark.hadoop.fs.s3a.access.key",
       "spark.my.password",
       "spark.my.sECreT")
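
To illustrate what the added test key is expected to exercise, here is a rough sketch of key-based redaction over a list of configuration pairs. It is an assumption-laden illustration rather than Spark's implementation: `RedactSketch`, `redact`, and the `<redacted>` placeholder are hypothetical, the sketch inspects keys only, and Spark's own utility may apply additional rules.

```scala
object RedactSketch {
  // Default pattern after this commit (string copied from the diff).
  private val pattern = "(?i)secret|password|token|access[.]?key".r

  // Placeholder chosen for this sketch only; not taken from the source.
  val Placeholder = "<redacted>"

  // Hypothetical helper: replace the value of every pair whose key
  // matches the pattern, and leave all other pairs unchanged.
  def redact(kvs: Seq[(String, String)]): Seq[(String, String)] =
    kvs.map { case (k, v) =>
      if (pattern.findFirstIn(k).isDefined) (k, Placeholder) else (k, v)
    }
}
```

Calling `RedactSketch.redact` on pairs keyed by the names in `secretKeys` would replace every value, including the one for `spark.hadoop.fs.s3.awsAccessKeyId`, whereas a key such as `spark.app.name` would keep its value.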
