[SPARK-18433][SQL] Improve DataSource option keys to be more case-insensitive #15884
can we just force users to pass a CaseInsensitiveMap?
Oh, sure! You mean require(parameters.isInstanceOf[CaseInsensitiveMap]), right?
Ah, just changing the function signature. parameters: CaseInsensitiveMap.
I will update this PR in that way.
JSONOptions is private to Spark, so we can simply declare the parameter type as CaseInsensitiveMap, i.e. parameters: CaseInsensitiveMap.
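The suggestion above, declaring the parameter type instead of checking it with require(...), can be sketched roughly as follows. This is only an illustration: `LowerCasedKeys`, `ParserOptions`, and `ParserOptionsDemo` are hypothetical names, not Spark's classes, and the `samplingRatio` lookup is just an example key.

```scala
import java.util.Locale

// Hypothetical wrapper type; a stand-in for the idea, not Spark's CaseInsensitiveMap.
final class LowerCasedKeys(raw: Map[String, String]) {
  private val byLowerKey: Map[String, String] =
    raw.map { case (k, v) => k.toLowerCase(Locale.ROOT) -> v }

  // Lookups normalize the requested key the same way as the stored keys.
  def get(key: String): Option[String] =
    byLowerKey.get(key.toLowerCase(Locale.ROOT))
}

// Declaring the parameter type moves the check to compile time:
// a plain Map[String, String] is rejected by the compiler,
// so no runtime require(...) is needed.
class ParserOptions(parameters: LowerCasedKeys) {
  val samplingRatio: Double =
    parameters.get("samplingRatio").map(_.toDouble).getOrElse(1.0)
}

object ParserOptionsDemo {
  // The caller has to wrap the raw options explicitly.
  val options = new ParserOptions(new LowerCasedKeys(Map("SAMPLINGRATIO" -> "0.5")))
}
```

With this shape, a caller that forgets to wrap its options gets a type error rather than a failure at run time.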
Test build #68642 has finished for PR 15884 at commit
Test build #68654 has finished for PR 15884 at commit
Test build #68655 has finished for PR 15884 at commit
Thank you, @cloud-fan.
Rebased to resolve conflicts.
  test("SPARK-6245 JsonRDD.inferSchema on empty RDD") {
    // This is really a test that it doesn't throw an exception
-   val emptySchema = InferSchema.infer(empty, "", new JSONOptions(Map()))
+   val emptySchema = InferSchema.infer(empty, "", new JSONOptions(Map.empty[String, String]))
unnecessary change?
In this case the constructor expects Map[String,String], but a bare Map() or Map.empty is inferred as Map[Nothing,Nothing].
So the compiler could not resolve the JSONOptions constructor.
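A minimal sketch of the situation described in this reply, assuming the options class ends up with two constructors (one taking the case-insensitive type, one taking a plain map). `CiMap`, `JsonLikeOptions`, and `EmptyOptionsDemo` are hypothetical stand-ins, and the failing call is left commented out because it reflects what the thread reports rather than something verified here for every Scala version.

```scala
// Hypothetical stand-ins; these are not Spark's classes.
final class CiMap(val underlying: Map[String, String])

class JsonLikeOptions(val parameters: CiMap) {
  // The auxiliary constructor makes `new JsonLikeOptions(...)` an overloaded call.
  def this(parameters: Map[String, String]) = this(new CiMap(parameters))
}

object EmptyOptionsDemo {
  // With overloaded constructors the argument is typed on its own first,
  // so a bare Map() is inferred as Map[Nothing, Nothing] and matches neither overload:
  // val broken = new JsonLikeOptions(Map())

  // Spelling out the type parameters removes the ambiguity.
  val explicit = new JsonLikeOptions(Map.empty[String, String])

  // A type ascription on the argument has the same effect.
  val ascribed = new JsonLikeOptions(Map(): Map[String, String])
}
```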
LGTM
Test build #68695 has finished for PR 15884 at commit
…ensitive
## What changes were proposed in this pull request?
This PR aims to make DataSource option keys more consistently case-insensitive.
DataSource uses CaseInsensitiveMap in only part of its code path. For example, the following write fails to find `url` because the key is spelled `UrL`.
```scala
val df = spark.createDataFrame(sparkContext.parallelize(arr2x2), schema2)
df.write.format("jdbc")
  .option("UrL", url1) // mixed-case key: not found as "url" before this PR
  .option("dbtable", "TEST.SAVETEST")
  .options(properties.asScala)
  .save()
```
This PR makes DataSource options use CaseInsensitiveMap internally, and makes DataSource use CaseInsensitiveMap throughout, except for `InMemoryFileIndex` and `InsertIntoHadoopFsRelationCommand`. Those two cannot be given a CaseInsensitiveMap because they create new case-sensitive HadoopConfs internally by calling newHadoopConfWithOptions(options).
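To illustrate the behavior described above, here is a minimal sketch of a case-insensitive option lookup. It assumes a wrapper that lowercases keys at construction time; `CaseInsensitiveOptions` and `CaseInsensitiveOptionsDemo` are hypothetical names, the JDBC URL is a placeholder, and none of this is Spark's actual implementation.

```scala
import java.util.Locale

// Hypothetical, simplified stand-in for a case-insensitive option map.
final class CaseInsensitiveOptions(original: Map[String, String]) {
  // Keys are normalized once, at construction time.
  private val normalized: Map[String, String] =
    original.map { case (k, v) => k.toLowerCase(Locale.ROOT) -> v }

  // Lookups lowercase the requested key before searching.
  def get(key: String): Option[String] =
    normalized.get(key.toLowerCase(Locale.ROOT))

  // Only the lowercased keys are visible here; the original casing is gone.
  def keys: Set[String] = normalized.keySet
}

object CaseInsensitiveOptionsDemo {
  // Placeholder values for illustration only.
  private val raw = Map("UrL" -> "jdbc:h2:mem:testdb", "dbtable" -> "TEST.SAVETEST")

  val opts = new CaseInsensitiveOptions(raw)

  val viaLower: Option[String] = opts.get("url") // found despite the "UrL" spelling
  val viaUpper: Option[String] = opts.get("URL") // also found
  val viaPlainMap: Option[String] = raw.get("url") // a plain Map misses it
}
```

Because such a wrapper exposes only lowercased keys, handing it to a consumer that treats keys as case-sensitive would change the keys that consumer sees. That is one way to read the exception made for the two classes that build Hadoop configurations from the options.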
## How was this patch tested?
Pass the Jenkins test with newly added test cases.
Author: Dongjoon Hyun <[email protected]>
Closes #15884 from dongjoon-hyun/SPARK-18433.
(cherry picked from commit 74f5c21)
Signed-off-by: Wenchen Fan <[email protected]>
thanks, merging to master/2.1!
Thank you so much, @cloud-fan! Also, thank you for this issue, @gatorsmile.
### What changes were proposed in this pull request?
JDBC option names are case-insensitive now that PR #15884 has been merged into Spark 2.1.
### How was this patch tested?
N/A
Author: gatorsmile <[email protected]>
Closes #16734 from gatorsmile/fixDocCaseInsensitive.
(cherry picked from commit c0eda7e)
Signed-off-by: gatorsmile <[email protected]>