Skip to content

Conversation

@dongjoon-hyun
Copy link
Member

@dongjoon-hyun dongjoon-hyun commented Nov 15, 2016

What changes were proposed in this pull request?

This PR aims to improve DataSource option keys to be more case-insensitive

DataSource partially use CaseInsensitiveMap in code-path. For example, the following fails to find url.

val df = spark.createDataFrame(sparkContext.parallelize(arr2x2), schema2)
df.write.format("jdbc")
    .option("UrL", url1)
    .option("dbtable", "TEST.SAVETEST")
    .options(properties.asScala)
    .save()

This PR makes DataSource options to use CaseInsensitiveMap internally and also makes DataSource to use CaseInsensitiveMap generally except InMemoryFileIndex and InsertIntoHadoopFsRelationCommand. We can not pass them CaseInsensitiveMap because they creates new case-sensitive HadoopConfs by calling newHadoopConfWithOptions(options) inside.

How was this patch tested?

Pass the Jenkins test with newly added test cases.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we just force users to pass a CaseInsensitiveMap?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, sure! You mean require(parameters.isInstanceOf[CaseInsensitiveMap]), right?

Copy link
Member Author

@dongjoon-hyun dongjoon-hyun Nov 15, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, just changing the function signature. parameters: CaseInsensitiveMap.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will update this PR in that way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JSONOptions is spark private, we can just declare the type is CaseInsensitiveMap, i.e. parameters: CaseInsensitiveMap

@SparkQA
Copy link

SparkQA commented Nov 15, 2016

Test build #68642 has finished for PR 15884 at commit 30eff08.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class CaseInsensitiveMap(map: Map[String, String]) extends Map[String, String]

@SparkQA
Copy link

SparkQA commented Nov 15, 2016

Test build #68654 has finished for PR 15884 at commit 937db99.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class FileStreamOptions(parameters: CaseInsensitiveMap) extends Logging

@SparkQA
Copy link

SparkQA commented Nov 15, 2016

Test build #68655 has finished for PR 15884 at commit e69735c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class FileStreamOptions(parameters: CaseInsensitiveMap) extends Logging

@dongjoon-hyun dongjoon-hyun changed the title [WIP][SPARK-18433][SQL] Improve DataSource option keys to be more case-insensitive [SPARK-18433][SQL] Improve DataSource option keys to be more case-insensitive Nov 15, 2016
@dongjoon-hyun
Copy link
Member Author

Thank you, @cloud-fan .
It becomes much better. Could you review again please?

@dongjoon-hyun
Copy link
Member Author

Rebased to resolve conflicts.

test("SPARK-6245 JsonRDD.inferSchema on empty RDD") {
// This is really a test that it doesn't throw an exception
val emptySchema = InferSchema.infer(empty, "", new JSONOptions(Map()))
val emptySchema = InferSchema.infer(empty, "", new JSONOptions(Map.empty[String, String]))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unnecessary change?

Copy link
Member Author

@dongjoon-hyun dongjoon-hyun Nov 16, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, we expect Map[String,String], but Map() or Map.empty is Map[Nothing,Nothing].
So, there was some compilation issue to find constructor of JSONOptions.

@cloud-fan
Copy link
Contributor

LGTM

@SparkQA
Copy link

SparkQA commented Nov 16, 2016

Test build #68695 has finished for PR 15884 at commit 6570665.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class FileStreamOptions(parameters: CaseInsensitiveMap) extends Logging

asfgit pushed a commit that referenced this pull request Nov 16, 2016
…ensitive

## What changes were proposed in this pull request?

This PR aims to improve DataSource option keys to be more case-insensitive

DataSource partially use CaseInsensitiveMap in code-path. For example, the following fails to find url.

```scala
val df = spark.createDataFrame(sparkContext.parallelize(arr2x2), schema2)
df.write.format("jdbc")
    .option("UrL", url1)
    .option("dbtable", "TEST.SAVETEST")
    .options(properties.asScala)
    .save()
```

This PR makes DataSource options to use CaseInsensitiveMap internally and also makes DataSource to use CaseInsensitiveMap generally except `InMemoryFileIndex` and `InsertIntoHadoopFsRelationCommand`. We can not pass them CaseInsensitiveMap because they creates new case-sensitive HadoopConfs by calling newHadoopConfWithOptions(options) inside.

## How was this patch tested?

Pass the Jenkins test with newly added test cases.

Author: Dongjoon Hyun <[email protected]>

Closes #15884 from dongjoon-hyun/SPARK-18433.

(cherry picked from commit 74f5c21)
Signed-off-by: Wenchen Fan <[email protected]>
@cloud-fan
Copy link
Contributor

thanks, merging to master/2.1!

@asfgit asfgit closed this in 74f5c21 Nov 16, 2016
@dongjoon-hyun
Copy link
Member Author

Thank you so much, @cloud-fan ! Also, thank you for this issue, @gatorsmile .

@dongjoon-hyun dongjoon-hyun deleted the SPARK-18433 branch November 16, 2016 19:35
uzadude pushed a commit to uzadude/spark that referenced this pull request Jan 27, 2017
…ensitive

## What changes were proposed in this pull request?

This PR aims to improve DataSource option keys to be more case-insensitive

DataSource partially use CaseInsensitiveMap in code-path. For example, the following fails to find url.

```scala
val df = spark.createDataFrame(sparkContext.parallelize(arr2x2), schema2)
df.write.format("jdbc")
    .option("UrL", url1)
    .option("dbtable", "TEST.SAVETEST")
    .options(properties.asScala)
    .save()
```

This PR makes DataSource options to use CaseInsensitiveMap internally and also makes DataSource to use CaseInsensitiveMap generally except `InMemoryFileIndex` and `InsertIntoHadoopFsRelationCommand`. We can not pass them CaseInsensitiveMap because they creates new case-sensitive HadoopConfs by calling newHadoopConfWithOptions(options) inside.

## How was this patch tested?

Pass the Jenkins test with newly added test cases.

Author: Dongjoon Hyun <[email protected]>

Closes apache#15884 from dongjoon-hyun/SPARK-18433.
asfgit pushed a commit that referenced this pull request Jan 30, 2017
### What changes were proposed in this pull request?
The case are not sensitive in JDBC options, after the PR #15884 is merged to Spark 2.1.

### How was this patch tested?
N/A

Author: gatorsmile <[email protected]>

Closes #16734 from gatorsmile/fixDocCaseInsensitive.

(cherry picked from commit c0eda7e)
Signed-off-by: gatorsmile <[email protected]>
ghost pushed a commit to dbtsai/spark that referenced this pull request Jan 30, 2017
### What changes were proposed in this pull request?
The case are not sensitive in JDBC options, after the PR apache#15884 is merged to Spark 2.1.

### How was this patch tested?
N/A

Author: gatorsmile <[email protected]>

Closes apache#16734 from gatorsmile/fixDocCaseInsensitive.
cmonkey pushed a commit to cmonkey/spark that referenced this pull request Feb 15, 2017
### What changes were proposed in this pull request?
The case are not sensitive in JDBC options, after the PR apache#15884 is merged to Spark 2.1.

### How was this patch tested?
N/A

Author: gatorsmile <[email protected]>

Closes apache#16734 from gatorsmile/fixDocCaseInsensitive.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants