Conversation

@sureshthalamati (Contributor) commented Dec 8, 2016

What changes were proposed in this pull request?

Currently the JDBC data source creates tables in the target database using the default type mapping and the JDBC dialect mechanism. If users want to specify a different database data type for only some of the columns, there is no option available. In scenarios where the default mapping does not work, users are forced to create the table in the target database before writing. This workaround is probably not acceptable from a usability point of view. This PR provides a user-defined type mapping for specific columns.
 
The solution is based on the existing Redshift connector (https://github.com/databricks/spark-redshift#setting-a-custom-column-type). It adds a new column metadata property, createTableColumnType, to the JDBC data source so that users can specify the database column type through column metadata.
 
Example:

import org.apache.spark.sql.SaveMode
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.MetadataBuilder

val nvarcharMd = new MetadataBuilder().putString("createTableColumnType", "NVARCHAR(123)").build()
val newDf = df.withColumn("name", col("name").as("name", nvarcharMd))
newDf.write.mode(SaveMode.Overwrite).jdbc(url, "TEST.USERDBTYPETEST", properties)
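
With this property in place, the CREATE TABLE statement generated on write would use NVARCHAR(123) for the name column instead of the type produced by the default dialect mapping.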

One restriction with this approach is that metadata modification is unsupported in the Python, SQL, and R language APIs. Users have to create a new DataFrame that carries the createTableColumnType property in its metadata, as sketched below.
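
For illustration, here is a minimal sketch of that workaround, rebuilding the schema so the metadata is attached to the name column. It assumes the createTableColumnType property proposed in this PR, plus the df, spark, url, and properties values from the example above:

import org.apache.spark.sql.SaveMode
import org.apache.spark.sql.types.{MetadataBuilder, StructField, StructType}

// Metadata carrying the proposed createTableColumnType property.
val nvarcharMd = new MetadataBuilder()
  .putString("createTableColumnType", "NVARCHAR(123)")
  .build()

// Rebuild the schema, overriding the metadata of the "name" column only.
val newSchema = StructType(df.schema.map {
  case f @ StructField("name", _, _, _) => f.copy(metadata = nvarcharMd)
  case f => f
})

// Create a new DataFrame over the same rows with the amended schema.
val newDf = spark.createDataFrame(df.rdd, newSchema)
newDf.write.mode(SaveMode.Overwrite).jdbc(url, "TEST.USERDBTYPETEST", properties)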
 
An alternative approach is to add a JDBC data source option that lets users specify the database column type information as a JSON string. For more details, please refer to the JDBC option PR (#16209).
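
For reference, the option-based design is essentially what eventually shipped in Spark 2.2 as the createTableColumnTypes JDBC write option, taking CREATE TABLE-style column definitions rather than a JSON string. A minimal sketch, assuming the same df, url, and properties as above:

// Only the listed columns are overridden; the remaining columns
// keep the dialect's default type mapping.
df.write
  .option("createTableColumnTypes", "name NVARCHAR(123)")
  .mode(SaveMode.Overwrite)
  .jdbc(url, "TEST.USERDBTYPETEST", properties)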

TODO: Documentation for specifying the database column type

How was this patch tested?

Added a new test case to JDBCWriteSuite.

@SparkQA commented Dec 8, 2016

Test build #69849 has started for PR 16208 at commit 3834903.

@gatorsmile (Member) commented Dec 8, 2016

@rxin @JoshRosen @srowen Which solution is preferred for supporting customized column types: the table-level JDBC option, or the column metadata property? Thanks!

FYI: this PR is based on the column metadata property.

@gatorsmile (Member) commented

retest this please

@SparkQA commented Dec 9, 2016

Test build #69883 has finished for PR 16208 at commit 3834903.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mtrewartha commented

@sureshthalamati Are you still planning on trying to get this merged soon? This would be a hugely useful feature for us!

@sureshthalamati (Contributor, Author) commented

@miketrewartha Yes, I am hoping one of the fixes for this issue will get merged. I proposed two solutions: this PR and #16209. Waiting for feedback from the committers.

@gatorsmile (Member) commented

@sureshthalamati Could you resolve the conflicts in both PRs? Thanks!

@sureshthalamati force-pushed the jdbc_custom_dbtype-spark-10849 branch from 3834903 to 66c9e80 on January 15, 2017 (commit: "… users to specify database column type when creating table on write.")
@SparkQA commented Jan 15, 2017

Test build #71404 has finished for PR 16208 at commit 66c9e80.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile (Member) commented

Can you please close it, since the alternative PR is already merged?

@sureshthalamati (Contributor, Author) commented

This issue is resolved by the alternative PR. Closing this PR.
