Conversation

@sureshthalamati
Contributor

This patch allows users to override the default mapping of DataFrame field types to database column types when writing a DataFrame to a JDBC data source.

In some cases a user may want a specific database type mapping for certain fields, based on the database configuration (page size, type of table spaces, etc.), instead of the defaults. For example, a large varchar size for all columns may not fit within the row size limits, so the user may want a mix of varchar and clob types. The maximum decimal precision supported by some database systems may also be less than Spark's decimal precision; in such cases the user can use this option to adjust the decimal type's precision and scale to match the target database.

Added a new field metadata property named db.column.type. I am not sure what the convention is for these kinds of property names. Please let me know if it needs to be changed.
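The intended behavior can be sketched outside of Spark roughly as follows. This is an illustrative sketch only, not the patch's actual code: the function names, the simplified type map, and the dictionary-based metadata are all assumptions; only the `db.column.type` property name comes from the patch description.

```python
# Illustrative sketch (not the actual patch code): build CREATE TABLE DDL,
# letting per-field metadata override the default JDBC type mapping.

# Simplified default Spark-type -> database-type mapping (assumed values).
DEFAULT_TYPE_MAP = {
    "string": "TEXT",
    "integer": "INTEGER",
    "double": "DOUBLE PRECISION",
    "decimal(38,18)": "DECIMAL(38,18)",
}

def column_ddl(name, spark_type, metadata=None):
    """Use the db.column.type metadata property when present,
    otherwise fall back to the default mapping."""
    metadata = metadata or {}
    db_type = metadata.get("db.column.type", DEFAULT_TYPE_MAP[spark_type])
    return f"{name} {db_type}"

def create_table_ddl(table, fields):
    cols = ", ".join(column_ddl(*f) for f in fields)
    return f"CREATE TABLE {table} ({cols})"

fields = [
    ("name", "string", {"db.column.type": "VARCHAR(128)"}),          # override
    ("notes", "string", {"db.column.type": "CLOB"}),                 # override
    ("age", "integer", None),                                        # default
    ("amount", "decimal(38,18)", {"db.column.type": "DECIMAL(19,4)"}),
]
print(create_table_ddl("people", fields))
# -> CREATE TABLE people (name VARCHAR(128), notes CLOB, age INTEGER, amount DECIMAL(19,4))
```

This shows why the feature is useful: the `amount` column shrinks to a precision the target database can hold, and only `notes` pays the cost of a clob.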

@rxin @marmbrus

@marmbrus
Contributor

ok to test

@SparkQA

SparkQA commented Oct 29, 2015

Test build #44586 has finished for PR 9352 at commit 4048c2d.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@sureshthalamati
Contributor Author

The failed test is test_trainOn_predictOn (main.StreamingKMeansTest). It seems to be unrelated to my changes. Can we retest this, please?

@sureshthalamati
Contributor Author

Thinking about this more, I realized the current version of the patch may introduce a SQL injection vulnerability. I will update the pull request with a new version of the fix.

@sureshthalamati
Contributor Author

Updated the patch to address the SQL injection issue by removing space characters from the input. Please review.

@marmbrus @rxin
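A minimal sketch of the kind of sanitization described above, stripping whitespace from the user-supplied type string before it is spliced into the generated DDL. This is an assumption-laden illustration, not the patch's actual code; the function name is hypothetical, and a real implementation would likely validate against a whitelist of known types as well.

```python
import re

def sanitize_db_type(user_type):
    """Illustrative only: strip all whitespace from a user-supplied column
    type so that a value like 'VARCHAR(20) ); DROP TABLE t' cannot use
    spaces to smuggle extra SQL tokens into the generated CREATE TABLE
    statement. The real patch's validation may differ."""
    return re.sub(r"\s+", "", user_type)

print(sanitize_db_type("VARCHAR (128)"))  # -> VARCHAR(128)
```

Removing spaces narrows the attack surface because SQL keywords generally need whitespace separators, though on its own it is a weaker defense than rejecting any string that does not match an allowed type pattern.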

@SparkQA

SparkQA commented Oct 30, 2015

Test build #44699 has finished for PR 9352 at commit ef26084.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@sureshthalamati
Contributor Author

Retest this please.

The test failure is unrelated to my changes. The failed test is org.apache.spark.sql.sources.JsonHadoopFsRelationSuite, "test all data types - TimestampType". It passes in my environment.

@rick-ibm
Contributor

rick-ibm commented Nov 4, 2015

Thanks for addressing the SQL injection concerns, Suresh. LGTM.

@sureshthalamati
Contributor Author

Jenkins, retest this please.

@tristanreid
Contributor

Does anyone know the status of this change? Is there anything blocking it, or was it superseded by something else? Thanks.

@AmplabJenkins

Can one of the admins verify this patch?

@sureshthalamati
Contributor Author

Opened two new [WIP] PRs to fix this issue using different approaches.
#16208
#16209

Closing this PR.

