Conversation

@iRakson
Contributor

@iRakson iRakson commented Nov 14, 2019

What changes were proposed in this pull request?

Cast-to-decimal behaviour in Spark SQL is made consistent with PostgreSQL when spark.sql.dialect is set to PostgreSQL.

Why are the changes needed?

Spark SQL and PostgreSQL behave very differently when casting to decimal.
For example:
select cast('abc' as decimal) in Spark SQL gives null.
select cast('abc' as decimal) in PostgreSQL raises "invalid input syntax for type numeric".

Does this PR introduce any user-facing change?

Yes.

How was this patch tested?

Manually tested. Test cases will be added soon.

@AmplabJenkins

Can one of the admins verify this patch?

@iRakson
Contributor Author

iRakson commented Nov 14, 2019

cc @cloud-fan @maropu

@maropu
Member

maropu commented Nov 15, 2019

Please remove WIP from the title before requesting reviews. Also, can you wait until #26472 is merged? That PR adds a base class for this kind of Pg-specific cast.

@dongjoon-hyun
Member

Thank you for your contribution, @iRakson. As you know, unfortunately, we decided to remove the PostgreSQL dialect via SPARK-30125 (#26763). Sorry about that. I'll close this PR, too.

@cloud-fan
Contributor

We can still have this feature, but not under the pgsql dialect. The ANSI SQL standard also requires us to throw an exception when casting invalid strings to numbers.

@iRakson can you open new PRs to bring back your cast-behavior changes under the spark.sql.ansi.enabled flag?
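For illustration, the behavior split the flag would control could look like the following sketch (a hypothetical example of the proposed semantics, not the merged implementation; exact error classes may differ):

```sql
-- Default (legacy) Spark behavior: an unparsable string casts to NULL.
SET spark.sql.ansi.enabled = false;
SELECT CAST('abc' AS DECIMAL(10, 2));  -- returns NULL

-- ANSI mode: the same cast raises a runtime error, matching the SQL standard
-- and PostgreSQL's "invalid input syntax for type numeric".
SET spark.sql.ansi.enabled = true;
SELECT CAST('abc' AS DECIMAL(10, 2));  -- throws an invalid-input error
```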

@iRakson
Contributor Author

iRakson commented Dec 16, 2019

Yeah, sure. I will.
