-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-5821] [SQL] JSON CTAS command should throw error message when delete path failure #4610
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Test build #27506 has started for PR 4610 at commit
|
|
Test build #27506 has finished for PR 4610 at commit
|
|
Test FAILed. |
|
Test build #27507 has started for PR 4610 at commit
|
|
Test build #27507 has finished for PR 4610 at commit
|
|
Test FAILed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just save the data?
|
@yanbohappy It will be great if we use this PR to only do the task of throwing a nice error message when the |
|
Test build #27509 has started for PR 4610 at commit
|
|
Test build #27509 has finished for PR 4610 at commit
|
|
Test FAILed. |
|
@yhuai |
3cdfb7a to
5a42d83
Compare
|
Test build #27551 has started for PR 4610 at commit
|
|
Test build #27551 has finished for PR 4610 at commit
|
|
Test PASSed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yanbohappy Seems we just throw another error message at here. Based on your JIRA description, I think you need to check if delete returns true or false when data already exists.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yhuai Yes, it is expected. If there is no exception thrown, it can ensure delete success. I have investigated other codes in spark and they do the same thing for delete.
|
@yhuai what is the status here? |
|
Sorry for not following it. I will investigate it and reproduce the error reported in jira. I should have more ideas on how we want to have the change early next week. |
|
Test build #28643 has started for PR 4610 at commit
|
|
Test PASSed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we say "prior to writing to JSON file" since a user can reach this code path through DataFrame.save, which is not related to table operation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, get rid of :\n and add a . at the end of the message.
|
Thank you for the update! It's very close. I think we want to do the same thing for cc @liancheng |
|
Test build #28860 has started for PR 4610 at commit
|
|
Test build #28860 has finished for PR 4610 at commit
|
|
Test PASSed. |
|
@yhuai Thanks for cc-ing me this. Although @yanboliang Thanks for pointing out the bug and working on this! |
|
@yhuai Thank you for your comments on this PR |
|
@yanboliang Sure, please go ahead. I'm trying to verify this issue locally. |
|
@yanboliang Adding a check for return value at this line should be enough. |
|
@liancheng Please review my fix for ParquetRelation2 in #5107 |
…ccessful Do the same check as #4610 for ParquetRelation2. Author: Yanbo Liang <[email protected]> Closes #5107 from yanboliang/spark-5821-parquet and squashes the following commits: 7092c8d [Yanbo Liang] ParquetRelation2 CTAS should check if delete is successful
…ccessful Do the same check as #4610 for ParquetRelation2. Author: Yanbo Liang <[email protected]> Closes #5107 from yanboliang/spark-5821-parquet and squashes the following commits: 7092c8d [Yanbo Liang] ParquetRelation2 CTAS should check if delete is successful (cherry picked from commit bc37c97) Signed-off-by: Cheng Lian <[email protected]>
…delete path failure When using "CREATE TEMPORARY TABLE AS SELECT" to create JSON table, we first delete the path file or directory and then generate a new directory with the same name. But if only read permission was granted, the delete failed. Here we just throwing an error message to let users know what happened. ParquetRelation2 may also hit this problem. I think to restrict JSONRelation and ParquetRelation2 must base on directory is more reasonable for access control. Maybe I can do it in follow up works. Author: Yanbo Liang <[email protected]> Author: Yanbo Liang <[email protected]> Closes #4610 from yanboliang/jsonInsertImprovements and squashes the following commits: c387fce [Yanbo Liang] fix typos 42d7fb6 [Yanbo Liang] add unittest & fix output format 46f0d9d [Yanbo Liang] Update JSONRelation.scala e2df8d5 [Yanbo Liang] check path exisit when write 79f7040 [Yanbo Liang] Update JSONRelation.scala e4bc229 [Yanbo Liang] Update JSONRelation.scala 5a42d83 [Yanbo Liang] JSONRelation CTAS should check if delete is successful (cherry picked from commit e5d2c37) Signed-off-by: Cheng Lian <[email protected]>
|
Thanks, merged into master and branch-1.3. |
|
PS: #5107 has also been merged. |
### What changes were proposed in this pull request? This PR amis to upgrade `fasterxml.jackson` from 2.17.1 to 2.17.2. ### Why are the changes needed? There are some bug fixes about [Databind](https://github.com/FasterXML/jackson-databind): [#4561](FasterXML/jackson-databind#4561): Issues using jackson-databind 2.17.1 with Reactor (wrt DeserializerCache and ReentrantLock) [#4575](FasterXML/jackson-databind#4575): StdDelegatingSerializer does not consider a Converter that may return null for a non-null input [#4577](FasterXML/jackson-databind#4577): Cannot deserialize value of type java.math.BigDecimal from String "3." (not a valid representation) [#4595](FasterXML/jackson-databind#4595): No way to explicitly disable wrapping in custom annotation processor [#4607](FasterXML/jackson-databind#4607): MismatchedInput: No Object Id found for an instance of X to assign to property 'id' [#4610](FasterXML/jackson-databind#4610): DeserializationFeature.FAIL_ON_UNRESOLVED_OBJECT_IDS does not work when used with Polymorphic type handling The full release note of 2.17.2: https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.17.2 ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass GA. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #47241 from wayneguow/upgrade_jackson. Authored-by: Wei Guo <[email protected]> Signed-off-by: yangjie01 <[email protected]>
### What changes were proposed in this pull request? This PR amis to upgrade `fasterxml.jackson` from 2.17.1 to 2.17.2. ### Why are the changes needed? There are some bug fixes about [Databind](https://github.com/FasterXML/jackson-databind): [apache#4561](FasterXML/jackson-databind#4561): Issues using jackson-databind 2.17.1 with Reactor (wrt DeserializerCache and ReentrantLock) [apache#4575](FasterXML/jackson-databind#4575): StdDelegatingSerializer does not consider a Converter that may return null for a non-null input [apache#4577](FasterXML/jackson-databind#4577): Cannot deserialize value of type java.math.BigDecimal from String "3." (not a valid representation) [apache#4595](FasterXML/jackson-databind#4595): No way to explicitly disable wrapping in custom annotation processor [apache#4607](FasterXML/jackson-databind#4607): MismatchedInput: No Object Id found for an instance of X to assign to property 'id' [apache#4610](FasterXML/jackson-databind#4610): DeserializationFeature.FAIL_ON_UNRESOLVED_OBJECT_IDS does not work when used with Polymorphic type handling The full release note of 2.17.2: https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.17.2 ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass GA. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#47241 from wayneguow/upgrade_jackson. Authored-by: Wei Guo <[email protected]> Signed-off-by: yangjie01 <[email protected]>
### What changes were proposed in this pull request? This PR amis to upgrade `fasterxml.jackson` from 2.17.1 to 2.17.2. ### Why are the changes needed? There are some bug fixes about [Databind](https://github.com/FasterXML/jackson-databind): [apache#4561](FasterXML/jackson-databind#4561): Issues using jackson-databind 2.17.1 with Reactor (wrt DeserializerCache and ReentrantLock) [apache#4575](FasterXML/jackson-databind#4575): StdDelegatingSerializer does not consider a Converter that may return null for a non-null input [apache#4577](FasterXML/jackson-databind#4577): Cannot deserialize value of type java.math.BigDecimal from String "3." (not a valid representation) [apache#4595](FasterXML/jackson-databind#4595): No way to explicitly disable wrapping in custom annotation processor [apache#4607](FasterXML/jackson-databind#4607): MismatchedInput: No Object Id found for an instance of X to assign to property 'id' [apache#4610](FasterXML/jackson-databind#4610): DeserializationFeature.FAIL_ON_UNRESOLVED_OBJECT_IDS does not work when used with Polymorphic type handling The full release note of 2.17.2: https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.17.2 ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass GA. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#47241 from wayneguow/upgrade_jackson. Authored-by: Wei Guo <[email protected]> Signed-off-by: yangjie01 <[email protected]>
### What changes were proposed in this pull request? This PR amis to upgrade `fasterxml.jackson` from 2.17.1 to 2.17.2. ### Why are the changes needed? There are some bug fixes about [Databind](https://github.com/FasterXML/jackson-databind): [apache#4561](FasterXML/jackson-databind#4561): Issues using jackson-databind 2.17.1 with Reactor (wrt DeserializerCache and ReentrantLock) [apache#4575](FasterXML/jackson-databind#4575): StdDelegatingSerializer does not consider a Converter that may return null for a non-null input [apache#4577](FasterXML/jackson-databind#4577): Cannot deserialize value of type java.math.BigDecimal from String "3." (not a valid representation) [apache#4595](FasterXML/jackson-databind#4595): No way to explicitly disable wrapping in custom annotation processor [apache#4607](FasterXML/jackson-databind#4607): MismatchedInput: No Object Id found for an instance of X to assign to property 'id' [apache#4610](FasterXML/jackson-databind#4610): DeserializationFeature.FAIL_ON_UNRESOLVED_OBJECT_IDS does not work when used with Polymorphic type handling The full release note of 2.17.2: https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.17.2 ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass GA. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#47241 from wayneguow/upgrade_jackson. Authored-by: Wei Guo <[email protected]> Signed-off-by: yangjie01 <[email protected]>
) * [SPARK-45967][BUILD] Upgrade jackson to 2.16.0 ### What changes were proposed in this pull request? The pr aims to upgrade FasterXML jackson from 2.15.2 to 2.16.0. ### Why are the changes needed? New version that fix some bugs, release notes as follows: - 2.1.6.0 https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.16, eg: [Databind](https://github.com/FasterXML/jackson-databind) [apache#1770](FasterXML/jackson-databind#1770): Incorrect deserialization for BigDecimal numbers - 2.15.3 https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.15.3, eg: [Databind](https://github.com/FasterXML/jackson-databind) [apache#3968](FasterXML/jackson-databind#3968): Records with additional constructors failed to deserialize The last upgrade occurred 6 months ago, apache#41414 ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass GA. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#43859 from panbingkun/SPARK-45967. Authored-by: panbingkun <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]> (cherry picked from commit c17b092) * [SPARK-48826][BUILD] Upgrade `fasterxml.jackson` to 2.17.2 ### What changes were proposed in this pull request? This PR amis to upgrade `fasterxml.jackson` from 2.17.1 to 2.17.2. ### Why are the changes needed? There are some bug fixes about [Databind](https://github.com/FasterXML/jackson-databind): [apache#4561](FasterXML/jackson-databind#4561): Issues using jackson-databind 2.17.1 with Reactor (wrt DeserializerCache and ReentrantLock) [apache#4575](FasterXML/jackson-databind#4575): StdDelegatingSerializer does not consider a Converter that may return null for a non-null input [apache#4577](FasterXML/jackson-databind#4577): Cannot deserialize value of type java.math.BigDecimal from String "3." (not a valid representation) [apache#4595](FasterXML/jackson-databind#4595): No way to explicitly disable wrapping in custom annotation processor [apache#4607](FasterXML/jackson-databind#4607): MismatchedInput: No Object Id found for an instance of X to assign to property 'id' [apache#4610](FasterXML/jackson-databind#4610): DeserializationFeature.FAIL_ON_UNRESOLVED_OBJECT_IDS does not work when used with Polymorphic type handling The full release note of 2.17.2: https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.17.2 ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass GA. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#47241 from wayneguow/upgrade_jackson. Authored-by: Wei Guo <[email protected]> Signed-off-by: yangjie01 <[email protected]> (cherry picked from commit 65daff5) * fix --------- Co-authored-by: panbingkun <[email protected]> Co-authored-by: Wei Guo <[email protected]>
When using "CREATE TEMPORARY TABLE AS SELECT" to create JSON table, we first delete the path file or directory and then generate a new directory with the same name. But if only read permission was granted, the delete failed.
Here we just throwing an error message to let users know what happened.
ParquetRelation2 may also hit this problem. I think to restrict JSONRelation and ParquetRelation2 must base on directory is more reasonable for access control. Maybe I can do it in follow up works.