Commit 19ea6ff
[SPARK-53130][SQL][PYTHON] Fix `toJson` behavior of collated string types
### What changes were proposed in this pull request?
This PR changes collated string types so that their `toJson` methods return the collation. To keep backwards compatibility with older engine versions that read tables with collations, the fix is propagated upstream into `StructField`, where the collation is removed from the type but still kept in the metadata.
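
As a rough illustration of the split described above, here is a toy model in plain Python. It does not use Spark's actual classes. The class names, the `"string collate <NAME>"` format, and the `__COLLATIONS` metadata key are all assumptions made for this sketch. A standalone string type carries its collation in its JSON form, while a struct field writes the type as plain `"string"` and records the collation in metadata.

```python
from dataclasses import dataclass, field
from typing import Any, Dict

# Hypothetical names for this sketch only; not taken from the patch.
COLLATIONS_KEY = "__COLLATIONS"
DEFAULT_COLLATION = "UTF8_BINARY"


@dataclass(frozen=True)
class ToyStringType:
    collation: str = DEFAULT_COLLATION

    def json_value(self) -> str:
        # Standalone form: a non-default collation is part of the type string.
        if self.collation == DEFAULT_COLLATION:
            return "string"
        return f"string collate {self.collation}"


@dataclass
class ToyStructField:
    name: str
    data_type: ToyStringType
    nullable: bool = True
    metadata: Dict[str, Any] = field(default_factory=dict)

    def json_value(self) -> Dict[str, Any]:
        # Field form: the type stays plain "string" so that older readers
        # still understand it; the collation is kept in the metadata instead.
        metadata = dict(self.metadata)
        if self.data_type.collation != DEFAULT_COLLATION:
            metadata[COLLATIONS_KEY] = {self.name: self.data_type.collation}
        return {
            "name": self.name,
            "type": "string",
            "nullable": self.nullable,
            "metadata": metadata,
        }
```

In this model, `ToyStringType("UTF8_LCASE").json_value()` includes the collation, whereas `ToyStructField("c", ToyStringType("UTF8_LCASE")).json_value()` reports `"string"` as the type and lists the collation under the metadata key.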
### Why are the changes needed?
The old way of handling `toJson` meant that collated string types could not be serialized and deserialized correctly unless they were part of a `StructField`. Initially we thought this was not a big deal, but we later ran into issues with it, especially in PySpark, which relies primarily on JSON to parse types back and forth.
This could avoid hacky changes in the future, like the one in #51688, without changing how tables and schemas behave.
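
The round-trip problem can be sketched with a small self-contained model. This is not the Spark implementation: the function names and the `"string collate <NAME>"` format are assumptions, and the "old" serializer simply models a standalone type being written as plain `"string"`, as the description above implies.

```python
import json

# Hypothetical constants for this sketch only.
DEFAULT_COLLATION = "UTF8_BINARY"
PREFIX = "string collate "


def old_to_json(collation: str) -> str:
    # Modeled old behavior: the collation is dropped outside a struct field.
    return json.dumps("string")


def new_to_json(collation: str) -> str:
    # Modeled new behavior: a non-default collation is kept in the type string.
    if collation == DEFAULT_COLLATION:
        return json.dumps("string")
    return json.dumps(PREFIX + collation)


def from_json(text: str) -> str:
    """Return the collation name encoded in a serialized string type."""
    value = json.loads(text)
    if value == "string":
        return DEFAULT_COLLATION
    if isinstance(value, str) and value.startswith(PREFIX):
        return value[len(PREFIX):]
    raise ValueError(f"not a string type: {value!r}")
```

With these helpers, `from_json(old_to_json("UTF8_LCASE"))` falls back to the default collation, while `from_json(new_to_json("UTF8_LCASE"))` recovers the original one. This is the kind of lossless round trip that JSON-based type parsing depends on.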
### Does this PR introduce _any_ user-facing change?
Technically yes, but it is a small change that should not impact any queries; it only affects how `StringType` is represented when it is not inside a `StructField` object.
### How was this patch tested?
New and existing unit tests.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #51850 from stefankandic/fixStringJson.
Authored-by: Stefan Kandic <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
1 parent f6cd385 commit 19ea6ff
File tree
6 files changed, +84 -13 lines changed
- python/pyspark/sql
  - tests
- sql
  - api/src/main/scala/org/apache/spark/sql/types
  - catalyst/src/test/scala/org/apache/spark/sql/types