Commit 9d66c42
[SPARK-12057][SQL] Prevent failure on corrupt JSON records
This PR makes JSON parser and schema inference handle more cases where we have unparsed records. It is based on #10043. The last commit fixes the failed test and updates the logic of schema inference.
Regarding the schema inference change, if we have something like
```
{"f1":1}
[1,2,3]
```
originally, we will get a DF without any column.
After this change, we will get a DF with columns `f1` and `_corrupt_record`. Basically, for the second row, `[1,2,3]` will be the value of `_corrupt_record`.
When merge this PR, please make sure that the author is simplyianm.
JIRA: https://issues.apache.org/jira/browse/SPARK-12057
Closes #10043
Author: Ian Macalinao <[email protected]>
Author: Yin Huai <[email protected]>
Closes #10288 from yhuai/handleCorruptJson.1 parent 437583f commit 9d66c42
File tree
4 files changed
+90
-12
lines changed- sql/core/src
- main/scala/org/apache/spark/sql/execution/datasources/json
- test/scala/org/apache/spark/sql/execution/datasources/json
4 files changed
+90
-12
lines changedLines changed: 33 additions & 4 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
61 | 61 | | |
62 | 62 | | |
63 | 63 | | |
64 | | - | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
65 | 68 | | |
66 | 69 | | |
67 | 70 | | |
| |||
170 | 173 | | |
171 | 174 | | |
172 | 175 | | |
| 176 | + | |
| 177 | + | |
| 178 | + | |
| 179 | + | |
| 180 | + | |
| 181 | + | |
| 182 | + | |
| 183 | + | |
| 184 | + | |
| 185 | + | |
| 186 | + | |
| 187 | + | |
| 188 | + | |
173 | 189 | | |
174 | 190 | | |
175 | 191 | | |
176 | | - | |
177 | | - | |
178 | | - | |
| 192 | + | |
| 193 | + | |
| 194 | + | |
| 195 | + | |
| 196 | + | |
| 197 | + | |
| 198 | + | |
| 199 | + | |
| 200 | + | |
| 201 | + | |
| 202 | + | |
| 203 | + | |
| 204 | + | |
| 205 | + | |
| 206 | + | |
| 207 | + | |
179 | 208 | | |
180 | 209 | | |
181 | 210 | | |
| |||
Lines changed: 12 additions & 7 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
31 | 31 | | |
32 | 32 | | |
33 | 33 | | |
| 34 | + | |
| 35 | + | |
34 | 36 | | |
35 | 37 | | |
36 | 38 | | |
| |||
110 | 112 | | |
111 | 113 | | |
112 | 114 | | |
113 | | - | |
| 115 | + | |
114 | 116 | | |
115 | 117 | | |
116 | 118 | | |
| |||
127 | 129 | | |
128 | 130 | | |
129 | 131 | | |
130 | | - | |
| 132 | + | |
131 | 133 | | |
132 | 134 | | |
133 | 135 | | |
| |||
174 | 176 | | |
175 | 177 | | |
176 | 178 | | |
177 | | - | |
| 179 | + | |
| 180 | + | |
| 181 | + | |
| 182 | + | |
| 183 | + | |
178 | 184 | | |
179 | 185 | | |
180 | 186 | | |
| |||
267 | 273 | | |
268 | 274 | | |
269 | 275 | | |
270 | | - | |
271 | | - | |
272 | | - | |
273 | | - | |
| 276 | + | |
274 | 277 | | |
275 | 278 | | |
276 | 279 | | |
277 | 280 | | |
278 | 281 | | |
| 282 | + | |
| 283 | + | |
279 | 284 | | |
280 | 285 | | |
281 | 286 | | |
| |||
Lines changed: 37 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1427 | 1427 | | |
1428 | 1428 | | |
1429 | 1429 | | |
| 1430 | + | |
| 1431 | + | |
| 1432 | + | |
| 1433 | + | |
| 1434 | + | |
| 1435 | + | |
| 1436 | + | |
| 1437 | + | |
| 1438 | + | |
| 1439 | + | |
| 1440 | + | |
| 1441 | + | |
| 1442 | + | |
| 1443 | + | |
| 1444 | + | |
| 1445 | + | |
| 1446 | + | |
| 1447 | + | |
| 1448 | + | |
| 1449 | + | |
| 1450 | + | |
| 1451 | + | |
| 1452 | + | |
| 1453 | + | |
| 1454 | + | |
| 1455 | + | |
| 1456 | + | |
| 1457 | + | |
| 1458 | + | |
| 1459 | + | |
| 1460 | + | |
| 1461 | + | |
| 1462 | + | |
| 1463 | + | |
| 1464 | + | |
| 1465 | + | |
| 1466 | + | |
1430 | 1467 | | |
Lines changed: 8 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
188 | 188 | | |
189 | 189 | | |
190 | 190 | | |
| 191 | + | |
| 192 | + | |
| 193 | + | |
| 194 | + | |
| 195 | + | |
| 196 | + | |
| 197 | + | |
| 198 | + | |
191 | 199 | | |
192 | 200 | | |
193 | 201 | | |
| |||
197 | 205 | | |
198 | 206 | | |
199 | 207 | | |
200 | | - | |
201 | 208 | | |
202 | 209 | | |
203 | 210 | | |
| |||
0 commit comments