Commit 13bedc0
[SPARK-24329][SQL] Test for skipping multi-space lines
## What changes were proposed in this pull request?
This PR is a continuation of #21380. It adds a test for the cases handled by the following code:
https://github.com/apache/spark/blob/e3de6ab30d52890eb08578e55eb4a5d2b4e7aa35/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala#L303-L304
In short, the code skips lines that are empty or contain only whitespace (one or more spaces), as well as comment lines (see [filterCommentAndEmpty](https://github.com/apache/spark/blob/e3de6ab30d52890eb08578e55eb4a5d2b4e7aa35/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVUtils.scala#L47)):
```scala
iter.filter { line =>
  line.trim.nonEmpty && !line.startsWith(options.comment.toString)
}
```
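To make the behavior concrete, here is a minimal standalone sketch of the same predicate without any Spark dependency. The object name `FilterSketch`, the `comment` value `'#'`, and the sample lines are assumptions for illustration only; in Spark the comment character comes from the CSV options.

```scala
// Standalone sketch of the filter quoted above; not the Spark source itself.
object FilterSketch {
  // Assumption: '#' stands in for options.comment, which Spark reads from the CSV options.
  val comment: Char = '#'

  // Keeps a line only if it has non-whitespace content and does not start with the comment char.
  def filterCommentAndEmpty(iter: Iterator[String]): Iterator[String] =
    iter.filter { line =>
      line.trim.nonEmpty && !line.startsWith(comment.toString)
    }

  def main(args: Array[String]): Unit = {
    // Hypothetical input: data rows mixed with empty, whitespace-only, and comment lines.
    val lines = Iterator("a,b", "", " ", "     ", "# a comment", "1,2")
    filterCommentAndEmpty(lines).foreach(println)
    // prints:
    // a,b
    // 1,2
  }
}
```

Note that the comment check uses the untrimmed line, so in this sketch a line such as `"  # note"` passes the filter: it is not blank after trimming and does not start with the comment character.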
Closes #21380
## How was this patch tested?
Added a test for the case described above.
Author: Maxim Gekk <[email protected]>
Closes #21394 from MaxGekk/test-for-multi-space-lines.
Parent commit: 3469f5c
2 files changed (+23, -0) under sql/core/src/test:
- resources/test-data: 8 additions, 0 deletions (diff lines 1-8)
- scala/org/apache/spark/sql/execution/datasources/csv: 15 additions, 0 deletions (diff lines 1371-1385)