Commit 5446f23
[SPARK-25753][CORE] fix reading small files via BinaryFileRDD
## What changes were proposed in this pull request?
This is a follow up of apache#21601, `StreamFileInputFormat` and `WholeTextFileInputFormat` have the same problem.
`Minimum split size pernode 5123456 cannot be larger than maximum split size 4194304
java.io.IOException: Minimum split size pernode 5123456 cannot be larger than maximum split size 4194304
at org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat.getSplits(CombineFileInputFormat.java: 201)
at org.apache.spark.rdd.BinaryFileRDD.getPartitions(BinaryFileRDD.scala:52)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:254)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:252)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2138)`
## How was this patch tested?
Added a unit test
Closes apache#22725 from 10110346/maxSplitSize_node_rack.
Authored-by: liuxian <[email protected]>
Signed-off-by: Thomas Graves <[email protected]>1 parent 5dafdb0 commit 5446f23
File tree
2 files changed
+25
-0
lines changed- core/src
- main/scala/org/apache/spark/input
- test/scala/org/apache/spark
2 files changed
+25
-0
lines changedLines changed: 12 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
52 | 52 | | |
53 | 53 | | |
54 | 54 | | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
55 | 67 | | |
56 | 68 | | |
57 | 69 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
320 | 320 | | |
321 | 321 | | |
322 | 322 | | |
| 323 | + | |
| 324 | + | |
| 325 | + | |
| 326 | + | |
| 327 | + | |
| 328 | + | |
| 329 | + | |
| 330 | + | |
| 331 | + | |
| 332 | + | |
| 333 | + | |
| 334 | + | |
| 335 | + | |
323 | 336 | | |
324 | 337 | | |
325 | 338 | | |
| |||
0 commit comments