Commit 94c67a7
[SPARK-23207][SQL] Shuffle+Repartition on a DataFrame could lead to incorrect answers
## What changes were proposed in this pull request?
Currently shuffle repartition uses RoundRobinPartitioning, the generated result is nondeterministic since the sequence of input rows are not determined.
The bug can be triggered when there is a repartition call following a shuffle (which would lead to non-deterministic row ordering), as the pattern shows below:
upstream stage -> repartition stage -> result stage
(-> indicate a shuffle)
When one of the executors process goes down, some tasks on the repartition stage will be retried and generate inconsistent ordering, and some tasks of the result stage will be retried generating different data.
The following code returns 931532, instead of 1000000:
```
import scala.sys.process._
import org.apache.spark.TaskContext
val res = spark.range(0, 1000 * 1000, 1).repartition(200).map { x =>
x
}.repartition(200).map { x =>
if (TaskContext.get.attemptNumber == 0 && TaskContext.get.partitionId < 2) {
throw new Exception("pkill -f java".!!)
}
x
}
res.distinct().count()
```
In this PR, we propose a most straight-forward way to fix this problem by performing a local sort before partitioning, after we make the input row ordering deterministic, the function from rows to partitions is fully deterministic too.
The downside of the approach is that with extra local sort inserted, the performance of repartition() will go down, so we add a new config named `spark.sql.execution.sortBeforeRepartition` to control whether this patch is applied. The patch is default enabled to be safe-by-default, but user may choose to manually turn it off to avoid performance regression.
This patch also changes the output rows ordering of repartition(), that leads to a bunch of test cases failure because they are comparing the results directly.
## How was this patch tested?
Add unit test in ExchangeSuite.
With this patch(and `spark.sql.execution.sortBeforeRepartition` set to true), the following query returns 1000000:
```
import scala.sys.process._
import org.apache.spark.TaskContext
spark.conf.set("spark.sql.execution.sortBeforeRepartition", "true")
val res = spark.range(0, 1000 * 1000, 1).repartition(200).map { x =>
x
}.repartition(200).map { x =>
if (TaskContext.get.attemptNumber == 0 && TaskContext.get.partitionId < 2) {
throw new Exception("pkill -f java".!!)
}
x
}
res.distinct().count()
res7: Long = 1000000
```
Author: Xingbo Jiang <[email protected]>
Closes #20393 from jiangxb1987/shuffle-repartition.1 parent a8a3e9b commit 94c67a7
File tree
17 files changed
+233
-29
lines changed- core/src
- main
- java/org/apache/spark/util/collection/unsafe/sort
- scala/org/apache/spark/rdd
- test/java/org/apache/spark/util/collection/unsafe/sort
- mllib/src/test/scala/org/apache/spark/ml/feature
- sql
- catalyst/src/main
- java/org/apache/spark/sql/execution
- scala/org/apache/spark/sql/internal
- core/src
- main
- java/org/apache/spark/sql/execution
- scala/org/apache/spark/sql/execution
- exchange
- test/scala/org/apache/spark/sql/execution
- datasources
- parquet
- text
- streaming
17 files changed
+233
-29
lines changedLines changed: 3 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
32 | 32 | | |
33 | 33 | | |
34 | 34 | | |
| 35 | + | |
35 | 36 | | |
36 | | - | |
| 37 | + | |
| 38 | + | |
37 | 39 | | |
Lines changed: 4 additions & 3 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
62 | 62 | | |
63 | 63 | | |
64 | 64 | | |
65 | | - | |
66 | 65 | | |
| 66 | + | |
67 | 67 | | |
68 | | - | |
69 | 68 | | |
70 | | - | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
71 | 72 | | |
72 | 73 | | |
73 | 74 | | |
| |||
Lines changed: 2 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
35 | 35 | | |
36 | 36 | | |
37 | 37 | | |
38 | | - | |
39 | | - | |
| 38 | + | |
| 39 | + | |
40 | 40 | | |
41 | 41 | | |
42 | 42 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
414 | 414 | | |
415 | 415 | | |
416 | 416 | | |
| 417 | + | |
| 418 | + | |
417 | 419 | | |
418 | 420 | | |
419 | 421 | | |
| |||
Lines changed: 3 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
72 | 72 | | |
73 | 73 | | |
74 | 74 | | |
| 75 | + | |
75 | 76 | | |
76 | | - | |
| 77 | + | |
| 78 | + | |
77 | 79 | | |
78 | 80 | | |
79 | 81 | | |
| |||
Lines changed: 6 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
98 | 98 | | |
99 | 99 | | |
100 | 100 | | |
| 101 | + | |
101 | 102 | | |
102 | | - | |
| 103 | + | |
| 104 | + | |
103 | 105 | | |
104 | 106 | | |
105 | 107 | | |
| |||
164 | 166 | | |
165 | 167 | | |
166 | 168 | | |
| 169 | + | |
167 | 170 | | |
168 | | - | |
| 171 | + | |
| 172 | + | |
169 | 173 | | |
170 | 174 | | |
171 | 175 | | |
| |||
Lines changed: 2 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
222 | 222 | | |
223 | 223 | | |
224 | 224 | | |
225 | | - | |
| 225 | + | |
| 226 | + | |
226 | 227 | | |
227 | 228 | | |
228 | 229 | | |
| |||
Lines changed: 70 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
Lines changed: 38 additions & 6 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
18 | 18 | | |
19 | 19 | | |
20 | 20 | | |
| 21 | + | |
21 | 22 | | |
| 23 | + | |
22 | 24 | | |
23 | 25 | | |
24 | 26 | | |
| |||
56 | 58 | | |
57 | 59 | | |
58 | 60 | | |
59 | | - | |
| 61 | + | |
60 | 62 | | |
61 | 63 | | |
62 | | - | |
| 64 | + | |
63 | 65 | | |
64 | 66 | | |
65 | 67 | | |
66 | 68 | | |
67 | 69 | | |
68 | 70 | | |
69 | | - | |
| 71 | + | |
70 | 72 | | |
71 | 73 | | |
72 | | - | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
| 80 | + | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
73 | 86 | | |
74 | 87 | | |
75 | 88 | | |
76 | 89 | | |
77 | 90 | | |
78 | 91 | | |
| 92 | + | |
| 93 | + | |
| 94 | + | |
| 95 | + | |
| 96 | + | |
| 97 | + | |
| 98 | + | |
| 99 | + | |
| 100 | + | |
| 101 | + | |
| 102 | + | |
| 103 | + | |
| 104 | + | |
79 | 105 | | |
80 | 106 | | |
81 | 107 | | |
| |||
85 | 111 | | |
86 | 112 | | |
87 | 113 | | |
88 | | - | |
| 114 | + | |
89 | 115 | | |
90 | 116 | | |
91 | 117 | | |
| |||
206 | 232 | | |
207 | 233 | | |
208 | 234 | | |
209 | | - | |
| 235 | + | |
| 236 | + | |
| 237 | + | |
| 238 | + | |
| 239 | + | |
| 240 | + | |
| 241 | + | |
210 | 242 | | |
211 | 243 | | |
212 | 244 | | |
| |||
Lines changed: 14 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1145 | 1145 | | |
1146 | 1146 | | |
1147 | 1147 | | |
| 1148 | + | |
| 1149 | + | |
| 1150 | + | |
| 1151 | + | |
| 1152 | + | |
| 1153 | + | |
| 1154 | + | |
| 1155 | + | |
| 1156 | + | |
| 1157 | + | |
| 1158 | + | |
| 1159 | + | |
1148 | 1160 | | |
1149 | 1161 | | |
1150 | 1162 | | |
| |||
1300 | 1312 | | |
1301 | 1313 | | |
1302 | 1314 | | |
| 1315 | + | |
| 1316 | + | |
1303 | 1317 | | |
1304 | 1318 | | |
1305 | 1319 | | |
| |||
0 commit comments