Commit d56c262
[SPARK-21681][ML] fix bug of MLOR do not work correctly when featureStd contains zero
## What changes were proposed in this pull request?
Fix a bug where MLOR (multinomial logistic regression) does not work correctly when featureStd contains zero.
The bug can be reproduced with a dataset like the one below, in which one feature has zero variance. Training on it produces a wrong result (all coefficients become 0).
```
// Note: `seed`, `sc`, `lit`, and `generateMultinomialLogisticInput` are not defined
// in this snippet; they are assumed to come from the enclosing test suite.
val multinomialDatasetWithZeroVar = {
  val nPoints = 100
  val coefficients = Array(
    -0.57997, 0.912083, -0.371077,
    -0.16624, -0.84355, -0.048509)
  val xMean = Array(5.843, 3.0)
  val xVariance = Array(0.6856, 0.0) // including zero variance
  val testData = generateMultinomialLogisticInput(
    coefficients, xMean, xVariance, addIntercept = true, nPoints, seed)
  val df = sc.parallelize(testData, 4).toDF().withColumn("weight", lit(1.0))
  df.cache()
  df
}
```
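To illustrate why a zero entry in featureStd is dangerous, here is a minimal sketch, not the patch's actual code. It assumes that per-class margins are accumulated from features scaled by `1 / std`, and that coefficients are laid out as `coefficients(featureIndex * numClasses + classIndex)`. The object `ZeroStdMarginSketch`, the method `margins`, and the flag `guardZeroStd` are hypothetical names introduced only for this example. Without a guard, dividing by a zero standard deviation yields `Infinity` or `NaN`, which then contaminates every margin it touches. With a guard, a constant feature simply contributes nothing.

```scala
object ZeroStdMarginSketch {
  // Hypothetical helper (not Spark's API): accumulates per-class margins for a
  // single instance, scaling each feature value by 1 / std.
  // Assumed layout: coefficients(featureIndex * numClasses + classIndex).
  def margins(
      features: Array[Double],
      featuresStd: Array[Double],
      coefficients: Array[Double],
      numClasses: Int,
      guardZeroStd: Boolean): Array[Double] = {
    val result = Array.fill(numClasses)(0.0)
    var i = 0
    while (i < features.length) {
      val value = features(i)
      val std = featuresStd(i)
      // With the guard enabled, features with std == 0 (or value == 0) are skipped.
      if (!guardZeroStd || (std != 0.0 && value != 0.0)) {
        // When std == 0.0 and value != 0.0 this division produces Infinity.
        val stdValue = value / std
        var j = 0
        while (j < numClasses) {
          result(j) += coefficients(i * numClasses + j) * stdValue
          j += 1
        }
      }
      i += 1
    }
    result
  }
}
```

Calling `margins` with `featuresStd = Array(0.5, 0.0)` and `guardZeroStd = false` gives non-finite margins, while `guardZeroStd = true` returns margins computed from the first feature only.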
## How was this patch tested?
A test case was added.
Author: WeichenXu <[email protected]>
Closes #18896 from WeichenXu123/fix_mlor_stdvalue_zero_bug.
1 parent 01a8e46 commit d56c262
File tree
3 files changed: +118 -9 lines changed
- mllib/src
  - main/scala/org/apache/spark/ml/optim/aggregator
  - test/scala/org/apache/spark/ml
    - classification
    - optim/aggregator

Lines changed: 7 additions & 5 deletions
Lines changed: 78 additions & 0 deletions
Lines changed: 33 additions & 4 deletions