Commit 54d9575
[MLLIB] SPARK-4846: throw a RuntimeException and give users hints to increase the minCount
When vocabSize*vectorSize is larger than Int.MaxValue/8, we throw a RuntimeException, because in that case allocating the memory needed to serialize the arrays syn0Global and syn1Global would definitely cause an OOM. syn0Global and syn1Global are float arrays, and serializing each of them should need a byte array more than 8 times the size of syn0Global.
Also, if we catch an OOM even though vocabSize*vectorSize is less than Int.MaxValue/8, we should give users a hint to increase minCount or decrease vectorSize.
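The two behaviours described above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the object `Word2VecSizeGuard`, the helpers `checkSize` and `withOomHint`, and the message wording are all hypothetical names chosen for the example. Only the `Int.MaxValue/8` threshold and the advice to adjust minCount or vectorSize come from the commit message.

```scala
object Word2VecSizeGuard {
  // Threshold quoted in the commit message: Int.MaxValue / 8 elements.
  val MaxElements: Long = Int.MaxValue / 8

  // Hypothetical hint text; the wording in the actual patch may differ.
  private def hint(vocabSize: Int, vectorSize: Int): String =
    "Please increase minCount or decrease vectorSize in Word2Vec to avoid an OOM. " +
      s"vocabSize * vectorSize is currently $vocabSize * $vectorSize."

  // Fail fast when vocabSize * vectorSize is larger than Int.MaxValue / 8.
  def checkSize(vocabSize: Int, vectorSize: Int): Unit = {
    // Widen to Long before multiplying so the product cannot overflow Int.
    val elements = vocabSize.toLong * vectorSize
    if (elements > MaxElements) {
      throw new RuntimeException(hint(vocabSize, vectorSize))
    }
  }

  // Run `body`; if it still runs out of memory below the threshold,
  // rethrow with the same hint and keep the original error as the cause.
  def withOomHint[T](vocabSize: Int, vectorSize: Int)(body: => T): T =
    try body
    catch {
      case e: OutOfMemoryError =>
        throw new RuntimeException(hint(vocabSize, vectorSize), e)
    }
}
```

Under these assumptions, `Word2VecSizeGuard.checkSize(3000000, 100)` throws because 3 * 10^8 elements exceed the threshold, while `checkSize(100000, 100)` returns normally. Widening to `Long` matters here: multiplying two large `Int` values directly could overflow and slip past the comparison.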
Author: Joseph J.C. Tang <[email protected]>
Closes #4247 from jinntrance/w2v-fix and squashes the following commits:
b5eb71f [Joseph J.C. Tang] throw a RuntimeException and give users hints regarding the vectorSize&minCount
1 parent 254eaa4
1 file changed: 7 additions, 0 deletions
mllib/src/main/scala/org/apache/spark/mllib/feature
Diff: lines 293 to 299 added.