You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
### What changes were proposed in this pull request?
1, avoid `Iterator.grouped(size: Int)`, which need to maintain an arraybuffer of `size`
2, keep the number of partitions in curve computation
### Why are the changes needed?
1, `BinaryClassificationMetrics` tend to fail (OOM) when `grouping=count/numBins` is too large, due to `Iterator.grouped(size: Int)` need to maintain an arraybuffer with `size` entries, however, in `BinaryClassificationMetrics` we do not need to maintain such a big array;
2, make sizes of partitions more even;
This PR computes metrics more stable and a littler faster;
### Does this PR introduce any user-facing change?
No
### How was this patch tested?
existing testsuites
Closesapache#27682 from zhengruifeng/grouped_opt.
Authored-by: zhengruifeng <[email protected]>
Signed-off-by: zhengruifeng <[email protected]>
0 commit comments