[SPARK-5186] [MLLIB] Vector.equals and Vector.hashCode are very inefficient
JIRA Issue: https://issues.apache.org/jira/browse/SPARK-5186
Currently, SparseVector uses the equals method inherited from Vector, which materializes a full-size array even for a sparse vector. This pull request adds a specialized equals that reduces both time and memory cost.
1. The implementation stays consistent with the original behavior. In particular, a SparseVector and a DenseVector that hold the same values still compare equal.
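One way to compare two vectors without building a full-size array is to walk their stored (index, value) pairs in lockstep and skip explicitly stored zeros. The sketch below illustrates that idea only; it is not the code merged in this pull request. The object `VectorEqualsSketch` and the method `activeEntriesEqual` are hypothetical names. It assumes indices are sorted in ascending order and that the caller has already checked that both vectors have the same size.

```scala
object VectorEqualsSketch {
  // Hypothetical helper, not part of Spark's API.
  // Each vector is given as parallel sequences of indices and values.
  // Assumptions: indices are sorted ascending, and the caller has already
  // verified that both vectors have the same logical size.
  def activeEntriesEqual(
      idx1: IndexedSeq[Int], vals1: Array[Double],
      idx2: IndexedSeq[Int], vals2: Array[Double]): Boolean = {
    var k1 = 0
    var k2 = 0
    var equalSoFar = true
    var done = false
    while (equalSoFar && !done) {
      // Skip explicitly stored zeros so they do not affect equality.
      while (k1 < vals1.length && vals1(k1) == 0.0) k1 += 1
      while (k2 < vals2.length && vals2(k2) == 0.0) k2 += 1
      val end1 = k1 >= vals1.length
      val end2 = k2 >= vals2.length
      if (end1 || end2) {
        // Equal only if both sides ran out of nonzero entries together.
        equalSoFar = end1 && end2
        done = true
      } else {
        // The next nonzero entries must match in both index and value.
        equalSoFar = idx1(k1) == idx2(k2) && vals1(k1) == vals2(k2)
        k1 += 1
        k2 += 1
      }
    }
    equalSoFar
  }
}
```

A dense vector fits the same helper by passing `0 until n` as its indices, so a sparse-versus-dense comparison needs no separate code path and allocates no temporary array.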
Author: Yuhao Yang <[email protected]>
Closes apache#3997 from hhbyyh/master and squashes the following commits:
0d9d130 [Yuhao Yang] function name change and ut update
93f0d46 [Yuhao Yang] unify sparse vs dense vectors
985e160 [Yuhao Yang] improve locality for equals
bdf8789 [Yuhao Yang] improve equals and rewrite hashCode for Vector
a6952c3 [Yuhao Yang] fix scala style for comments
50abef3 [Yuhao Yang] fix ut for sparse vector with explicit 0
f41b135 [Yuhao Yang] iterative equals for sparse vector
5741144 [Yuhao Yang] Specialized equals for SparseVector