
Commit 2b36344

srowen authored and mengxr committed
SPARK-1675. Make clear whether computePrincipalComponents requires centered data
Just closing out this small JIRA, resolving with a comment change.

Author: Sean Owen <[email protected]>

Closes apache#1171 from srowen/SPARK-1675 and squashes the following commits:

45ee9b7 [Sean Owen] Add simple note that data need not be centered for computePrincipalComponents
1 parent c480537 commit 2b36344

1 file changed, +2 -0 lines changed
mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala

Lines changed: 2 additions & 0 deletions
@@ -347,6 +347,8 @@ class RowMatrix(
    * The principal components are stored a local matrix of size n-by-k.
    * Each column corresponds for one principal component,
    * and the columns are in descending order of component variance.
+   * The row data do not need to be "centered" first; it is not necessary for
+   * the mean of each column to be 0.
    *
    * @param k number of top principal components.
    * @return a matrix of size n-by-k, whose columns are principal components
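
The added note holds because principal components are derived from the covariance matrix, and computing a covariance already subtracts each column's mean. The following is a minimal sketch in plain Scala (no Spark dependency) that illustrates this property. The object `CenteringSketch` and its helpers `covariance`, `shift`, and `maxAbsDiff` are hypothetical names introduced for illustration only and are not part of `RowMatrix`.

```scala
// Illustrative sketch only: plain Scala, not the RowMatrix implementation.
object CenteringSketch {
  type Rows = Seq[Array[Double]]

  /** Sample covariance matrix (n-by-n) of m rows that each have n columns. */
  def covariance(rows: Rows): Array[Array[Double]] = {
    val m = rows.length
    val n = rows.head.length
    // Column means are subtracted here, so callers need not center the data.
    val means = Array.tabulate(n)(j => rows.map(_(j)).sum / m)
    Array.tabulate(n, n) { (i, j) =>
      rows.map(r => (r(i) - means(i)) * (r(j) - means(j))).sum / (m - 1)
    }
  }

  /** Add the same constant offset to every entry, moving every column mean. */
  def shift(rows: Rows, offset: Double): Rows = rows.map(_.map(_ + offset))

  /** Largest absolute entrywise difference between two matrices. */
  def maxAbsDiff(a: Array[Array[Double]], b: Array[Array[Double]]): Double =
    a.indices.flatMap(i => a(i).indices.map(j => math.abs(a(i)(j) - b(i)(j)))).max
}
```

Shifting the data changes the column means but leaves the covariance matrix unchanged up to floating-point rounding, so the principal components obtained from it should be the same. In Spark terms, this suggests `computePrincipalComponents(k)` can be called on a `RowMatrix` built from uncentered rows.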
