@@ -296,6 +296,81 @@ backed by an RDD of its entries.
296 296 The underlying RDDs of a distributed matrix must be deterministic, because we cache the matrix size.
297 297 In general the use of non-deterministic RDDs can lead to errors.
298 298
299+ ### BlockMatrix
300+
301+ A `BlockMatrix` is a distributed matrix backed by an RDD of `MatrixBlock`s, where a `MatrixBlock` is
302+ a tuple of `((Int, Int), Matrix)`: the `(Int, Int)` is the index of the block, and the `Matrix` is
303+ the sub-matrix at the given index with size `rowsPerBlock` x `colsPerBlock`.
304+ `BlockMatrix` supports methods such as `.add` and `.multiply` with another `BlockMatrix`.
305+ `BlockMatrix` also has a helper function, `.validate`, which can be used to check whether the
306+ `BlockMatrix` is set up properly.
307+
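The block layout described above can be illustrated without Spark. The following is a minimal sketch, assuming that a global entry `(i, j)` is assigned to a block by integer division by the block size and that the last block along a dimension may be smaller when the matrix size is not a multiple of the block size. The class `BlockIndexSketch` and its methods are hypothetical names used only for illustration; they are not part of the Spark API.

```java
// Illustrative sketch only; this is not Spark code and the names are hypothetical.
final class BlockIndexSketch {

    /** Index of the block (along one dimension) that holds the given global index. */
    static int blockIndex(long globalIndex, int perBlock) {
        return (int) (globalIndex / perBlock);
    }

    /** Position of the given global index inside its block (along one dimension). */
    static int offsetInBlock(long globalIndex, int perBlock) {
        return (int) (globalIndex % perBlock);
    }

    /** Number of blocks needed to cover `size` rows or columns (rounds up). */
    static int numBlocks(long size, int perBlock) {
        return (int) ((size + perBlock - 1) / perBlock);
    }

    public static void main(String[] args) {
        int rowsPerBlock = 1024;
        int colsPerBlock = 1024;
        long i = 2500;
        long j = 100;

        // prints: entry (2500, 100) -> block (2, 0), offset (452, 100)
        System.out.printf("entry (%d, %d) -> block (%d, %d), offset (%d, %d)%n",
            i, j,
            blockIndex(i, rowsPerBlock), blockIndex(j, colsPerBlock),
            offsetInBlock(i, rowsPerBlock), offsetInBlock(j, colsPerBlock));

        // A 3000 x 2048 matrix would need a 3 x 2 grid of 1024 x 1024 blocks.
        System.out.println(numBlocks(3000, rowsPerBlock) + " x " + numBlocks(2048, colsPerBlock));
    }
}
```

Under these assumptions, choosing smaller values for `rowsPerBlock` and `colsPerBlock` yields more, smaller blocks, while larger values yield fewer, larger ones.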
308+ <div class="codetabs">
309+ <div data-lang="scala" markdown="1">
310+
311+ A [`BlockMatrix`](api/scala/index.html#org.apache.spark.mllib.linalg.distributed.BlockMatrix) can be
312+ most easily created from an `IndexedRowMatrix` or `CoordinateMatrix` by calling `.toBlockMatrix()`.
313+ By default, `.toBlockMatrix()` creates blocks of size 1024 x 1024. Users may change the block size
314+ by supplying the values through `.toBlockMatrix(rowsPerBlock, colsPerBlock)`.
315+
316+ {% highlight scala %}
317+ import org.apache.spark.rdd.RDD
318+ import org.apache.spark.mllib.linalg.distributed.{BlockMatrix, CoordinateMatrix, MatrixEntry}
319+
320+ val entries: RDD[MatrixEntry] = ... // an RDD of (i, j, v) matrix entries
321+ // Create a CoordinateMatrix from an RDD[MatrixEntry].
322+ val coordMat: CoordinateMatrix = new CoordinateMatrix(entries)
323+ // Transform the CoordinateMatrix to a BlockMatrix
324+ val matA: BlockMatrix = coordMat.toBlockMatrix().cache()
325+
326+ // Validate whether the BlockMatrix is set up properly. Throws an exception when it is not valid.
327+ // Nothing happens if it is valid.
328+ matA.validate
329+
330+ // Calculate A^T A.
331+ val AtransposeA = matA.transpose.multiply(matA)
332+
333+ // Compute the SVD of 2 * A, keeping the top 20 singular values (computeU = false, rCond = 1e-9).
334+ val A2 = matA.add(matA)
335+ val svd = A2.toIndexedRowMatrix().computeSVD(20, false, 1e-9)
336+ {% endhighlight %}
337+ </div>
338+
339+ <div data-lang="java" markdown="1">
340+
341+ A [`BlockMatrix`](api/java/org/apache/spark/mllib/linalg/distributed/BlockMatrix.html) can be
342+ most easily created from an `IndexedRowMatrix` or `CoordinateMatrix` by calling `.toBlockMatrix()`.
343+ By default, `.toBlockMatrix()` creates blocks of size 1024 x 1024. Users may change the block size
344+ by supplying the values through `.toBlockMatrix(rowsPerBlock, colsPerBlock)`.
345+
346+ {% highlight java %}
347+ import org.apache.spark.api.java.JavaRDD;
348+ import org.apache.spark.mllib.linalg.Matrix;
349+ import org.apache.spark.mllib.linalg.SingularValueDecomposition;
350+ // The wildcard covers BlockMatrix, CoordinateMatrix, IndexedRowMatrix, and MatrixEntry.
351+ import org.apache.spark.mllib.linalg.distributed.*;
352+
353+ JavaRDD<MatrixEntry> entries = ... // a JavaRDD of (i, j, v) matrix entries
354+ // Create a CoordinateMatrix from a JavaRDD<MatrixEntry>.
355+ CoordinateMatrix coordMat = new CoordinateMatrix(entries.rdd());
356+ // Transform the CoordinateMatrix to a BlockMatrix
357+ BlockMatrix matA = coordMat.toBlockMatrix().cache();
358+
359+ // Validate whether the BlockMatrix is set up properly. Throws an exception when it is not valid.
360+ // Nothing happens if it is valid.
361+ matA.validate();
362+
363+ // Calculate A^T A.
364+ BlockMatrix AtransposeA = matA.transpose().multiply(matA);
365+
366+ // Compute the SVD of 2 * A, keeping the top 20 singular values (computeU = false, rCond = 1e-9).
367+ BlockMatrix A2 = matA.add(matA);
368+ SingularValueDecomposition<IndexedRowMatrix, Matrix> svd =
369+   A2.toIndexedRowMatrix().computeSVD(20, false, 1e-9);
370+ {% endhighlight %}
371+ </div>
372+ </div>
373+
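To make the idea behind `.add` and `.multiply` on blocks more concrete, here is a single-machine sketch of a block-wise product on plain arrays. It is not how Spark implements `BlockMatrix.multiply`, which operates on distributed blocks. It only shows that result block `(i, j)` is the sum over `k` of block `(i, k)` of the left matrix times block `(k, j)` of the right matrix, which is why the inner block dimensions have to line up. The class `BlockMultiplySketch` and its methods are hypothetical names introduced for this example.

```java
// Illustrative single-machine sketch; this is not Spark code and the names are hypothetical.
final class BlockMultiplySketch {

    /** Element-wise sum of two local matrices with identical dimensions. */
    static double[][] add(double[][] x, double[][] y) {
        if (x.length != y.length || x[0].length != y[0].length) {
            throw new IllegalArgumentException("add: dimensions differ");
        }
        double[][] out = new double[x.length][x[0].length];
        for (int r = 0; r < x.length; r++) {
            for (int c = 0; c < x[0].length; c++) {
                out[r][c] = x[r][c] + y[r][c];
            }
        }
        return out;
    }

    /** Plain dense product of two local matrices. */
    static double[][] multiply(double[][] x, double[][] y) {
        if (x[0].length != y.length) {
            throw new IllegalArgumentException("multiply: inner dimensions differ");
        }
        double[][] out = new double[x.length][y[0].length];
        for (int r = 0; r < x.length; r++) {
            for (int c = 0; c < y[0].length; c++) {
                for (int k = 0; k < y.length; k++) {
                    out[r][c] += x[r][k] * y[k][c];
                }
            }
        }
        return out;
    }

    /** Block-wise product: a[i][k] and b[k][j] are the blocks at grid positions (i, k) and (k, j). */
    static double[][][][] blockMultiply(double[][][][] a, double[][][][] b) {
        if (a[0].length != b.length) {
            throw new IllegalArgumentException("block grids are not compatible");
        }
        double[][][][] out = new double[a.length][b[0].length][][];
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < b[0].length; j++) {
                double[][] sum = multiply(a[i][0], b[0][j]);
                for (int k = 1; k < b.length; k++) {
                    sum = add(sum, multiply(a[i][k], b[k][j]));
                }
                out[i][j] = sum;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // A = [[1, 2], [3, 4]] and B = [[5, 6], [7, 8]], each split into four 1 x 1 blocks.
        double[][][][] a = {{{{1}}, {{2}}}, {{{3}}, {{4}}}};
        double[][][][] b = {{{{5}}, {{6}}}, {{{7}}, {{8}}}};
        double[][][][] c = blockMultiply(a, b);
        // prints: 19.0 22.0 43.0 50.0
        System.out.println(c[0][0][0][0] + " " + c[0][1][0][0] + " "
            + c[1][0][0][0] + " " + c[1][1][0][0]);
    }
}
```

The sketch keeps every block in local memory for readability; with an actual `BlockMatrix`, the blocks live in an RDD and the same sums are computed across the cluster.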
299 374 ### RowMatrix
300 375
301 376 A `RowMatrix` is a row-oriented distributed matrix without meaningful row indices, backed by an RDD