41 changes: 24 additions & 17 deletions docs/mllib-guide.md
@@ -56,25 +56,32 @@ See the **[spark.ml programming guide](ml-guide.html)** for more information on

# Dependencies

MLlib uses the linear algebra package [Breeze](http://www.scalanlp.org/),
which depends on [netlib-java](https://github.com/fommil/netlib-java),
and [jblas](https://github.com/mikiobraun/jblas).
`netlib-java` and `jblas` depend on native Fortran routines.
You need to install the
MLlib uses the linear algebra package
[Breeze](http://www.scalanlp.org/), which depends on
[netlib-java](https://github.com/fommil/netlib-java) for optimised
numerical processing. If natives are not available at runtime, you
will see a warning message and a pure JVM implementation will be used
instead.

To learn more about the benefits and background of system optimised
natives, you may wish to watch Sam Halliday's ScalaX talk on
[High Performance Linear Algebra in Scala](http://fommil.github.io/scalax14/#/).

Due to licensing issues with runtime proprietary binaries, we do not
include `netlib-java`'s native proxies by default. To configure
`netlib-java` / Breeze to use system optimised binaries, include
`com.github.fommil.netlib:all:1.1.2` (or build Spark with
`-Pnetlib-lgpl`) as a dependency of your project and read the
[netlib-java](https://github.com/fommil/netlib-java) documentation for
your platform's additional installation instructions.

MLlib also uses [jblas](https://github.com/mikiobraun/jblas) which
will require you to install the
[gfortran runtime library](https://github.com/mikiobraun/jblas/wiki/Missing-Libraries)
if it is not already present on your nodes.
MLlib will throw a linking error if it cannot detect these libraries automatically.
Due to license issues, we do not include `netlib-java`'s native libraries in MLlib's
dependency set under default settings.
If no native library is available at runtime, you will see a warning message.
To use native libraries from `netlib-java`, please build Spark with `-Pnetlib-lgpl` or
include `com.github.fommil.netlib:all:1.1.2` as a dependency of your project.
If you want to use optimized BLAS/LAPACK libraries such as
[OpenBLAS](http://www.openblas.net/), please link its shared libraries to
`/usr/lib/libblas.so.3` and `/usr/lib/liblapack.so.3`, respectively.
BLAS/LAPACK libraries on worker nodes should be built without multithreading.

To use MLlib in Python, you will need [NumPy](http://www.numpy.org) version 1.4 or newer.

To use MLlib in Python, you will need [NumPy](http://www.numpy.org)
version 1.4 or newer.
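
As a rough illustration of the NumPy requirement above, the following sketch checks an installed NumPy version string against the 1.4 minimum. The helper names `parse_version` and `numpy_is_new_enough` are hypothetical and not part of MLlib or NumPy; the only NumPy feature assumed is the standard `numpy.__version__` attribute.

```python
import re

# Minimum NumPy version stated in the guide.
MIN_NUMPY = (1, 4)


def parse_version(version):
    """Return the leading numeric components of a version string as ints.

    Hypothetical helper: parsing stops at the first component that does not
    start with a digit, so "2.0.0.dev0" becomes (2, 0, 0).
    """
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def numpy_is_new_enough(version, minimum=MIN_NUMPY):
    """Compare a version string against the minimum (hypothetical helper)."""
    return parse_version(version) >= minimum


if __name__ == "__main__":
    try:
        import numpy
    except ImportError:
        print("NumPy is not installed; MLlib's Python API needs it.")
    else:
        ok = numpy_is_new_enough(numpy.__version__)
        print("NumPy", numpy.__version__, "OK" if ok else "is older than 1.4")
```

Running a check like this on each worker node, not only on the driver, may catch mismatched installations earlier than a failing job would.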

---
