Commit a35e4a9

reword netlib-java/breeze docs
I am the author of netlib-java and I found this documentation to be out of date. Some main points:

1. Breeze has not depended on jBLAS for some time
2. netlib-java provides a pure JVM implementation
3. The licensing issue is not just about LGPL: optimised natives have proprietary licenses.
4. I really think it's best to direct people to my detailed setup guide instead of trying to compress it into one sentence. It is different for each architecture, each OS, and for each backend.

I hope this helps to clear things up 😄
1 parent 1390e56 commit a35e4a9

1 file changed: +25 -17 lines changed
docs/mllib-guide.md

Lines changed: 25 additions & 17 deletions
@@ -56,25 +56,33 @@ See the **[spark.ml programming guide](ml-guide.html)** for more information on
 
 # Dependencies
 
-MLlib uses the linear algebra package [Breeze](http://www.scalanlp.org/),
-which depends on [netlib-java](https://github.com/fommil/netlib-java),
-and [jblas](https://github.com/mikiobraun/jblas).
-`netlib-java` and `jblas` depend on native Fortran routines.
-You need to install the
+MLlib uses the linear algebra package
+[Breeze](http://www.scalanlp.org/), which depends on
+[netlib-java](https://github.com/fommil/netlib-java) for optimised
+numerical processing. If natives are not available at runtime, you
+will see a warning message and a pure JVM implementation will be used
+instead.
+
+To learn more about the benefits and background of system optimised
+natives, you may wish to watch Sam Halliday's ScalaX talk on
+[High Performance Linear Algebra in Scala](https://skillsmatter.com/skillscasts/5849-high-performance-linear-algebra-in-scala)
+([follow along with high-res slides](http://fommil.github.io/scalax14/#/)).
+
+Due to licensing issues with runtime proprietary binaries, we do not
+include `netlib-java`'s native proxies by default. To configure
+`netlib-java` / Breeze to use system optimised binaries, include
+`com.github.fommil.netlib:all:1.1.2` (or build Spark with
+`-Pnetlib-lgpl`) as a dependency of your project and read the
+[netlib-java](https://github.com/fommil/netlib-java) documentation for
+your platform's additional installation instructions.
+
+MLlib also uses [jblas](https://github.com/mikiobraun/jblas) which
+will require you to install the
 [gfortran runtime library](https://github.com/mikiobraun/jblas/wiki/Missing-Libraries)
 if it is not already present on your nodes.
-MLlib will throw a linking error if it cannot detect these libraries automatically.
-Due to license issues, we do not include `netlib-java`'s native libraries in MLlib's
-dependency set under default settings.
-If no native library is available at runtime, you will see a warning message.
-To use native libraries from `netlib-java`, please build Spark with `-Pnetlib-lgpl` or
-include `com.github.fommil.netlib:all:1.1.2` as a dependency of your project.
-If you want to use optimized BLAS/LAPACK libraries such as
-[OpenBLAS](http://www.openblas.net/), please link its shared libraries to
-`/usr/lib/libblas.so.3` and `/usr/lib/liblapack.so.3`, respectively.
-BLAS/LAPACK libraries on worker nodes should be built without multithreading.
-
-To use MLlib in Python, you will need [NumPy](http://www.numpy.org) version 1.4 or newer.
+
+To use MLlib in Python, you will need [NumPy](http://www.numpy.org)
+version 1.4 or newer.
 
 ---

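The reworded docs name the coordinate `com.github.fommil.netlib:all:1.1.2` but do not show a build declaration. As a minimal sketch, assuming an sbt project (the commit itself names no build tool), the dependency could be declared roughly like this. `pomOnly()` is used because, per the netlib-java README, the `all` artifact is a POM-only aggregate rather than a jar.

```scala
// build.sbt fragment (assumes sbt; not part of the commit).
// Pulls in netlib-java's native proxies alongside the pure JVM fallback.
// "all" is published as a POM-only aggregate, hence pomOnly().
libraryDependencies += "com.github.fommil.netlib" % "all" % "1.1.2" pomOnly()
```

This only adds the proxies to the classpath. The system BLAS/LAPACK libraries still have to be installed per platform, as described in the netlib-java documentation. One way to see which backend was picked up is to print `com.github.fommil.netlib.BLAS.getInstance().getClass.getName` at runtime; a name ending in `F2jBLAS` would indicate the pure JVM implementation.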