From a35e4a960e2ff99903c1861c3c175e5f10d95cd6 Mon Sep 17 00:00:00 2001
From: Sam Halliday
Date: Sat, 7 Feb 2015 01:06:51 +0000
Subject: [PATCH 1/2] reword netlib-java/breeze docs

I am the author of netlib-java and I found this documentation to be out of date. Some main points:

1. Breeze has not depended on jBLAS for some time
2. netlib-java provides a pure JVM implementation
3. The licensing issue is not just about LGPL: optimised natives have proprietary licenses.
4. I really think it's best to direct people to my detailed setup guide instead of trying to compress it into one sentence. It is different for each architecture, each OS, and for each backend.

I hope this helps to clear things up :smile:
---
 docs/mllib-guide.md | 42 +++++++++++++++++++++++++-----------------
 1 file changed, 25 insertions(+), 17 deletions(-)

diff --git a/docs/mllib-guide.md b/docs/mllib-guide.md
index 7779fbc9c49e..bcb3c1f15308 100644
--- a/docs/mllib-guide.md
+++ b/docs/mllib-guide.md
@@ -56,25 +56,33 @@ See the **[spark.ml programming guide](ml-guide.html)** for more information on
 
 # Dependencies
 
-MLlib uses the linear algebra package [Breeze](http://www.scalanlp.org/),
-which depends on [netlib-java](https://github.com/fommil/netlib-java),
-and [jblas](https://github.com/mikiobraun/jblas).
-`netlib-java` and `jblas` depend on native Fortran routines.
-You need to install the
+MLlib uses the linear algebra package
+[Breeze](http://www.scalanlp.org/), which depends on
+[netlib-java](https://github.com/fommil/netlib-java) for optimised
+numerical processing. If natives are not available at runtime, you
+will see a warning message and a pure JVM implementation will be used
+instead.
+
+To learn more about the benefits and background of system optimised
+natives, you may wish to watch Sam Halliday's ScalaX talk on
+[High Performance Linear Algebra in Scala](https://skillsmatter.com/skillscasts/5849-high-performance-linear-algebra-in-scala)
+([follow along with high-res slides](http://fommil.github.io/scalax14/#/)).
+
+Due to licensing issues with runtime proprietary binaries, we do not
+include `netlib-java`'s native proxies by default. To configure
+`netlib-java` / Breeze to use system optimised binaries, include
+`com.github.fommil.netlib:all:1.1.2` (or build Spark with
+`-Pnetlib-lgpl`) as a dependency of your project and read the
+[netlib-java](https://github.com/fommil/netlib-java) documentation for
+your platform's additional installation instructions.
+
+MLlib also uses [jblas](https://github.com/mikiobraun/jblas) which
+will require you to install the
 [gfortran runtime library](https://github.com/mikiobraun/jblas/wiki/Missing-Libraries)
 if it is not already present on your nodes.
-MLlib will throw a linking error if it cannot detect these libraries automatically.
-Due to license issues, we do not include `netlib-java`'s native libraries in MLlib's
-dependency set under default settings.
-If no native library is available at runtime, you will see a warning message.
-To use native libraries from `netlib-java`, please build Spark with `-Pnetlib-lgpl` or
-include `com.github.fommil.netlib:all:1.1.2` as a dependency of your project.
-If you want to use optimized BLAS/LAPACK libraries such as
-[OpenBLAS](http://www.openblas.net/), please link its shared libraries to
-`/usr/lib/libblas.so.3` and `/usr/lib/liblapack.so.3`, respectively.
-BLAS/LAPACK libraries on worker nodes should be built without multithreading.
-
-To use MLlib in Python, you will need [NumPy](http://www.numpy.org) version 1.4 or newer.
+
+To use MLlib in Python, you will need [NumPy](http://www.numpy.org)
+version 1.4 or newer.
 
 ---
 

From 18cda1113bd9e7f349d97a575783860f814adfbf Mon Sep 17 00:00:00 2001
From: Sam Halliday
Date: Sat, 7 Feb 2015 20:16:36 +0000
Subject: [PATCH 2/2] remove link to skillsmatters at request of @mengxr

---
 docs/mllib-guide.md | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/docs/mllib-guide.md b/docs/mllib-guide.md
index bcb3c1f15308..3d32d03e35c6 100644
--- a/docs/mllib-guide.md
+++ b/docs/mllib-guide.md
@@ -65,8 +65,7 @@ instead.
 
 To learn more about the benefits and background of system optimised
 natives, you may wish to watch Sam Halliday's ScalaX talk on
-[High Performance Linear Algebra in Scala](https://skillsmatter.com/skillscasts/5849-high-performance-linear-algebra-in-scala)
-([follow along with high-res slides](http://fommil.github.io/scalax14/#/)).
+[High Performance Linear Algebra in Scala](http://fommil.github.io/scalax14/#/)).
 
 Due to licensing issues with runtime proprietary binaries, we do not
 include `netlib-java`'s native proxies by default. To configure