I found that MKL vector math functions such as `vsTanh` (https://software.intel.com/en-us/mkl-developer-reference-fortran-v-tanh) are quite a bit faster than `vector.mapv(|x| x.tanh())`. Would it be worth including this in the ndarray crate behind a feature gate?
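
To make the idea concrete, here is a minimal sketch of how such a gate might look. It works on plain slices rather than ndarray types to stay self-contained. The feature name `mkl` and the function name `tanh_vec` are hypothetical, not part of ndarray, and the `extern` declaration assumes MKL's LP64 interface (where `MKL_INT` is a 32-bit integer) and that the MKL libraries are linked by the build.

```rust
// Hypothetical Cargo feature "mkl"; this is not an existing ndarray feature.
#[cfg(feature = "mkl")]
extern "C" {
    // C prototype in MKL's vector math API:
    //   void vsTanh(const MKL_INT n, const float* a, float* y);
    // Assumption: LP64 interface, so MKL_INT maps to i32.
    fn vsTanh(n: i32, a: *const f32, y: *mut f32);
}

/// Element-wise tanh via MKL (only compiled when the feature is enabled).
#[cfg(feature = "mkl")]
pub fn tanh_vec(input: &[f32]) -> Vec<f32> {
    use std::convert::TryFrom;
    // MKL takes the length as MKL_INT, so reject inputs that do not fit.
    let n = i32::try_from(input.len()).expect("length exceeds MKL_INT range");
    let mut out = vec![0.0f32; input.len()];
    // Safety: both buffers are valid for `n` elements and do not overlap.
    unsafe { vsTanh(n, input.as_ptr(), out.as_mut_ptr()) };
    out
}

/// Pure-Rust fallback, equivalent to `mapv(|x| x.tanh())` on a 1-D array.
#[cfg(not(feature = "mkl"))]
pub fn tanh_vec(input: &[f32]) -> Vec<f32> {
    input.iter().map(|x| x.tanh()).collect()
}
```

Callers would use `tanh_vec(&data)` the same way in both configurations, so the MKL path stays opt-in. A real integration would also have to handle non-contiguous array layouts, which this sketch ignores.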