This repository was archived by the owner on Jan 10, 2019. It is now read-only.

N-dimensional arrays #7

@max-mapper

NumPy kicked off what eventually evolved into the PyData stack. One of the core components of NumPy is the ndarray type. What's an ndarray? It's a way to efficiently store multidimensional data (n dimensions, where you pick the n; 1, 2, 3 and 4 dimensions are common, but you can go as high as your data requires). For example, a photo would be a 2-dimensional ndarray, indexed by the X and Y coordinates of its pixels. A volume is 3-dimensional, with X, Y and Z, and so on.

You can do this with nested JavaScript Array objects, e.g.

var array = [
  [1, 0, 1],
  [2, 1, 0],
  [1, 3, 1]
]
var x = array[1][2] // row 1, column 2, which is 0

The downside of this is overhead. Each nested array is a separate JS Object, so reading a value means following a pointer from the outer array to an inner array before reaching the number, which slows down access. Storing all of those objects also takes more memory than storing the numbers alone. It's also hard to lay out the data to exploit memory caching, and hard to modify or slice.
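To make the contrast concrete, here is a minimal sketch of the alternative layout: one flat typed array plus a shape and strides that map (row, column) to a position in it. The names `makeArray2d`, `get` and `set` are hypothetical helpers invented for this illustration, not the API of any library mentioned in this thread.

```javascript
// Hypothetical helpers for illustration only.
// A 2D array stored as one contiguous Float64Array plus shape and strides.
function makeArray2d (rows, cols) {
  return {
    data: new Float64Array(rows * cols), // one block of numbers, no nested objects
    shape: [rows, cols],
    stride: [cols, 1] // row-major: stepping one row skips `cols` elements
  }
}

// Map (i, j) to a flat index using the strides.
function get (a, i, j) {
  return a.data[i * a.stride[0] + j * a.stride[1]]
}

function set (a, i, j, value) {
  a.data[i * a.stride[0] + j * a.stride[1]] = value
}

// Fill it with the same values as the nested example above.
var nested = [
  [1, 0, 1],
  [2, 1, 0],
  [1, 3, 1]
]
var a = makeArray2d(3, 3)
for (var i = 0; i < 3; i++) {
  for (var j = 0; j < 3; j++) set(a, i, j, nested[i][j])
}

var x = get(a, 1, 2) // same element as array[1][2] in the nested version
```

All nine values sit next to each other in memory here, so a lookup is one multiply-add and one typed array read rather than two object lookups.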

For more of an introduction to ndarrays, watch this talk by @mikolalysenko https://www.youtube.com/watch?v=vNCeWK_Wb5k, specifically the first 10 minutes.

NumPy ndarrays are implemented in C and exposed through a Python API. They get around the Object overhead mentioned above by doing memory management in C.
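JS typed arrays offer a comparable contiguous block of memory. Once the data lives in one block, slicing and transposing can be done by adjusting an offset and strides instead of copying, which is similar in spirit to how NumPy's basic slicing returns views. The sketch below is only an illustration of that idea; `view`, `at`, `slice` and `transpose` are hypothetical names, not NumPy's or any JS library's API.

```javascript
// Hypothetical sketch: a "view" is a shape, stride and offset over shared data.
function view (data, shape, stride, offset) {
  return { data: data, shape: shape, stride: stride, offset: offset }
}

// Read element (i, j) of a view.
function at (v, i, j) {
  return v.data[v.offset + i * v.stride[0] + j * v.stride[1]]
}

// Select rows [i0, i1) and columns [j0, j1) without copying any elements.
function slice (v, i0, i1, j0, j1) {
  var offset = v.offset + i0 * v.stride[0] + j0 * v.stride[1]
  return view(v.data, [i1 - i0, j1 - j0], v.stride, offset)
}

// Transpose by swapping the shape and stride entries; again no copy.
function transpose (v) {
  return view(v.data, [v.shape[1], v.shape[0]], [v.stride[1], v.stride[0]], v.offset)
}

var data = new Float64Array([1, 0, 1, 2, 1, 0, 1, 3, 1])
var m = view(data, [3, 3], [3, 1], 0) // 3x3, row-major
var lower = slice(m, 1, 3, 0, 2) // rows 1..2, columns 0..1
var t = transpose(m)

data[3] = 9 // element (1, 0) of m; every view sees the write
```

Because `lower` and `t` share `data` with `m`, the write to `data[3]` is visible through `at(lower, 0, 0)` and `at(t, 0, 1)` as well. This is the design trade-off of views: slicing is cheap, but callers need to know when they are holding shared memory.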

ndarrays in JavaScript

There are two projects (that I'm aware of) working on ndarrays in JS: scijs/ndarray and dstructs/ndarray.

At first glance they appear quite similar to each other in API. There are surely implementation differences, but I am not knowledgeable enough about these to explain them here. Perhaps the authors will chime in below.

Both of these are written in pure JS. Typed Objects and SIMD could increase memory bandwidth for these implementations.

Another approach would be to write native bindings. For example, the work-in-progress vectorious JS module has a Matrix library for 2-dimensional data that is backed by LAPACK and BLAS, both of which are well-known, time-tested native linear algebra libraries.

This native binding approach is similar to NumPy's. The upside is being able to reuse existing native implementations; the downside is figuring out how to run it in the browser. See #6 for more discussion on this point.

In summary, there are a couple of ndarray implementations you can use today, but there are definitely things that JS as a platform can add to make writing an ndarray implementation better in the future.

Open questions

  • What are the key differences between scijs/ndarray and dstructs/ndarray?
  • Could the LAPACK and BLAS bindings from @mateogianolio be used to write a native ndarray implementation (similar in API to NumPy ndarrays)?
  • What are the key missing features from JS that would improve ndarrays the most?

Have feedback or comments? Leave a comment below