Currently it slows down to a crawl when working with big arrays.
The vmin/vmax computation could be the culprit. Being fast is more important than having perfect colors, so if that is indeed the cause of the slowdown, we should compute vmin/vmax on a sample, as is already done for ndigits. The color code would then need to be adapted, either to cope with colorval > vmax (and, symmetrically, colorval < vmin) or, preferably, to update vmin/vmax as we go when more data is loaded.
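
A minimal sketch of what this could look like, assuming the slowdown really does come from the full vmin/vmax scan. The names `RunningColorRange`, `sample_values`, `max_size` and `seed` are hypothetical and not taken from the existing code. It uses plain Python sequences to stay self-contained; the real implementation would presumably operate on arrays. It shows both options: out-of-range values are clipped when normalized, and the range is widened whenever a newly loaded chunk contains values outside it.

```python
import math
import random


class RunningColorRange:
    """Tracks an approximate (vmin, vmax) that widens as more data is loaded."""

    def __init__(self):
        # Empty range: any finite value will replace these bounds.
        self.vmin = math.inf
        self.vmax = -math.inf

    def update(self, values):
        """Widen the range with newly seen values; return True if it changed."""
        # Ignore nan and +/-inf so they do not distort the color scale.
        finite = [v for v in values if math.isfinite(v)]
        if not finite:
            return False
        lo, hi = min(finite), max(finite)
        changed = lo < self.vmin or hi > self.vmax
        self.vmin = min(self.vmin, lo)
        self.vmax = max(self.vmax, hi)
        return changed

    def normalize(self, value):
        """Map value to [0, 1], clipping anything outside the known range.

        Returns None when no color can be computed (non-finite value or
        no data seen yet).
        """
        if not math.isfinite(value) or self.vmin > self.vmax:
            return None
        if self.vmax == self.vmin:
            return 0.5
        ratio = (value - self.vmin) / (self.vmax - self.vmin)
        # Clipping handles colorval > vmax and colorval < vmin.
        return min(1.0, max(0.0, ratio))


def sample_values(data, max_size=500, seed=0):
    """Return at most max_size randomly chosen elements of a flat sequence."""
    if len(data) <= max_size:
        return list(data)
    return random.Random(seed).sample(data, max_size)
```

The intended use would be to call `update(sample_values(data))` once when the array is opened, then call `update(chunk)` each time a new chunk is loaded, and repaint the visible cells only when `update` returns True. Clipping in `normalize` keeps the colors valid in the meantime, at the cost of slightly inaccurate colors until the range has caught up.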