Skip to content

Enhancement proposal: restructure package around knot-vectors and their indices #227

@timholy

Description

@timholy

Plugging away at #226 has been the occasion for reconsidering the overall structure of the package. I've been slowly evolving towards one significant internal API change (introducing WeightedIndex) which hopefully you'll see soon as part of #226, so I won't bother describing it here. Here, I want to focus on a much more visible form of restructuring that may not be part of #226 but which I think is worth considering. This proposal stems from our accumulated experience thinking about indexing, arrays, and other container types elsewhere in the Julia ecosystem (CCing @mbauman and @andyferris, who have helped shape these views).

We currently have 3 interpolation types: BSpline, Scaled, and Gridded. (We also have extrapolation types, but that's orthogonal to the content of this proposal.) The core point of this proposal is that I believe we can:

  • unify all 3 interpolation types into a single type (probably called BSpline)
  • eliminate the OnCell vs OnGrid distinction
  • support attractive new features like encoding position in physical units (CC @ajkeller34)

by centralizing everything around the idea of knot vectors.

As you all know, the knots are the places where we shift the parametrization of the underlying polynomial: when you cross a knot you end up accessing different elements of the underlying coefficient array. For an array axis spanning 1:n, currently we have the following:

  • OnGrid: the knots are effectively positioned at 1:n
  • OnCell: the knots are effectively positioned at UnitRange(0.5, n+0.5)

Another important component is padding, which is currently expressed through a type-parameter of the appropriate AbstractInterpolation type. Now that we have OffsetArrays, a much cleaner approach (which you'll see in #226) is to use the coefficient array's axes to encode the padding: for example, quadratic interpolation requires an extra padding element on either end, and so the natural approach would be to index this axis of the coefficient array with integers that range from 0 to n+1, where the 0th and n+1st elements are the padding. By using an OffsetArray it's easy to keep everything in alignment.

Let's examine some implicit elements of the scheme above a bit more deeply. Note that the knot-vector actually has two properties: each element has a position, but we also address it via an index. That is to say, the position of the kth knot is knotvec[k] where k is the index. A helpful analogy is to a Dict, which has key/value pairs; here, the key is the index and the value is the position. If you think about it a bit, the important thing is that it's the indices of the knot vector that specify the corresponding indices of the coefficient array: if the position x at which you're trying to evaluate the interpolation lies between knotvec[k] and knotvec[k+1], then for linear interpolation I'm going to be accessing the coefficient array at indices k and k+1. (x and the values of knotvec affect the weights in a continuous fashion, but the indices are determined discontinuously by the k satisfying knotvec[k] <= x <= knotvec[k+1].)

Realizing that the knot vector's indices, rather than values, encode the alignment with the coefficient array, we come to notice that we're not really making good use of the knot-vector's values, at least for the core BSpline type. We do, however, formalize these more for the Scaled and Gridded types. Pursuing this just a bit, we come up with the following scheme:

  • our current BSplineInterpolation is really a BSplineInterpolation{K} where K<:AbstractUnitRange
  • our current ScaledInterpolation is effectively a BSplineInterpolation{K} where K<:Union{StepRange,StepRangeLen}
  • our current GriddedInterpolation is effectively a BSplineInterpolation{K} where K<:AbstractVector (the knots must be sorted)

Consequently, all three "top level types" become variants of one overall type with different knot vectors. Unification has some significant advantages, among them being that it lets me mix-and-match: I could have an interpolation object which has p axes with AbstractRange-type knot vectors and q axes with AbstractVector-type knot vectors. Currently, if you have just one AbstractVector knot-vector, you have to use GriddedInterpolation and use them for all of your axes. Looking up the index in a sorted AbstractVector is an O(logn) process, but it's O(1) for AbstractRange, so being forced to use slow lookup for all of them is a bummer.

The full (aka, most general offering the greatest degree of control) user-level API might look something like this:

itp = interpolate(A, BSpline(degree1, knotvec1), BSpline(degree2, knotvec2), ...)

As a concrete example, for a 5x4 array A, our current

itp =  interpolate(A, (BSpline(Linear()), BSpline(Quadratic(Flat())), (OnGrid(), OnCell()))

would be replaced by

itp = interpolate(A, BSpline(Linear(), 1:5), BSpline(Quadratic(Flat()), 0.5:4.5))

But we could also do fun things like

itp = interpolate(A, BSpline(Linear(), 0mm:0.4mm:1.6mm), BSpline(Quadratic(Flat()), [2s, 3.8s, 4.3s, 8.6s, 9.2s])

and access values as itp(0.83mm, 5.2s). (This particular example represents a series of measurements taken at regular spatial positions but sampled non-uniformly in time; if the 2nd "time axis" had been an AbstractRange it would have represented regular temporal sampling.) This makes more sense than ever, now that we're moving towards using () rather than [] for indexing at positions. We can still use [] for looking up Integer locations (and my upcoming WeightedIndex type) that are divorced from the positional meaning of the knot vectors.

That's pretty much it. Thoughts?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions