Skip to content

non-allocating cumsum for SVector #730

@jkrumbiegel

Description

@jkrumbiegel

Hi, I noticed while optimizing some code that the only allocating bit that was left was a cumsum over an SVector. Because I thought it was an interesting challenge, I tried to make my first @generated function to fix this. Because I feel like it's really bad style (didn't know how to do it with quoting and interpolating), I'm posting it here first, before making a PR..

So this is the default behavior:

>>> svec = SVector(1, 2, 3, 4, 5)

>>> @btime cumsum($svec)

13.754 ns (1 allocation: 48 bytes)
5-element MArray{Tuple{5},Int64,1,5} with indices SOneTo(5):
  1
  3
  6
 10
 15

Now my function:

@generated function mycumsum(svec::SVector{N, T}) where {N, T}

    args = Any[]

    i = 1
    push!(args, :(v1 = svec[1]))

    for i in 2:N
        push!(args,
            Expr(
                Symbol("="),
                Symbol("v$i"),
                Expr(
                    Symbol("call"),
                    Symbol("+"),
                    Expr(
                        Symbol("ref"),
                        Symbol("svec"),
                        i
                    ),
                    Symbol("v$(i-1)")
                )
            )
        )
    end

    variables = [Symbol("v$i") for i in 1:N]
    push!(args, Expr(Symbol("call"), :SVector, variables...))

    exp = Expr(Symbol("block"), args...)

end

This is the expression that this function generates for svec:

exp = quote
    v1 = svec[1]
    v2 = svec[2] + v1
    v3 = svec[3] + v2
    v4 = svec[4] + v3
    v5 = svec[5] + v4
    SVector(v1, v2, v3, v4, v5)
end

And this is the result when using it:

@btime mycumsum($svec)
  0.031 ns (0 allocations: 0 bytes)
5-element SArray{Tuple{5},Int64,1,5} with indices SOneTo(5):
  1
  3
  6
 10
 15

So if I'm not mistaken, this is 444 times faster than the default version?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions