- 
                Notifications
    You must be signed in to change notification settings 
- Fork 152
Closed
Description
Hi, I noticed while optimizing some code that the only allocating bit that was left was a cumsum over an SVector. Because I thought it was an interesting challenge, I tried to make my first @generated function to fix this. Because I feel like it's really bad style (didn't know how to do it with quoting and interpolating), I'm posting it here first, before making a PR..
So this is the default behavior:
>>> svec = SVector(1, 2, 3, 4, 5)
>>> @btime cumsum($svec)
13.754 ns (1 allocation: 48 bytes)
5-element MArray{Tuple{5},Int64,1,5} with indices SOneTo(5):
  1
  3
  6
 10
 15Now my function:
@generated function mycumsum(svec::SVector{N, T}) where {N, T}
    args = Any[]
    i = 1
    push!(args, :(v1 = svec[1]))
    for i in 2:N
        push!(args,
            Expr(
                Symbol("="),
                Symbol("v$i"),
                Expr(
                    Symbol("call"),
                    Symbol("+"),
                    Expr(
                        Symbol("ref"),
                        Symbol("svec"),
                        i
                    ),
                    Symbol("v$(i-1)")
                )
            )
        )
    end
    variables = [Symbol("v$i") for i in 1:N]
    push!(args, Expr(Symbol("call"), :SVector, variables...))
    exp = Expr(Symbol("block"), args...)
endThis is the expression that this function generates for svec:
exp = quote
    v1 = svec[1]
    v2 = svec[2] + v1
    v3 = svec[3] + v2
    v4 = svec[4] + v3
    v5 = svec[5] + v4
    SVector(v1, v2, v3, v4, v5)
endAnd this is the result when using it:
@btime mycumsum($svec)
  0.031 ns (0 allocations: 0 bytes)
5-element SArray{Tuple{5},Int64,1,5} with indices SOneTo(5):
  1
  3
  6
 10
 15So if I'm not mistaken, this is 444 times faster than the default version?
andyferris
Metadata
Metadata
Assignees
Labels
No labels