Vectorization of a FOR loop using "@simd" with nested "norm" #11037

@GravityAssisted

Hi All,

I am a new Julia user trying to get a performance improvement from the @simd macro in julia-0.3. I have the following function, which calls norm inside the loop:

function prune_range!(DSMPerihelion::Array{Float64,2},SolarRadiiCutoff::Float64,OutInd::Array{Float64})
    bySradii=1.0/SolarRadius
    len = size(DSMPerihelion)[2]
    r = Array(Float64,3)
    for i = 1:len
        r = DSMPerihelion[1:3,i]
        OutInd[i] = norm(r)*bySradii
    end
end

It runs on a 2-D input array, DSMPerihelion, of size (3, 20979), and I get the following performance:

@time prune_range!(DSMPerihelion,SolarRadiiCutoff,DsmR)
elapsed time: 0.00252588 seconds (2181976 bytes allocated)

Now when I try to vectorize the code as follows:

function prune_range!(DSMPerihelion::Array{Float64,2},SolarRadiiCutoff::Float64,OutInd::Array{Float64})
    bySradii=1.0/SolarRadius
    len = size(DSMPerihelion)[2]
    r = Array(Float64,3)
    @simd for i = 1:len
        @inbounds r = DSMPerihelion[1:3,i]
        @inbounds OutInd[i] = norm(r)*bySradii
    end
end

I get almost the same performance:
elapsed time: 0.002716825 seconds (2181976 bytes allocated)

I looked at the code_llvm output, and the loop does not appear to be vectorized. Is there something I am missing? I followed the Intel blog post (https://software.intel.com/en-us/articles/vectorization-in-julia) by @ArchRobison to understand vectorization in Julia.

This function will be called millions of times, so it is important that it performs well; hence my effort to vectorize it.
Thanks for the help.
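
For comparison, here is a minimal sketch of how the loop body could be written without the per-iteration slice, offered as an illustration rather than a confirmed fix. The allocation figure reported above (about 104 bytes per column) is consistent with DSMPerihelion[1:3,i] creating a new 3-element array on every pass, and a call to norm on a freshly allocated array is not the kind of straight-line arithmetic that @simd is designed for. The names scaled_norms!, out, pts and scale are hypothetical. The scale factor is passed in explicitly instead of being read from a global, and sqrt(x*x + y*y + z*z) is assumed to be an acceptable substitute for norm when the values are far from overflow or underflow.

```julia
# Hypothetical helper: writes scale * ||pts[:, i]|| into out[i] for every column i.
# Assumes pts has exactly 3 rows and out has at least size(pts, 2) elements.
function scaled_norms!(out::Vector{Float64}, pts::Matrix{Float64}, scale::Float64)
    n = size(pts, 2)
    @simd for i = 1:n
        # Read the three components directly; no temporary array is created.
        @inbounds x = pts[1, i]
        @inbounds y = pts[2, i]
        @inbounds z = pts[3, i]
        @inbounds out[i] = sqrt(x*x + y*y + z*z) * scale
    end
    return out
end

# Small usage example with two columns.
pts = [3.0 0.0; 4.0 0.0; 0.0 2.0]
out = zeros(size(pts, 2))
scaled_norms!(out, pts, 0.5)
```

Whether the compiler actually emits vector instructions for this loop depends on the Julia and LLVM versions and on how it handles the stride-3 reads, so the code_llvm output is still worth checking. Removing the allocation is likely to matter independently of that.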

Labels: help wanted, performance, potential benchmark
