Hi All,
I am a new Julia user trying to get a performance improvement from the @simd macro in Julia 0.3. I have the following function, which calls norm inside its loop:
function prune_range!(DSMPerihelion::Array{Float64,2},SolarRadiiCutoff::Float64,OutInd::Array{Float64})
    bySradii=1.0/SolarRadius
    len = size(DSMPerihelion)[2]
    r = Array(Float64,3)
    for i = 1:len
        r = DSMPerihelion[1:3,i]
        OutInd[i] = norm(r)*bySradii
    end
end
It runs on a 2D input array, DSMPerihelion, of size (3, 20979), and I get the following performance:
@time prune_range!(DSMPerihelion,SolarRadiiCutoff,DsmR)
elapsed time: 0.00252588 seconds (2181976 bytes allocated)
Now when I try to vectorize the code as follows:
function prune_range!(DSMPerihelion::Array{Float64,2},SolarRadiiCutoff::Float64,OutInd::Array{Float64})
    bySradii=1.0/SolarRadius
    len = size(DSMPerihelion)[2]
    r = Array(Float64,3)
    @simd for i = 1:len
        @inbounds r = DSMPerihelion[1:3,i]
        @inbounds OutInd[i] = norm(r)*bySradii
    end
end
I get almost the same performance:
elapsed time: 0.002716825 seconds (2181976 bytes allocated)
I looked at the code_llvm output and the loop does not appear to be vectorized. Is there something I am missing to get this code to vectorize? I followed the Intel blog post (https://software.intel.com/en-us/articles/vectorization-in-julia) by @ArchRobison to understand vectorization in Julia.
This function will be called millions of times, so it is important that it performs well; hence my effort to vectorize it.
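For reference, here is a minimal sketch of the kind of rewrite that may be needed. It is not a confirmed fix. The slice DSMPerihelion[1:3,i] allocates a new array on every iteration, and the call to norm is a function call inside the loop body, both of which are likely to get in the way of @simd. The sketch reads the three components directly and computes the norm inline. The names scaled_norms!, pts, out and inv_solar_radius are hypothetical; inv_solar_radius stands in for 1.0/SolarRadius, which the functions above take from a global. Whether LLVM actually vectorizes this loop depends on the Julia version and the hardware, so code_llvm should still be checked.

```julia
# Sketch only: allocation-free loop with the 3-vector norm written out inline.
# `inv_solar_radius` is a hypothetical argument replacing the global SolarRadius.
function scaled_norms!(out::Vector{Float64}, pts::Matrix{Float64}, inv_solar_radius::Float64)
    n = size(pts, 2)                 # number of columns (points)
    @simd for i = 1:n
        # Read the three components directly instead of slicing pts[1:3, i],
        # which would allocate a temporary array on each iteration.
        @inbounds x = pts[1, i]
        @inbounds y = pts[2, i]
        @inbounds z = pts[3, i]
        # Euclidean norm computed inline, so the loop body has no call to norm.
        @inbounds out[i] = sqrt(x*x + y*y + z*z) * inv_solar_radius
    end
    return out
end
```

It would be called as scaled_norms!(DsmR, DSMPerihelion, 1.0/SolarRadius). Even if the loop does not end up vectorized, removing the per-iteration slice should eliminate the roughly 2 MB of allocation reported by @time above.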
Thanks for the help.