Julia has the following performance problem with array accesses.
Consider the following code snippets.

    # c and x are arrays of size n
    function optimized(c, x)
        n = size(x, 1)
        x_ = x[n]
        for i = n-1:-1:1
            x_ = x[i] = (x[i] - c[i + 1] * x_)
        end
    end

    function unoptimized(c, x)
        n = size(x, 1)
        for i = n-1:-1:1
            x[i] = (x[i] - c[i + 1] * x[i + 1])
        end
    end

Both functions do the same computation.
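As a quick sanity check of the claim that the two loops compute the same thing, here is a minimal sketch that runs both on a small input and compares the results. The `return x` lines and the four-element test vectors are my additions for illustration and are not part of the code being timed.

```julia
# Copies of the two loops, with `return x` added (an assumption, for easy comparison).
function optimized(c, x)
    n = size(x, 1)
    x_ = x[n]                       # carry x[i + 1] in a local variable
    for i = n-1:-1:1
        x_ = x[i] = (x[i] - c[i + 1] * x_)
    end
    return x
end

function unoptimized(c, x)
    n = size(x, 1)
    for i = n-1:-1:1
        x[i] = (x[i] - c[i + 1] * x[i + 1])   # re-reads x[i + 1] from the array
    end
    return x
end

# Small made-up inputs; copy() keeps each call from seeing the other's writes.
c  = [0.5, 0.25, 2.0, 1.0]
x0 = [1.0, 2.0, 3.0, 4.0]

xa = optimized(c, copy(x0))
xb = unoptimized(c, copy(x0))
println(xa == xb)
```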
There is a loop-carried dependency in this code: x[i] depends on the value of x[i+1], which was computed in the previous iteration.
"optimized" reads x[i+1] from the local variable x_, whereas "unoptimized" reads x[i+1] directly from the array.
For n = 100 million, with 100 runs of each function on my MacBook Air, I get the following runtimes.
optimized : min 0.307929443 max 0.448113027 mean 0.31504216288000003 variance 0.00036038772082227653
unoptimized : min 0.508484855 max 0.674296101 mean 0.54576100111 variance 0.001543397115546402
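For anyone who wants to reproduce numbers of this kind, the following is a rough sketch of a timing harness. The helper `time_runs`, the `Float64` element type, the random inputs, and the reduced `n` are all assumptions of mine; the harness actually used for the figures above is not shown here.

```julia
using Statistics  # mean and var from the standard library

function optimized(c, x)
    n = size(x, 1)
    x_ = x[n]
    for i = n-1:-1:1
        x_ = x[i] = (x[i] - c[i + 1] * x_)
    end
end

function unoptimized(c, x)
    n = size(x, 1)
    for i = n-1:-1:1
        x[i] = (x[i] - c[i + 1] * x[i + 1])
    end
end

# Hypothetical helper: time `runs` calls of f on fresh copies of x0.
function time_runs(f, c, x0; runs = 100)
    f(c, copy(x0))                  # warm-up call so compilation is not timed
    times = Float64[]
    for _ in 1:runs
        x = copy(x0)                # the copy is made outside the timed region
        push!(times, @elapsed f(c, x))
    end
    return (min = minimum(times), max = maximum(times),
            mean = mean(times), variance = var(times))
end

n  = 1_000_000                      # assumption: raise toward 100 million if memory allows
c  = rand(n)
x0 = rand(n)

for f in (optimized, unoptimized)
    println(f, " : ", time_runs(f, c, x0; runs = 10))
end
```

Copying `x0` before each run keeps every call working on the same starting values, since both functions overwrite `x` in place.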
Similar runtimes were obtained from using julia -O.
I haven't looked at the assembly code, but it seems that the unoptimized version loads x[i+1] from memory even though x[i+1] was just computed in the previous iteration.
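One way to check this guess would be to dump the generated code for both functions and compare the loop bodies. Below is a sketch using `code_llvm` from the `InteractiveUtils` standard library; the `Vector{Float64}` argument types are an assumption, and what the output looks like will depend on the Julia version.

```julia
using InteractiveUtils  # provides code_llvm and code_native outside the REPL

function optimized(c, x)
    n = size(x, 1)
    x_ = x[n]
    for i = n-1:-1:1
        x_ = x[i] = (x[i] - c[i + 1] * x_)
    end
end

function unoptimized(c, x)
    n = size(x, 1)
    for i = n-1:-1:1
        x[i] = (x[i] - c[i + 1] * x[i + 1])
    end
end

# Assumed argument types; substitute whatever types the arrays really have.
argtypes = Tuple{Vector{Float64}, Vector{Float64}}

# Capture the LLVM IR as strings so the two loop bodies can be compared by eye.
ir_opt   = sprint(code_llvm, optimized, argtypes)
ir_unopt = sprint(code_llvm, unoptimized, argtypes)

println(ir_opt)
println(ir_unopt)
```

`code_native(optimized, argtypes)` prints the machine code in the same way, if the assembly is preferred over the IR.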