
OutOfMemory error as Optim.jl interface allocates a hessian for first order methods #893

@jrbp

Description

Describe the bug 🐞

I have a function with a large number of parameters where I can compute the
objective and the gradient just fine, but when using solve with a first order
method from OptimizationOptimJL such as ConjugateGradient I get an OutOfMemory
error. This can be worked around by giving an empty sparse matrix as the
hess_prototype. As long as the hessian is unused by the solver this doesn't impact
results; this seems to be the case, but it's possible I'm missing something.

Expected behavior

Not allocating a hessian in the cache for first order methods, especially when the OptimizationProblem doesn't have a hessian.

Minimal Reproducible Example 👇

using SparseArrays: spzeros
using Optimization: OptimizationFunction, OptimizationProblem, solve
using OptimizationOptimJL: ConjugateGradient

f(u, _) = sum(x -> x^2, u) / 2  # simple quadratic objective
df!(g, u, _) = copy!(g, u)      # analytic in-place gradient
function solveme(ulength, fakehes)
    u = ones(Float64, ulength)
    p = π
    optf = if fakehes
        # workaround: an empty sparse hess_prototype keeps the cache from allocating a dense Hessian
        OptimizationFunction(f; grad=df!, hess_prototype=spzeros(Float64, ulength, ulength))
    else
        OptimizationFunction(f; grad=df!)
    end
    prob = OptimizationProblem(optf, u, p)
    solve(prob, ConjugateGradient(); maxiters=2)
end

function main()
    # `big` is sized so that a dense big × big Float64 Hessian cannot fit in memory
    small, big = 100, 2 * round(Int, sqrt(Sys.total_memory() ÷ sizeof(Float64)))

    sol_fakehes_small = solveme(small, true)
    fakehes_allocated_small = @allocated solveme(small, true)
    println("fakehes small allocated $(fakehes_allocated_small)")

    sol_default_small = solveme(small, false)
    allocated_default_small = @allocated solveme(small, false)
    println("default small allocated $(allocated_default_small)")

    println("small: same u min? $(sol_default_small.u  sol_fakehes_small.u)")

    sol_fakehes_big = solveme(big, true)
    fakehes_allocated_big = @allocated solveme(big, true)
    println("fakehes big allocated $(fakehes_allocated_big)")

    sol_default_big = solveme(big, false) # errors
end

main()
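
For a sense of scale, big is chosen so that a dense Hessian cannot possibly fit in RAM. The snippet below is illustrative arithmetic only (the 16 GiB figure is an assumed machine size, not taken from the report): since big ≈ 2·sqrt(total_memory / 8), a dense big × big Float64 Hessian needs roughly 4× the machine's total memory.

# Illustrative arithmetic only; the 16 GiB machine size is an assumption.
total_memory = 16 * 2^30                         # bytes of RAM (assumed)
big = 2 * round(Int, sqrt(total_memory ÷ 8))     # same formula as in the MWE
hessian_bytes = big^2 * 8                        # dense Float64 Hessian size
println(hessian_bytes / total_memory)            # ≈ 4, i.e. ~4× available RAM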

,Error & Stacktrace ⚠️

fakehes small allocated 30992
default small allocated 186480
small: same u min? true
fakehes big allocated 22700496
ERROR: LoadError: OutOfMemoryError()
Stacktrace:
  [1] GenericMemory
    @ ./boot.jl:516 [inlined]
  [2] new_as_memoryref
    @ ./boot.jl:535 [inlined]
  [3] Array
    @ ./boot.jl:582 [inlined]
  [4] Array
    @ ./boot.jl:592 [inlined]
  [5] Array
    @ ./boot.jl:599 [inlined]
  [6] similar
    @ ./abstractarray.jl:868 [inlined]
  [7] similar
    @ ./abstractarray.jl:867 [inlined]
  [8] similar
    @ ./broadcast.jl:224 [inlined]
  [9] similar
    @ ./broadcast.jl:223 [inlined]
 [10] copy
    @ ./broadcast.jl:897 [inlined]
 [11] materialize
    @ ./broadcast.jl:872 [inlined]
 [12] broadcast(::typeof(*), ::Vector{Float64}, ::LinearAlgebra.Adjoint{Float64, Vector{Float64}})
    @ Base.Broadcast ./broadcast.jl:810
 [13] *
    @ /nix/store/izjf0fnx3z4sy23q4n5v1fg129aar7z2-julia-bin-1.11.3/share/julia/stdlib/v1.11/LinearAlgebra/src/adjtrans.jl:484 [inlined]
 [14] alloc_H(x::Vector{Float64}, F::Float64)
    @ NLSolversBase ~/.julia/packages/NLSolversBase/kavn7/src/objective_types/abstract.jl:25
 [15] __solve(cache::OptimizationBase.OptimizationCache{OptimizationFunction{true, SciMLBase.NoAD, typeof(f), OptimizationBase.var"#grad#204"{Irrational{:π}, OptimizationFunction{true, SciMLBase.NoAD, typeof(f), typeof(df!), Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED_NO_TIME), Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED_NO_TIME), Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, OptimizationBase.ReInitCache{Vector{Float64}, Irrational{:π}}, Nothing, Nothing, Nothing, Nothing, Nothing, ConjugateGradient{Float64, Nothing, Optim.var"#32#34", LineSearches.InitialHagerZhang{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}}, Bool, OptimizationOptimJL.var"#4#6", Nothing})
    @ OptimizationOptimJL ~/.julia/packages/OptimizationOptimJL/e3bUa/src/OptimizationOptimJL.jl:200
 [16] solve!
    @ ~/.julia/packages/SciMLBase/sYmAV/src/solve.jl:187 [inlined]
 [17] #solve#725
    @ ~/.julia/packages/SciMLBase/sYmAV/src/solve.jl:95 [inlined]
 [18] solve
    @ ~/.julia/packages/SciMLBase/sYmAV/src/solve.jl:92 [inlined]
 [19] solveme(ulength::Int64, fakehes::Bool)
    @ Main ~/tmp/nohessian_mwe/mwe.jl:16
 [20] main()
    @ Main ~/tmp/nohessian_mwe/mwe.jl:36
 [21] top-level scope
    @ ~/tmp/nohessian_mwe/mwe.jl:39
in expression starting at /home/jrb26/tmp/nohessian_mwe/mwe.jl:39

Environment (please complete the following information):

  • Output of using Pkg; Pkg.status()
Status `~/tmp/nohessian_mwe/Project.toml`
⌃ [7f7a1694] Optimization v4.1.1
  [36348300] OptimizationOptimJL v0.4.1
  [2f01184e] SparseArrays v1.11.0
Info Packages marked with ⌃ have new versions available and may be upgradable.
  • Output of using Pkg; Pkg.status(; mode = PKGMODE_MANIFEST)
Status `~/tmp/nohessian_mwe/Manifest.toml`
⌃ [47edcb42] ADTypes v1.13.0
  [1520ce14] AbstractTrees v0.4.5
  [7d9f7c33] Accessors v0.1.42
⌃ [79e6a3ab] Adapt v4.2.0
  [66dad0bd] AliasTables v1.1.3
  [4fba245c] ArrayInterface v7.18.0
  [38540f10] CommonSolve v0.2.4
  [bbf7d656] CommonSubexpressions v0.3.1
  [34da2185] Compat v4.16.0
  [a33af91c] CompositionsBase v0.1.2
  [88cd18e8] ConsoleProgressMonitor v0.1.2
  [187b0558] ConstructionBase v1.5.8
  [9a962f9c] DataAPI v1.16.0
⌃ [864edb3b] DataStructures v0.18.20
  [e2d170a0] DataValueInterfaces v1.0.0
  [163ba53b] DiffResults v1.1.0
  [b552c78f] DiffRules v1.15.1
⌃ [a0c0ee7d] DifferentiationInterface v0.6.42
⌃ [ffbed154] DocStringExtensions v0.9.3
⌃ [4e289a0a] EnumX v1.0.4
  [e2ba6199] ExprTools v0.1.10
  [55351af7] ExproniconLite v0.10.14
  [9aa1b823] FastClosures v0.3.2
  [1a297f60] FillArrays v1.13.0
  [6a86dc24] FiniteDiff v2.27.0
⌅ [f6369f11] ForwardDiff v0.10.38
  [069b7b12] FunctionWrappers v1.1.3
  [77dc65aa] FunctionWrappersWrappers v0.1.3
  [46192b85] GPUArraysCore v0.2.0
  [3587e190] InverseFunctions v0.1.17
  [92d709cd] IrrationalConstants v0.2.4
  [82899510] IteratorInterfaceExtensions v1.0.0
  [692b3bcd] JLLWrappers v1.7.0
  [ae98c720] Jieko v0.2.1
  [5be7bae1] LBFGSB v0.4.1
  [1d6d02ad] LeftChildRightSiblingTrees v0.2.0
  [d3d80556] LineSearches v7.3.0
  [2ab3a3ac] LogExpFunctions v0.3.29
  [e6f89c97] LoggingExtras v1.1.0
  [1914dd2f] MacroTools v0.5.15
  [e1d29d7a] Missings v1.2.0
  [2e0e35c7] Moshi v0.3.5
⌃ [d41bc354] NLSolversBase v7.8.3
⌃ [77ba4419] NaNMath v1.1.2
⌃ [429524aa] Optim v1.11.0
⌃ [7f7a1694] Optimization v4.1.1
⌃ [bca83a33] OptimizationBase v2.4.0
  [36348300] OptimizationOptimJL v0.4.1
  [bac558e1] OrderedCollections v1.8.0
⌃ [90014a1f] PDMats v0.11.32
  [d96e819e] Parameters v0.12.3
  [85a6dd25] PositiveFactorizations v0.2.4
⌅ [aea7be01] PrecompileTools v1.2.1
  [21216c6a] Preferences v1.4.3
  [33c8b6b6] ProgressLogging v0.1.4
  [92933f4c] ProgressMeter v1.10.2
  [43287f4e] PtrArrays v1.3.0
  [3cdcf5f2] RecipesBase v1.3.4
⌃ [731186ca] RecursiveArrayTools v3.31.0
  [189a3867] Reexport v1.2.2
  [ae029012] Requires v1.3.1
  [7e49a35a] RuntimeGeneratedFunctions v0.5.13
⌃ [0bca4576] SciMLBase v2.75.1
⌃ [c0aeaf25] SciMLOperators v0.3.12
⌃ [53ae85a6] SciMLStructures v1.6.1
  [efcf1570] Setfield v1.1.2
  [a2af1166] SortingAlgorithms v1.2.1
⌃ [9f842d2f] SparseConnectivityTracer v0.6.13
⌃ [0a514795] SparseMatrixColorings v0.4.13
  [276daf66] SpecialFunctions v2.5.0
  [1e83bf80] StaticArraysCore v1.4.3
  [10745b16] Statistics v1.11.1
  [82ae8749] StatsAPI v1.7.0
  [2913bbd2] StatsBase v0.34.4
  [2efcf032] SymbolicIndexingInterface v0.3.38
  [3783bdb8] TableTraits v1.0.1
  [bd369af6] Tables v1.12.0
  [5d786b92] TerminalLoggers v0.1.7
  [3a884ed6] UnPack v1.0.2
  [81d17ec3] L_BFGS_B_jll v3.0.1+0
  [efe28fd5] OpenSpecFun_jll v0.5.6+0
  [0dad84c5] ArgTools v1.1.2
  [56f22d72] Artifacts v1.11.0
  [2a0f44e3] Base64 v1.11.0
  [ade2ca70] Dates v1.11.0
  [8ba89e20] Distributed v1.11.0
  [f43a241f] Downloads v1.6.0
  [7b1f6079] FileWatching v1.11.0
  [9fa8497b] Future v1.11.0
  [b77e0a4c] InteractiveUtils v1.11.0
  [b27032c2] LibCURL v0.6.4
  [76f85450] LibGit2 v1.11.0
  [8f399da3] Libdl v1.11.0
  [37e2e46d] LinearAlgebra v1.11.0
  [56ddb016] Logging v1.11.0
  [d6f4376e] Markdown v1.11.0
  [ca575930] NetworkOptions v1.2.0
  [44cfe95a] Pkg v1.11.0
  [de0858da] Printf v1.11.0
  [9a3f8284] Random v1.11.0
  [ea8e919c] SHA v0.7.0
  [9e88b42a] Serialization v1.11.0
  [6462fe0b] Sockets v1.11.0
  [2f01184e] SparseArrays v1.11.0
  [4607b0f0] SuiteSparse
  [fa267f1f] TOML v1.0.3
  [a4e569a6] Tar v1.10.0
  [cf7118a7] UUIDs v1.11.0
  [4ec0a83e] Unicode v1.11.0
  [e66e0078] CompilerSupportLibraries_jll v1.1.1+0
  [deac9b47] LibCURL_jll v8.6.0+0
  [e37daf67] LibGit2_jll v1.7.2+0
  [29816b5a] LibSSH2_jll v1.11.0+1
  [c8ffd9c3] MbedTLS_jll v2.28.6+0
  [14a3606d] MozillaCACerts_jll v2023.12.12
  [4536629a] OpenBLAS_jll v0.3.27+1
  [05823500] OpenLibm_jll v0.8.1+2
  [bea87d4a] SuiteSparse_jll v7.7.0+0
  [83775a58] Zlib_jll v1.2.13+1
  [8e850b90] libblastrampoline_jll v5.11.0+0
  [8e850ede] nghttp2_jll v1.59.0+0
  [3f19e933] p7zip_jll v17.4.0+2
Info Packages marked with ⌃ and ⌅ have new versions available. Those with ⌃ may be upgradable, but those with ⌅ are restricted by compatibility constraints from upgrading. To see why use `status --outdated -m`
  • Output of versioninfo()
Julia Version 1.11.3
Commit d63adeda50d (2025-01-21 19:42 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 12 × Intel(R) Core(TM) i7-10850H CPU @ 2.70GHz
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, skylake)
Threads: 12 default, 0 interactive, 6 GC (on 12 virtual cores)
Environment:
  LD_LIBRARY_PATH = /run/opengl-driver/lib
  JULIA_PROJECT = /home/jrb26/tmp/nohessian_mwe
  JULIA_NUM_THREADS = auto
  JULIA_PKG_PRESERVE_TIERED_INSTALLED = true

Additional context

From a quick glance it looks as though, in the OptimizationOptimJL interface, most Optim.jl methods construct a TwiceDifferentiable objective even when the algorithm is only <: FirstOrderOptimizer and the OptimizationFunction doesn't have a hessian specified. If I'm understanding correctly, it might make sense to add a __solve method for <: FirstOrderOptimizer algorithms that creates a OnceDifferentiable instead (see the sketch below).
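
For illustration, here is a minimal sketch (not the OptimizationOptimJL internals) of the first-order-only path such a dispatch could take, calling Optim.jl directly with a OnceDifferentiable built from just the objective and gradient; obj, grad!, and u0 below mirror the MWE above, and no n×n Hessian is ever allocated.

using Optim  # Optim re-exports OnceDifferentiable from NLSolversBase

# Objective and in-place gradient in Optim.jl's native signatures
# (same functions as the MWE above, with the unused parameter dropped).
obj(x) = sum(xi -> xi^2, x) / 2
grad!(G, x) = copy!(G, x)

u0 = ones(Float64, 1_000_000)            # large problem; only O(n) storage needed
od = OnceDifferentiable(obj, grad!, u0)  # first-order objective wrapper, no Hessian
res = Optim.optimize(od, u0, ConjugateGradient(), Optim.Options(iterations = 2))

If the <: FirstOrderOptimizer path built its objective this way, the alloc_H call seen in the stack trace above would presumably never run.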

edit: changed the MRE so it no longer catches the error, as doing so was suppressing the stack trace
