Cache subexpressions when building MOI.ScalarNonlinearExpression#4032
Cache subexpressions when building MOI.ScalarNonlinearExpression#4032blegat wants to merge 15 commits into
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #4032 +/- ##
==========================================
- Coverage 99.91% 99.79% -0.13%
==========================================
Files 42 42
Lines 6229 6250 +21
==========================================
+ Hits 6224 6237 +13
- Misses 5 13 +8 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
In theory we don't, but the can be. This change needs very very careful consideration. |
|
Yes, it has nontrivial consequences. It may very well jeopardize our chances of adding a modification API for nonlinear constraints in the future. |
|
I think we can simplify this to: function my_moi_function(f::GenericNonlinearExpr{V}) where {V}
cache = Dict{UInt64,MOI.ScalarNonlinearFunction}()
ret = MOI.ScalarNonlinearFunction(f.head, similar(f.args))
stack = Tuple{MOI.ScalarNonlinearFunction,Int,GenericNonlinearExpr{V}}[]
for i in length(f.args):-1:1
if f.args[i] isa GenericNonlinearExpr{V}
push!(stack, (ret, i, f.args[i]))
else
ret.args[i] = moi_function(f.args[i])
end
end
while !isempty(stack)
parent, i, arg = pop!(stack)
parent.args[i] = get!(cache, objectid(arg)) do
child = MOI.ScalarNonlinearFunction(arg.head, similar(arg.args))
for j in length(arg.args):-1:1
if arg.args[j] isa GenericNonlinearExpr{V}
push!(stack, (child, j, arg.args[j]))
else
child.args[j] = moi_function(arg.args[j])
end
end
return child
end
end
return ret
endIt doesn't hold state across multiple calls to |
|
The example gives: julia> @time moi_function.(x);
0.000142 seconds (1.53 k allocations: 106.688 KiB) |
|
If this was only about |
|
We can fix |
|
It would be weird to have the subexpressions only work when they are on the same function. Especially since the AD backend supports them being on different ones. I guess we can still find out they are the same later in post-processing, it could be enough just for the sake of speeding up passing the model to the AD backend without the exponential blowup. |
This is an implementation detail. Nothing should be observable at the JuMP level.
This is also an implementation detail.
This is my strongly preferred option. I would like us to provide the full tape to an AD engine, and then it to turn everything into a single global DAG.
Yes, we definitely need benchmarks before merging anything like this
It turns every nonlinear expression into a long-lived GC object that will never be freed until the model is. Second, any expression occurring in multiple constraints seems like much less of an issue than the original example, where nested expressions are programmatically created. Third, if we start with the internal cache, we can always change to a model-level cache later if needed. |
|
That makes sense, if we only take care of not duplicating aliased subexpression at the level of each function, it's indeed easier. |
0bd8a48 to
0cb99f5
Compare
|
To avoid having the JuMP objects continue living, we can us their hash as keys instead of the objects themselves |
|
#3729 (comment) is a good benchmark but we need more. |
|
I added benchmarks on the top comment. |
Here is a simplified reproducer of the performance issue identified in #4024:
Before this PR, this gives
After this PR, this gives
Note that the
MOI.ScalarNonlinearFunctionwe generate now share common sub-expression by pointers! If I'm not mistaken, we don't support modifying these in-place so that shouldn't be an issue.This fixes the slow model-building issue but ReverseAD will still be terribly slow. One thing we could do is to detect, in
MOI.Nonlinear, theMOI.ScalarNonlinearFunctionthat share sub-functions correponding to the same object (with a dictionary). When it detects two functions with the same pointer, it can then create subexpressions and use its existing support for subexpressions.The nice thing about it is that we're not doing any change to the MOI interface. Creating
MOI.ScalarNonlinearFunctionwas already possible from the beginning, we just didn't treat it any differently. So this change will just be a performance optimization of the AD but the interface is as simple and non-breaking as it gets.Results on other benchmarks
For this benchmark, cherry-pick both jump-dev/MathOptInterface.jl#2803 and jump-dev/MathOptInterface.jl#3008. We don't know yet what's the right fix at the MOI level, these PRs just make sure we only see the JuMP-side in the timing.
Then, execute
test/perf/benchmark_cache.jl.The conclusion of the benchmark seems to be that a global dict is better than a new dict per constraint and also that we have a small win to hash the
ObjectId.The benchmark was generated by Claude as a neutral judge.
A: aliased tree (one big DAG, K=16)
Current
One dict per constraint
Global dict
Global dict + object id
B: many independent constraints (N=5000)
Current
One dict per constraint
Global dict
Global dict + object id
C: shared big subexpr (N=200, M=200)
Current
One dict per constraint
Global dict
Global dict + object id
D: many aliased trees (M=200, K=8)
Current
One dict per constraint
Global dict
Global dict + object id