tblgen: add ResultIndex primitive #2912
|
Ideally the min-cut could also identify equivalent nodes and merge them in the graph, but this can still be useful for writing more memory-optimized rules |
|
I'm slightly worried about this (though less so for EnzymeMLIR than Enzyme proper) because the analysis that detects whether a value is needed for the reverse looks at users of the value, and doesn't consider "itself" as a potential user. In principle, is there a reason why re-running the op wouldn't suffice (and CSE-style optimizations won't make it equivalent perf)? |
The push/pop min-cut does not currently treat the recomputation as equivalent, and as such will always choose to cache the operand of the original op instead of its result.

    scf.for {
      enzyme.push %v
      %0 = math.exp %v
      ...
    }
    scf.for {
      %v = enzyme.pop
      %dx = math.exp %v
      ...
    }

could be

    scf.for {
      %0 = math.exp %v
      enzyme.push %0
      ...
    }
    scf.for {
      %dx = enzyme.pop
      ...
    }

because the graph looks something like this, so it is less costly to store %v than its result |
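For illustration only, here is a minimal Python sketch of the two caching strategies above, assuming the cache behaves like a simple stack. The function names and the list-based `cache` are hypothetical stand-ins for `enzyme.push` / `enzyme.pop` and are not part of Enzyme. It relies only on the fact that the derivative of `exp(v)` is `exp(v)` itself, so the forward result can be reused directly in the reverse pass.

```python
import math


def grad_cache_operand(values):
    """Cache the operand %v and recompute exp in the reverse loop."""
    cache = []  # hypothetical stand-in for the enzyme.push / enzyme.pop cache
    total = 0.0
    for v in values:
        cache.append(v)           # enzyme.push %v
        total += math.exp(v)      # %0 = math.exp %v
    grads = []
    for _ in values:
        v = cache.pop()           # %v = enzyme.pop
        grads.append(math.exp(v)) # %dx = math.exp %v (recomputed)
    grads.reverse()               # the stack pops in reverse iteration order
    return total, grads


def grad_cache_result(values):
    """Cache the result %0 and reuse it as the derivative in the reverse loop."""
    cache = []
    total = 0.0
    for v in values:
        r = math.exp(v)           # %0 = math.exp %v
        cache.append(r)           # enzyme.push %0
        total += r
    grads = []
    for _ in values:
        grads.append(cache.pop()) # %dx = enzyme.pop (no recomputation)
    grads.reverse()
    return total, grads


if __name__ == "__main__":
    xs = [0.0, 1.0, -2.0]
    print(grad_cache_operand(xs))
    print(grad_cache_result(xs))
```

Both variants cache one value per iteration and yield the same gradients; the second simply skips the extra `exp` in the reverse loop.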
|
That makes sense, though I feel like we should min cut here (and that would regardless be useful for other things too) |
Adds a ResultIndex primitive to refer to the result value of an operation in the tablegen rules.
cc @mofeing, as we had discussed this.