feat: report golf compile cost (heartbeats + time) in /check-golf #1356
Vilin97 wants to merge 2 commits into
Add scripts/check_golf.py and a check-golf GitHub workflow. When a `/check-golf` comment is posted on a PR, the bot compares the PR's base and head, checks that no declaration statement (signature/type) changed and only proofs/bodies changed, and upserts a single findings comment (posting it once, editing it on later runs).

The Lean sources are parsed textually (no build): comments and whitespace are ignored, declarations are segmented on column-0 command starts, each statement is split from its body at the top-level `:=`, `where`, or equation `|` arm, and `by` proof terms embedded inside a type are treated as proof-irrelevant.

Co-authored-by: Claude Opus 4.8 <no-reply+claude-opus-4-8@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Vjiv6nckrp5cJT9rpgZaKT
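The statement/body split described in this commit message can be illustrated with a minimal sketch. It is not the code in scripts/check_golf.py: the function name `split_statement` is hypothetical, only the `:=` case is handled (not `where` or equation `|` arms), and comments are assumed to have been stripped already.

```python
def split_statement(decl: str) -> tuple[str, str]:
    """Split a declaration into (statement, body) at the first top-level ':='.

    Hypothetical helper for illustration only. A ':=' nested inside brackets
    (for example a binder default such as `(n : Nat := 0)`) is skipped.
    """
    openers, closers = "([{⟨", ")]}⟩"
    depth = 0
    for i, ch in enumerate(decl):
        if ch in openers:
            depth += 1
        elif ch in closers:
            depth = max(depth - 1, 0)
        elif depth == 0 and decl.startswith(":=", i):
            # Everything before ':=' is the statement; the rest is the body.
            return decl[:i].rstrip(), decl[i + 2:].lstrip()
    # No top-level ':=' found: treat the whole text as the statement.
    return decl.strip(), ""


stmt, body = split_statement("theorem foo (n : Nat := 0) : n = n := by rfl")
print(stmt)  # theorem foo (n : Nat := 0) : n = n
print(body)  # by rfl
```

Comparing the `stmt` halves of the base and head versions, while ignoring the `body` halves, is the kind of check the message describes.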
Extend check_golf.py with `--measure`: for every changed file it compiles the base and head versions and diffs their heartbeats and wall-clock time, adding a "Compile cost" section to the PR comment.

Heartbeats are measured with Mathlib's `#count_heartbeats in`, which is only accurate under `Elab.async false` (otherwise the proof elaborates in a background task the counter cannot see). The instrumenter forces synchronous elaboration and prefixes the counter before each declaration, placing it above any attached doc comment so the declaration stays well-formed.

The check-golf workflow now builds the head revision (`lake exe cache get` + `lake build`) before measuring, so the base sources can be compiled against the built environment.

Co-authored-by: Claude Opus 4.8 <no-reply+claude-opus-4-8@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Vjiv6nckrp5cJT9rpgZaKT
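A rough sketch of the instrumentation step described in this commit message follows. It is an illustration under stated assumptions, not the script's actual code: the names `instrument`, `DECL_START`, `COUNTER`, and `HEADER` are hypothetical, only `theorem`, `lemma`, and `def` at column 0 are recognised, and ordinary block comments are assumed not to contain lines that look like declarations.

```python
import re

# Assumption: a simplified keyword set; the real script may recognise more.
DECL_START = re.compile(r"^(theorem|lemma|def)\b")
COUNTER = "#count_heartbeats in"
HEADER = ["import Mathlib.Util.CountHeartbeats", "set_option Elab.async false"]


def instrument(source: str) -> str:
    """Return Lean source with heartbeat counters added (illustrative only)."""
    lines = source.splitlines()
    insert_at = set()   # line indices that get a counter placed above them
    last_import = -1    # index of the last `import` line, if any
    prelude = None      # start of a doc comment / attribute awaiting its declaration
    in_doc = False      # inside a multi-line doc comment
    for i, line in enumerate(lines):
        if in_doc:
            in_doc = "-/" not in line
            continue
        if line.startswith("import "):
            last_import = i
        if line.startswith(("/--", "@[")):
            if prelude is None:
                prelude = i
            if line.startswith("/--"):
                in_doc = "-/" not in line
            continue
        if DECL_START.match(line):
            # Place the counter above the attached doc comment, if there is one.
            insert_at.add(i if prelude is None else prelude)
        if line.strip():
            prelude = None

    out = list(HEADER) if last_import < 0 else []
    for i, line in enumerate(lines):
        if i in insert_at:
            out.append(COUNTER)
        out.append(line)
        if i == last_import:
            out.extend(HEADER)  # header goes right after the existing imports
    return "\n".join(out) + "\n"


print(instrument("import Mathlib\n\n/-- Doc. -/\ntheorem foo : 1 = 1 := rfl\n"))
```

Running the same transformation over the base and head versions of a file, then compiling each, gives the two heartbeat totals that the "Compile cost" section compares.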
Stacked on #1355; please review that first. This PR adds one commit on top of #1355; its net diff over `master` therefore also contains #1355's script. Once #1355 merges, this becomes a small delta.

## What
Extends the `/check-golf` bot so its comment also reports the compile cost of a golf: for every changed file it compiles the base and head versions and diffs their heartbeats and wall-clock time.

## Changes (on top of #1355)
- `scripts/check_golf.py`: new `--measure` flag. Measures heartbeats by importing `Mathlib.Util.CountHeartbeats`, forcing `set_option Elab.async false`, and prefixing `#count_heartbeats in` before every declaration (placed above any attached doc comment so the declaration stays well-formed).
- `.github/workflows/check-golf.yml`: builds the head revision (`lake exe cache get` + `lake build`) before measuring, then runs with `--measure`. The base sources are compiled against the built head environment, which is sound because the check has already verified the statements are unchanged.
- `scripts/README.md`: documents `--measure`.

## Why the async gymnastics
`#count_heartbeats` only sees ~0 heartbeats under Lean's default async elaboration (the proof runs in a background task). `set_option Elab.async false` makes the count accurate, as confirmed by measuring real proofs (e.g. a `ring` proof reports ~1.5k heartbeats instead of ~40).

## Testing
Validated locally on golfed files, e.g. `HarmonicOscillator/Solution.lean`: base 71,914 → head 71,979 heartbeats (+65) and 7.39s → 7.17s, i.e. this golf left compile cost essentially unchanged.
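The arithmetic behind a line like the one above is simple enough to sketch. The helper names `format_cost` and `relative_change` are hypothetical and the output format merely mirrors the example; the real "Compile cost" section may be laid out differently.

```python
def format_cost(base_hb: int, head_hb: int, base_s: float, head_s: float) -> str:
    """Render a base/head comparison in the style of the example (illustrative)."""
    delta = head_hb - base_hb
    return (
        f"base {base_hb:,} → head {head_hb:,} heartbeats ({delta:+,}) "
        f"and {base_s:.2f}s → {head_s:.2f}s"
    )


def relative_change(base: float, head: float) -> float:
    """Percentage change from base to head."""
    return (head - base) / base * 100.0


print(format_cost(71914, 71979, 7.39, 7.17))
# base 71,914 → head 71,979 heartbeats (+65) and 7.39s → 7.17s
print(f"{relative_change(71914, 71979):+.2f}%")
# +0.09%
```

With the numbers reported here, the heartbeat count moves by roughly +0.09% and the wall-clock time drops by about 3%, which is consistent with calling the cost essentially unchanged.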