evo2 SAE recipe: live inference engine + steering server + CLI #1622
Open
polinabinder1 wants to merge 20 commits into main from pbinder/evo2-sae-serve
Changes from 7 commits
Commits
538592f (polinabinder1) evo2 SAE recipe: live inference engine + steering server + CLI (src/e…
d6158e5 (polinabinder1) evo2 infer: default launch_inference.sh to the 7B/layer-26 model + it…
2212289 (polinabinder1) evo2 infer: wrap encode forward in bf16 autocast (TransformerEngine d…
92b7c85 (polinabinder1) evo2 infer: robust steering test (discovered active feature) + encode…
c5e4d78 (polinabinder1) evo2 SAE recipe: add recipe README (run guide)
d0450ff (polinabinder1) evo2 README: clarify dashboard launch (--data-dir optional; atlas vs …
de81106 (polinabinder1) evo2: drop premature recipe README from the inference PR
374233f (polinabinder1) Address review: required env config, header edge case, mode validation
bb38064 (polinabinder1) launch_inference.sh: correct usage header (EVO2_CKPT_DIR/SAE_CKPT_PAT…
4a0de59 (polinabinder1) Dedupe FASTA parsing into shared evo2_sae.fasta.read_fasta
13ff76d (polinabinder1) evo2 infer: drop test_empty_file from test_fasta
46dae0d (polinabinder1) evo2 serve: clamp with sae.steering (B) from the start — no throwaway…
89168b4 (polinabinder1) evo2 serve: fold in the rest of steering (generate CLI + steer.py har…
b37d334 (polinabinder1) evo2 serve: consolidate steer.py onto Evo2SAE.generate; drop fasta/cl…
d1f888d (polinabinder1) evo2 serve: move steer.py to its own steering-analysis PR
1489955 (polinabinder1) evo2 serve: split server/CLI out -> inference engine only
f310289 (polinabinder1) evo2 serve: steering safety — fix gen double-init + guard bad clamps
65a2439 (polinabinder1) style(core): ruff-format blank line after in-function import (fix pre…
a1f1d54 (polinabinder1) evo2 serve: fix temperature=0 NaN + factor steering guards into a tes…
baf6ddf (polinabinder1) evo2 serve: drop unused unpack var in steering guard test (lint)
31 changes: 31 additions & 0 deletions
...emo-recipes/interpretability/sparse_autoencoders/recipes/evo2/scripts/launch_inference.sh
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,31 @@ | ||
| #!/bin/bash | ||
| # Launch the Evo2 SAE inference engine. One engine, three modes: | ||
| # | ||
| # ./launch_inference.sh serve # live HTTP server on :8001 (viz backend) | ||
| # ./launch_inference.sh encode --sequence ATGC... # annotate ONE sequence -> top features | ||
| # ./launch_inference.sh batch --fasta in.fa --out out.parquet # MANY sequences -> parquet | ||
| # | ||
| # Config via env (sensible defaults below): EVO2_CKPT_DIR, SAE_CKPT_PATH, | ||
| # FEATURE_ANNOTATIONS, EMBEDDING_LAYER, DEVICE, PORT, CUDA_VISIBLE_DEVICES. | ||
| # | ||
| # Requires the evo2_megatron recipe venv (provides bionemo.evo2 + megatron). | ||
| set -euo pipefail | ||
|
|
||
| HERE="$(cd "$(dirname "$0")" && pwd)" | ||
| RECIPE_DIR="$(cd "$HERE/.." && pwd)" # recipes/evo2 — so the evo2_sae package imports | ||
|
|
||
| VENV="${VENV:-/data/pbinder/bionemo-framework/bionemo-recipes/recipes/evo2_megatron/.venv}" | ||
| export EVO2_CKPT_DIR="${EVO2_CKPT_DIR:-/data/interp/evo2/checkpoints/evo2_7b_mbridge}" | ||
| export SAE_CKPT_PATH="${SAE_CKPT_PATH:-/data/interp/evo2/sae/v2_diverse/layer26_7B_ablate_normalize_input/checkpoints/checkpoint_final.pt}" | ||
| export FEATURE_ANNOTATIONS="${FEATURE_ANNOTATIONS:-/data/interp/evo2/sae_eval/dashboard_data/l26_7B_normalize/feature_metadata.parquet}" | ||
| export EMBEDDING_LAYER="${EMBEDDING_LAYER:-26}" | ||
|
|
||
| if [[ ! -x "$VENV/bin/python" ]]; then | ||
| echo "ERROR: evo2_megatron venv not found at $VENV (build it with the recipe's .ci_build.sh)" >&2 | ||
| exit 1 | ||
| fi | ||
|
|
||
| source "$VENV/bin/activate" | ||
| cd "$RECIPE_DIR" | ||
| export PYTHONPATH="$RECIPE_DIR/src${PYTHONPATH:+:$PYTHONPATH}" | ||
| exec python -m evo2_sae.cli "$@" | ||
21 changes: 21 additions & 0 deletions
bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/src/evo2_sae/__init__.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,21 @@ | ||
| # SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. | ||
| # SPDX-License-Identifier: LicenseRef-Apache2 | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
||
| """Evo2 + SAE inference engine — reused by the live server, the batch CLI, and the viz backend.""" | ||
|
|
||
| from .core import DEFAULT_ORGANISM_TAGS, Evo2SAE, clean_dna | ||
|
|
||
|
|
||
| __all__ = ["DEFAULT_ORGANISM_TAGS", "Evo2SAE", "clean_dna"] |
155 changes: 155 additions & 0 deletions
bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/src/evo2_sae/cli.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,155 @@ | ||
| # SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. | ||
| # SPDX-License-Identifier: LicenseRef-Apache2 | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
||
| """Evo2 SAE inference CLI — one engine, three modes. | ||
|
|
||
| serve : start the FastAPI server (one sequence at a time, interactive) | ||
| encode : annotate ONE sequence -> top features (stdout JSON) | ||
| batch : run a FASTA of MANY sequences -> parquet of per-sequence top features | ||
|
|
||
| All three build the same `Evo2SAE` engine; config comes from flags or env | ||
| (EVO2_CKPT_DIR / SAE_CKPT_PATH / FEATURE_ANNOTATIONS / EMBEDDING_LAYER). | ||
| """ | ||
|
|
||
| from __future__ import annotations | ||
|
|
||
| import argparse | ||
| import gzip | ||
| import json | ||
| import os | ||
|
|
||
|
|
||
| def _add_common(p: argparse.ArgumentParser) -> None: | ||
| p.add_argument( | ||
| "--evo2-ckpt-dir", | ||
| default=os.environ.get("EVO2_CKPT_DIR", "/data/interp/evo2/checkpoints/evo2_1b_base_mbridge"), | ||
| ) | ||
| p.add_argument( | ||
| "--sae-ckpt-path", | ||
| default=os.environ.get( | ||
| "SAE_CKPT_PATH", "/data/interp/evo2/sae/v2_diverse/layer19_C13_nofilter/checkpoints/checkpoint_final.pt" | ||
| ), | ||
| ) | ||
| p.add_argument( | ||
| "--feature-annotations", | ||
| default=os.environ.get( | ||
| "FEATURE_ANNOTATIONS", | ||
| "/data/interp/evo2/sae_eval/dashboard_data/l19_C13_nofilter/feature_metadata.parquet", | ||
| ), | ||
| ) | ||
| p.add_argument("--layer", type=int, default=int(os.environ.get("EMBEDDING_LAYER", "19"))) | ||
| p.add_argument("--device", default=os.environ.get("DEVICE", "cuda")) | ||
| p.add_argument("--max-seq-len", type=int, default=int(os.environ.get("MAX_SEQ_LEN", "8192"))) | ||
|
|
||
|
|
||
|
|
||
| def _engine(args): | ||
| from .core import Evo2SAE | ||
|
|
||
| return Evo2SAE( | ||
| evo2_ckpt_dir=args.evo2_ckpt_dir, | ||
| sae_ckpt_path=args.sae_ckpt_path, | ||
| layer=args.layer, | ||
| device=args.device, | ||
| max_seq_len=args.max_seq_len, | ||
| feature_annotations=args.feature_annotations, | ||
| ) | ||
|
|
||
|
|
||
|
|
||
| def _read_fasta(path: str): | ||
| seqs, ids = [], [] | ||
| name, parts = None, [] | ||
| opener = gzip.open if str(path).endswith(".gz") else open | ||
| with opener(path, "rt") as f: | ||
| for line in f: | ||
| line = line.rstrip() | ||
| if line.startswith(">"): | ||
| if name is not None: | ||
| seqs.append("".join(parts)) | ||
| ids.append(name) | ||
| name, parts = line[1:].split()[0] if len(line) > 1 else f"seq_{len(ids)}", [] | ||
| else: | ||
| parts.append(line) | ||
| if name is not None: | ||
| seqs.append("".join(parts)) | ||
| ids.append(name) | ||
| return ids, seqs | ||
|
|
||
|
|
||
|
|
||
| def main(): | ||
| """Parse args and dispatch to the serve / encode / batch subcommand.""" | ||
| ap = argparse.ArgumentParser(description="Evo2 SAE inference (serve | encode | batch)") | ||
| sub = ap.add_subparsers(dest="cmd", required=True) | ||
|
|
||
| ps = sub.add_parser("serve", help="start the FastAPI inference server") | ||
| _add_common(ps) | ||
| ps.add_argument("--host", default="0.0.0.0") | ||
| ps.add_argument("--port", type=int, default=int(os.environ.get("PORT", "8001"))) | ||
|
|
||
| pe = sub.add_parser("encode", help="annotate ONE sequence -> top features (JSON)") | ||
| _add_common(pe) | ||
| pe.add_argument("--sequence", required=True) | ||
| pe.add_argument("--organism", default="None (raw DNA)") | ||
| pe.add_argument("--top-k", type=int, default=8) | ||
|
|
||
| pb = sub.add_parser("batch", help="MANY sequences (FASTA) -> parquet of per-sequence top features") | ||
| _add_common(pb) | ||
| pb.add_argument("--fasta", required=True) | ||
| pb.add_argument("--out", required=True) | ||
| pb.add_argument("--top-k", type=int, default=16) | ||
| pb.add_argument("--batch-size", type=int, default=8) | ||
|
|
||
| args = ap.parse_args() | ||
|
|
||
| if args.cmd == "serve": | ||
| import uvicorn | ||
|
|
||
| from .server import build_app | ||
|
|
||
| uvicorn.run(build_app(_engine(args)), host=args.host, port=args.port, log_level="info") | ||
| return | ||
|
|
||
| from .core import clean_dna | ||
|
|
||
| eng = _engine(args).load() | ||
|
|
||
| if args.cmd == "encode": | ||
| tag = eng.resolve_tag(args.organism, None) or "" | ||
| dna = clean_dna(args.sequence) | ||
| codes = eng.encode(tag + dna) | ||
| tag_len = len(tag) if codes.shape[0] >= len(tag) else 0 | ||
| feats = eng.top_features(codes, tag_len=tag_len, k=args.top_k) | ||
| print( | ||
| json.dumps( | ||
| {"sequence": dna, "organism": args.organism, "bases": len(dna), "top_features": feats}, indent=2 | ||
| ) | ||
| ) | ||
|
|
||
| elif args.cmd == "batch": | ||
| import pandas as pd | ||
|
|
||
| ids, seqs = _read_fasta(args.fasta) | ||
| print(f"[batch] {len(seqs)} sequences from {args.fasta}; encoding (batch_size={args.batch_size})…") | ||
| codes_list = eng.encode_batch(seqs, batch_size=args.batch_size) | ||
| rows = [] | ||
| for sid, codes in zip(ids, codes_list): | ||
| for rank, ft in enumerate(eng.top_features(codes, k=args.top_k)): | ||
| rows.append({"sequence_id": sid, "bp": int(codes.shape[0]), "rank": rank, **ft}) | ||
| df = pd.DataFrame(rows) | ||
| df.to_parquet(args.out, index=False) | ||
| print(f"[batch] wrote {len(df)} rows for {len(seqs)} sequences -> {args.out}") | ||
|
|
||
|
|
||
| if __name__ == "__main__": | ||
| main() | ||
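
The `_read_fasta` helper in the diff above can be exercised in isolation. The following is a minimal standalone sketch of the same parsing idea, not the PR's final code: commit 4a0de59 later moves FASTA parsing into a shared `evo2_sae.fasta.read_fasta`, whose contents are not shown here. The names `read_fasta` and `demo` are hypothetical, and skipping blank lines is a choice of this sketch rather than something the diff does explicitly.

```python
import gzip
import tempfile
from pathlib import Path


def read_fasta(path):
    """Return (ids, seqs) from a plain or gzip-compressed FASTA file."""
    ids, seqs = [], []
    name, parts = None, []
    # Same dispatch as the diff: a ".gz" suffix selects gzip, anything else plain open.
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        for raw in handle:
            line = raw.strip()
            if not line:
                continue  # assumption of this sketch: blank lines are skipped
            if line.startswith(">"):
                if name is not None:
                    ids.append(name)
                    seqs.append("".join(parts))
                header = line[1:].split()
                # Record id is the first whitespace-delimited token of the header;
                # an empty header falls back to a positional id, as in the diff.
                name = header[0] if header else f"seq_{len(ids)}"
                parts = []
            elif name is not None:
                parts.append(line)
            # Sequence lines before the first header are dropped.
    if name is not None:
        ids.append(name)
        seqs.append("".join(parts))
    return ids, seqs


def demo():
    """Parse the same records from a plain and a gzip-compressed file."""
    text = ">chr1 description here\nACGT\nTTAA\n>\nGG\n"
    with tempfile.TemporaryDirectory() as tmp:
        plain = Path(tmp) / "in.fa"
        plain.write_text(text)
        zipped = Path(tmp) / "in.fa.gz"
        with gzip.open(zipped, "wt") as fh:
            fh.write(text)
        return read_fasta(plain), read_fasta(zipped)
```

Multi-line records are concatenated into one sequence, and only the first token of each header is kept as the id, which is what the `batch` subcommand writes to the `sequence_id` column.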
Hardcoded development-specific paths will break for other users.
The default paths for `VENV`, `EVO2_CKPT_DIR`, `SAE_CKPT_PATH`, and `FEATURE_ANNOTATIONS` are hardcoded to specific locations under `/data/pbinder/` and `/data/interp/` that won't exist on other machines or in CI. While these can be overridden via environment variables, the script will fail immediately for any new user who tries to run it without knowing the exact environment setup required.
Consider one of these approaches:
[…] `RECIPE_DIR/.venv` for `VENV`)
🔧 Example: Require explicit environment variables
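
The reviewer's own example is not preserved on this page. As an illustration only, the same "require explicit configuration" idea could be applied to the Python CLI's `_add_common`, which carries similar hardcoded fallbacks. This is a sketch under assumptions: the helper `env_or_required` and the function `build_parser` are hypothetical names, and this is not necessarily how commit 374233f ("required env config") implemented it.

```python
import argparse
import os


def env_or_required(parser, flag, env_var, **kwargs):
    """Add `flag`, defaulting to $env_var; require the flag when the variable is unset."""
    value = os.environ.get(env_var)
    parser.add_argument(
        flag,
        default=value,
        # No machine-specific fallback path: with neither the flag nor the
        # environment variable, argparse exits with a usage error.
        required=value is None,
        help=f"defaults to ${env_var}; required if that variable is unset",
        **kwargs,
    )


def build_parser():
    """Hypothetical parser with two of the options from the diff."""
    parser = argparse.ArgumentParser(description="config sketch (hypothetical)")
    env_or_required(parser, "--evo2-ckpt-dir", "EVO2_CKPT_DIR")
    env_or_required(parser, "--sae-ckpt-path", "SAE_CKPT_PATH")
    return parser
```

With this shape a missing setting fails at argument parsing, naming the flag that is missing, rather than later when a checkpoint path under `/data/` turns out not to exist.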