Changes from 7 commits
Commits
20 commits
538592f
evo2 SAE recipe: live inference engine + steering server + CLI (src/e…
polinabinder1 Jun 10, 2026
d6158e5
evo2 infer: default launch_inference.sh to the 7B/layer-26 model + it…
polinabinder1 Jun 10, 2026
2212289
evo2 infer: wrap encode forward in bf16 autocast (TransformerEngine d…
polinabinder1 Jun 10, 2026
92b7c85
evo2 infer: robust steering test (discovered active feature) + encode…
polinabinder1 Jun 10, 2026
c5e4d78
evo2 SAE recipe: add recipe README (run guide)
polinabinder1 Jun 10, 2026
d0450ff
evo2 README: clarify dashboard launch (--data-dir optional; atlas vs …
polinabinder1 Jun 10, 2026
de81106
evo2: drop premature recipe README from the inference PR
polinabinder1 Jun 10, 2026
374233f
Address review: required env config, header edge case, mode validation
polinabinder1 Jun 11, 2026
bb38064
launch_inference.sh: correct usage header (EVO2_CKPT_DIR/SAE_CKPT_PAT…
polinabinder1 Jun 11, 2026
4a0de59
Dedupe FASTA parsing into shared evo2_sae.fasta.read_fasta
polinabinder1 Jun 11, 2026
13ff76d
evo2 infer: drop test_empty_file from test_fasta
polinabinder1 Jun 11, 2026
46dae0d
evo2 serve: clamp with sae.steering (B) from the start — no throwaway…
polinabinder1 Jun 11, 2026
89168b4
evo2 serve: fold in the rest of steering (generate CLI + steer.py har…
polinabinder1 Jun 11, 2026
b37d334
evo2 serve: consolidate steer.py onto Evo2SAE.generate; drop fasta/cl…
polinabinder1 Jun 11, 2026
d1f888d
evo2 serve: move steer.py to its own steering-analysis PR
polinabinder1 Jun 12, 2026
1489955
evo2 serve: split server/CLI out -> inference engine only
polinabinder1 Jun 12, 2026
f310289
evo2 serve: steering safety — fix gen double-init + guard bad clamps
polinabinder1 Jun 12, 2026
65a2439
style(core): ruff-format blank line after in-function import (fix pre…
polinabinder1 Jun 12, 2026
a1f1d54
evo2 serve: fix temperature=0 NaN + factor steering guards into a tes…
polinabinder1 Jun 12, 2026
baf6ddf
evo2 serve: drop unused unpack var in steering guard test (lint)
polinabinder1 Jun 12, 2026
Original file line number Diff line number Diff line change
@@ -13,13 +13,15 @@ dependencies = [
     "torch>=2.0",
     "numpy>=1.20",
     "pyarrow>=23.0.0",
+    "fastapi>=0.110",
+    "uvicorn>=0.29",
+    "pandas>=1.5",
 ]

-# No package code lives here yet — the recipe is just an entry-point for
-# scripts/ that depends on the shared `sae` workspace package. Declare no
-# packages so setuptools doesn't try to discover anything.
-[tool.setuptools]
-packages = []
+# The `evo2_sae` package (src/) holds the live inference engine + server + CLI;
+# scripts/ (extract, train) are standalone entry points alongside it.
+[tool.setuptools.packages.find]
+where = ["src"]

 [tool.uv.sources]
 sae = { workspace = true }
Original file line number Diff line number Diff line change
@@ -0,0 +1,31 @@
#!/bin/bash
# Launch the Evo2 SAE inference engine. One engine, three modes:
#
# ./launch_inference.sh serve # live HTTP server on :8001 (viz backend)
# ./launch_inference.sh encode --sequence ATGC... # annotate ONE sequence -> top features
# ./launch_inference.sh batch --fasta in.fa --out out.parquet # MANY sequences -> parquet
#
# Config via env (sensible defaults below): EVO2_CKPT_DIR, SAE_CKPT_PATH,
# FEATURE_ANNOTATIONS, EMBEDDING_LAYER, DEVICE, PORT, CUDA_VISIBLE_DEVICES.
#
# Requires the evo2_megatron recipe venv (provides bionemo.evo2 + megatron).
set -euo pipefail

HERE="$(cd "$(dirname "$0")" && pwd)"
RECIPE_DIR="$(cd "$HERE/.." && pwd)" # recipes/evo2 — so the evo2_sae package imports

VENV="${VENV:-/data/pbinder/bionemo-framework/bionemo-recipes/recipes/evo2_megatron/.venv}"
export EVO2_CKPT_DIR="${EVO2_CKPT_DIR:-/data/interp/evo2/checkpoints/evo2_7b_mbridge}"
export SAE_CKPT_PATH="${SAE_CKPT_PATH:-/data/interp/evo2/sae/v2_diverse/layer26_7B_ablate_normalize_input/checkpoints/checkpoint_final.pt}"
export FEATURE_ANNOTATIONS="${FEATURE_ANNOTATIONS:-/data/interp/evo2/sae_eval/dashboard_data/l26_7B_normalize/feature_metadata.parquet}"
export EMBEDDING_LAYER="${EMBEDDING_LAYER:-26}"

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Hardcoded development-specific paths will break for other users.

The default paths for VENV, EVO2_CKPT_DIR, SAE_CKPT_PATH, and FEATURE_ANNOTATIONS are hardcoded to specific locations under /data/pbinder/ and /data/interp/ that won't exist on other machines or in CI. While these can be overridden via environment variables, the script will fail immediately for any new user who tries to run it without knowing the exact environment setup required.

Consider one of these approaches:

  1. Remove hardcoded defaults and require users to set environment variables explicitly
  2. Use relative paths where possible (e.g., RECIPE_DIR/.venv for VENV)
  3. Add a setup/configuration step documented in a README that creates a config file with user-specific paths
  4. Fail with a helpful error if required env vars are not set, rather than falling back to invalid defaults
🔧 Example: Require explicit environment variables
-VENV="${VENV:-/data/pbinder/bionemo-framework/bionemo-recipes/recipes/evo2_megatron/.venv}"
-export EVO2_CKPT_DIR="${EVO2_CKPT_DIR:-/data/interp/evo2/checkpoints/evo2_7b_mbridge}"
-export SAE_CKPT_PATH="${SAE_CKPT_PATH:-/data/interp/evo2/sae/v2_diverse/layer26_7B_ablate_normalize_input/checkpoints/checkpoint_final.pt}"
-export FEATURE_ANNOTATIONS="${FEATURE_ANNOTATIONS:-/data/interp/evo2/sae_eval/dashboard_data/l26_7B_normalize/feature_metadata.parquet}"
+# Require users to set these environment variables
+: "${VENV:?ERROR: VENV must be set to the evo2_megatron recipe venv path}"
+: "${EVO2_CKPT_DIR:?ERROR: EVO2_CKPT_DIR must be set to the Evo2 checkpoint directory}"
+: "${SAE_CKPT_PATH:?ERROR: SAE_CKPT_PATH must be set to the SAE checkpoint path}"
+: "${FEATURE_ANNOTATIONS:?ERROR: FEATURE_ANNOTATIONS must be set to the feature metadata parquet path}"
+export EVO2_CKPT_DIR SAE_CKPT_PATH FEATURE_ANNOTATIONS
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/scripts/launch_inference.sh`
around lines 17 - 21, The script currently embeds development-only absolute
defaults for VENV, EVO2_CKPT_DIR, SAE_CKPT_PATH, and FEATURE_ANNOTATIONS which
will break elsewhere; remove those hardcoded paths and instead either (a) set
VENV to a relative default like RECIPE_DIR/.venv and leave
EVO2_CKPT_DIR/SAE_CKPT_PATH/FEATURE_ANNOTATIONS unset, or (b) require these env
vars be provided and add an explicit validation block that checks VENV,
EVO2_CKPT_DIR, SAE_CKPT_PATH, and FEATURE_ANNOTATIONS (while allowing
EMBEDDING_LAYER to keep a sane numeric default), and if any are missing print a
clear error naming the missing variable(s) and exit non‑zero; update the code
references to VENV, EVO2_CKPT_DIR, SAE_CKPT_PATH, FEATURE_ANNOTATIONS, and
EMBEDDING_LAYER accordingly.


if [[ ! -x "$VENV/bin/python" ]]; then
    echo "ERROR: evo2_megatron venv not found at $VENV (build it with the recipe's .ci_build.sh)" >&2
    exit 1
fi

source "$VENV/bin/activate"
cd "$RECIPE_DIR"
export PYTHONPATH="$RECIPE_DIR/src${PYTHONPATH:+:$PYTHONPATH}"
exec python -m evo2_sae.cli "$@"
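
One way to read the review suggestion above is to collect every missing variable first and then fail once, naming all of them. The sketch below shows that check in Python rather than bash. `REQUIRED_VARS`, `missing_env`, and `require_env` are hypothetical names for illustration, not part of this PR; the variable list is taken from the review comment, and an empty value is treated as missing, matching what `${VAR:?}` does in bash.

```python
from __future__ import annotations

import os
from typing import Mapping

# Variables the review comment asks to be set explicitly (assumption: no others).
REQUIRED_VARS = ("VENV", "EVO2_CKPT_DIR", "SAE_CKPT_PATH", "FEATURE_ANNOTATIONS")


def missing_env(env: Mapping[str, str], required=REQUIRED_VARS) -> list[str]:
    """Return the required names that are unset or empty, in declaration order."""
    return [name for name in required if not env.get(name)]


def require_env(env: Mapping[str, str] = os.environ) -> None:
    """Exit with one message listing every missing variable."""
    missing = missing_env(env)
    if missing:
        raise SystemExit("ERROR: set these environment variables: " + ", ".join(missing))
```

Reporting all missing names at once saves a new user from fixing one variable per run.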
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: LicenseRef-Apache2
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Evo2 + SAE inference engine — reused by the live server, the batch CLI, and the viz backend."""

from .core import DEFAULT_ORGANISM_TAGS, Evo2SAE, clean_dna


__all__ = ["DEFAULT_ORGANISM_TAGS", "Evo2SAE", "clean_dna"]
Original file line number Diff line number Diff line change
@@ -0,0 +1,155 @@
# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: LicenseRef-Apache2
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Evo2 SAE inference CLI — one engine, three modes.

serve : start the FastAPI server (one sequence at a time, interactive)
encode : annotate ONE sequence -> top features (stdout JSON)
batch : run a FASTA of MANY sequences -> parquet of per-sequence top features

All three build the same `Evo2SAE` engine; config comes from flags or env
(EVO2_CKPT_DIR / SAE_CKPT_PATH / FEATURE_ANNOTATIONS / EMBEDDING_LAYER).
"""

from __future__ import annotations

import argparse
import gzip
import json
import os


def _add_common(p: argparse.ArgumentParser) -> None:
    p.add_argument(
        "--evo2-ckpt-dir",
        default=os.environ.get("EVO2_CKPT_DIR", "/data/interp/evo2/checkpoints/evo2_1b_base_mbridge"),
    )
    p.add_argument(
        "--sae-ckpt-path",
        default=os.environ.get(
            "SAE_CKPT_PATH", "/data/interp/evo2/sae/v2_diverse/layer19_C13_nofilter/checkpoints/checkpoint_final.pt"
        ),
    )
    p.add_argument(
        "--feature-annotations",
        default=os.environ.get(
            "FEATURE_ANNOTATIONS",
            "/data/interp/evo2/sae_eval/dashboard_data/l19_C13_nofilter/feature_metadata.parquet",
        ),
    )
    p.add_argument("--layer", type=int, default=int(os.environ.get("EMBEDDING_LAYER", "19")))
    p.add_argument("--device", default=os.environ.get("DEVICE", "cuda"))
    p.add_argument("--max-seq-len", type=int, default=int(os.environ.get("MAX_SEQ_LEN", "8192")))
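
In this revision the path flags still fall back to absolute defaults when the environment is empty. A minimal sketch of an alternative, assuming plain `argparse`: use the environment variable as the default and mark the flag required only when that variable is unset, so an explicit flag always wins. `add_env_backed_arg` and `build_parser` are hypothetical helpers, not code from this PR.

```python
import argparse
import os


def add_env_backed_arg(parser, flag, env_name, env=os.environ):
    """Add `flag`, defaulting to env[env_name]; required only if that is unset or empty."""
    value = env.get(env_name) or None
    parser.add_argument(flag, default=value, required=value is None)


def build_parser(env=os.environ):
    """Build a demo parser with two env-backed path flags and no hardcoded paths."""
    p = argparse.ArgumentParser(prog="demo")
    add_env_backed_arg(p, "--evo2-ckpt-dir", "EVO2_CKPT_DIR", env)
    add_env_backed_arg(p, "--sae-ckpt-path", "SAE_CKPT_PATH", env)
    return p
```

With neither the flag nor the variable set, `argparse` prints its usual "required" error and exits with status 2.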


def _engine(args):
    from .core import Evo2SAE

    return Evo2SAE(
        evo2_ckpt_dir=args.evo2_ckpt_dir,
        sae_ckpt_path=args.sae_ckpt_path,
        layer=args.layer,
        device=args.device,
        max_seq_len=args.max_seq_len,
        feature_annotations=args.feature_annotations,
    )


def _read_fasta(path: str):
    seqs, ids = [], []
    name, parts = None, []
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as f:
        for line in f:
            line = line.rstrip()
            if line.startswith(">"):
                if name is not None:
                    seqs.append("".join(parts))
                    ids.append(name)
                name, parts = line[1:].split()[0] if len(line) > 1 else f"seq_{len(ids)}", []
            else:
                parts.append(line)
    if name is not None:
        seqs.append("".join(parts))
        ids.append(name)
    return ids, seqs
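
To make the parsing rules concrete, here is a sketch that applies the same logic to an in-memory list of lines instead of a file. `parse_fasta_lines` is a hypothetical name, not a function in this PR. It keeps the first whitespace-delimited token of each header as the ID, falls back to `seq_<n>` for a bare `>`, and joins wrapped sequence lines.

```python
from __future__ import annotations

from typing import Iterable


def parse_fasta_lines(lines: Iterable[str]) -> tuple[list[str], list[str]]:
    """Parse FASTA records from lines; returns (ids, seqs) in file order."""
    ids, seqs = [], []
    name, parts = None, []
    for raw in lines:
        line = raw.rstrip()
        if line.startswith(">"):
            # Flush the previous record before starting a new one.
            if name is not None:
                ids.append(name)
                seqs.append("".join(parts))
            fields = line[1:].split()
            # A bare ">" header gets a positional fallback ID.
            name = fields[0] if fields else f"seq_{len(ids)}"
            parts = []
        else:
            parts.append(line)
    # Flush the final record, if any header was seen.
    if name is not None:
        ids.append(name)
        seqs.append("".join(parts))
    return ids, seqs
```

For example, the lines `>chr1 desc`, `ACGT`, `TT`, `>`, `GG` yield the IDs `chr1` and `seq_1` with sequences `ACGTTT` and `GG`.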


def main():
    """Parse args and dispatch to the serve / encode / batch subcommand."""
    ap = argparse.ArgumentParser(description="Evo2 SAE inference (serve | encode | batch)")
    sub = ap.add_subparsers(dest="cmd", required=True)

    ps = sub.add_parser("serve", help="start the FastAPI inference server")
    _add_common(ps)
    ps.add_argument("--host", default="0.0.0.0")
    ps.add_argument("--port", type=int, default=int(os.environ.get("PORT", "8001")))

    pe = sub.add_parser("encode", help="annotate ONE sequence -> top features (JSON)")
    _add_common(pe)
    pe.add_argument("--sequence", required=True)
    pe.add_argument("--organism", default="None (raw DNA)")
    pe.add_argument("--top-k", type=int, default=8)

    pb = sub.add_parser("batch", help="MANY sequences (FASTA) -> parquet of per-sequence top features")
    _add_common(pb)
    pb.add_argument("--fasta", required=True)
    pb.add_argument("--out", required=True)
    pb.add_argument("--top-k", type=int, default=16)
    pb.add_argument("--batch-size", type=int, default=8)

    args = ap.parse_args()

    if args.cmd == "serve":
        import uvicorn

        from .server import build_app

        uvicorn.run(build_app(_engine(args)), host=args.host, port=args.port, log_level="info")
        return

    from .core import clean_dna

    eng = _engine(args).load()

    if args.cmd == "encode":
        tag = eng.resolve_tag(args.organism, None) or ""
        dna = clean_dna(args.sequence)
        codes = eng.encode(tag + dna)
        tag_len = len(tag) if codes.shape[0] >= len(tag) else 0
        feats = eng.top_features(codes, tag_len=tag_len, k=args.top_k)
        print(
            json.dumps(
                {"sequence": dna, "organism": args.organism, "bases": len(dna), "top_features": feats}, indent=2
            )
        )

    elif args.cmd == "batch":
        import pandas as pd

        ids, seqs = _read_fasta(args.fasta)
        print(f"[batch] {len(seqs)} sequences from {args.fasta}; encoding (batch_size={args.batch_size})…")
        codes_list = eng.encode_batch(seqs, batch_size=args.batch_size)
        rows = []
        for sid, codes in zip(ids, codes_list):
            for rank, ft in enumerate(eng.top_features(codes, k=args.top_k)):
                rows.append({"sequence_id": sid, "bp": int(codes.shape[0]), "rank": rank, **ft})
        df = pd.DataFrame(rows)
        df.to_parquet(args.out, index=False)
        print(f"[batch] wrote {len(df)} rows for {len(seqs)} sequences -> {args.out}")


if __name__ == "__main__":
    main()
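
The `batch` branch above writes a long-format table: one row per (sequence, rank) pair, with the feature's own fields merged in. The sketch below isolates that flattening step with stand-in data so the output shape is visible without loading a model. `flatten_rows` is a hypothetical helper, and the `feature_id` / `activation` keys in the usage note are assumptions; the diff does not show what `top_features` actually returns.

```python
def flatten_rows(ids, lengths, top_features_per_seq):
    """Flatten per-sequence top-feature lists into one dict per (sequence, rank)."""
    rows = []
    for sid, n_bp, feats in zip(ids, lengths, top_features_per_seq):
        for rank, ft in enumerate(feats):
            # Feature fields are merged after the fixed columns, as in the batch branch.
            rows.append({"sequence_id": sid, "bp": n_bp, "rank": rank, **ft})
    return rows
```

Two sequences with one and two top features respectively produce three rows with ranks 0, 0, and 1. A sequence with no features contributes no rows, so it would be absent from the parquet output.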