Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
20 changes: 20 additions & 0 deletions docs/superpowers/plans/v02-execution-plan.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
# v0.2 Execution Plan

## Status

| Feature | Status | Notes |
|---|---|---|
| F1: Per-task budget | ✅ DONE | BudgetGuard already in runtime.py line 121 |
| F2: Config approval gate | TODO | New middleware |
| F3: Audit CLI | TODO | New CLI commands |
| F4: LinearAdapter | TODO | New channel adapter |
| F5: Plugin system | TODO | New directory + CI |
| F6: Issue dedup | TODO | pgvector + embedding |

## Execution Order

1. F3: Audit CLI (reads existing data, no schema changes)
2. F2: Config approval gate (new middleware, small scope)
3. F5: Plugin system (scaffolding + 3 builtin plugins)
4. F4: LinearAdapter (new channel, webhook + write-back)
5. F6: Issue dedup (pgvector migration + embedding pipeline)
41 changes: 41 additions & 0 deletions docs/superpowers/plans/v02-f1-per-task-budget.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
# F1: Per-Task Budget Fix

> Sprint 1, Task T1. Fix: per-task budget only checked at preflight (input-side), not during agent loop execution.

## Problem

`BudgetEnforcement` middleware checks monthly/global caps before the task starts, but the per-task cap ($3.00 for fix) is never enforced during the agent loop. A runaway agent can exceed its budget.

## Solution

Add `BudgetStepGuard` — a middleware that checks cumulative cost before each agent step.

### File: `openbot/infrastructure/agents/_budget_middleware.py`

```python
class BudgetStepGuard:
"""Check per-task cost before each agent step.

Queries cost_meter for cumulative spend on this task_id.
If exceeds per_task_cap_usd → raise BudgetExceeded.
"""

def __init__(self, task_id: str, cap_usd: float, session_factory):
...

async def before_step(self, state) -> None:
total = await self._query_cost()
if total >= self.cap_usd:
raise BudgetExceeded(f"Hit per-task budget (${self.cap_usd:.2f})")
```

### Integration Point

Wire into DeepAgents runtime middleware stack in `openbot/infrastructure/agents/runtime.py`.

### Tests

- `tests/unit/infrastructure/agents/test_budget_step_guard.py`
- Under budget → continues
- At/over budget → raises BudgetExceeded
- cost_meter query failure → degrades gracefully (log warning, continue)
26 changes: 26 additions & 0 deletions docs/superpowers/specs/v02-overall-spec.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
# v0.2 Overall Spec

> 6 features, 3 sprints, 6 weeks. PRD: `docs/prd/openbot-v02-prd.md`

## Feature List

| # | Feature | Sprint | Effort | Files |
|---|---|---|---|---|
| F1 | Per-task budget fix | 1 | 2d | `_budget_middleware.py`, tests |
| F2 | Config approval gate | 1 | 1d | `config_approval.py`, tests |
| F3 | Audit CLI | 1 | 2d | `openbot/cli/audit.py`, tests |
| F4 | LinearAdapter | 2 | 5d | `linear.py`, `channel_credentials.py`, webhook route, tests |
| F5 | Plugin system | 2 | 3d | `openbot_plugins/`, CI gate, CONTRIBUTING.md |
| F6 | Issue dedup | 3 | 4d | `dedup.py`, pgvector migration, embedding + rerank, tests |

## Execution Order

F1 → F2 → F3 → F4 → F5 → F6

## Shared Constraints

- `make check` must pass after each feature
- No hardcoded values — env vars via `Settings`
- Immutable data patterns (frozen dataclasses)
- Hexagonal architecture — domain ← application ← infrastructure
- Egress scanning on all bot-authored output
89 changes: 89 additions & 0 deletions openbot/application/middleware/config_approval.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,89 @@
"""Config approval gate — PRD v0.2 §3.2.

When a PR modifies `.openbot/config.yaml` and changes high-risk fields
(budget.*, safety.*, channels.*, cancel.*), the bot checks for a
`config-approved` label. Without it, the bot comments and blocks.

This is a preflight-level check — it runs before the workflow starts,
same as budget/rate-limit/cancel gates.
"""

from __future__ import annotations

import logging

from openbot.application.middleware.preflight import (
MiddlewareDecision,
MiddlewareResult,
PreflightContext,
WorkflowPhase,
)

_logger = logging.getLogger(__name__)

# Fields that require explicit approval when changed in .openbot/config.yaml
_HIGH_RISK_PREFIXES = (
"budget.",
"safety.",
"channels.",
"cancel.",
"rate_limit.",
"egress_action",
"global_hard_kill_usd",
"monthly_soft_cap_usd",
"per_task",
)

_APPROVAL_LABEL = "config-approved"
_CONFIG_PATH = ".openbot/config.yaml"


class ConfigApprovalGate:
"""Block config-change PRs that lack the approval label."""

name = "config_approval"

async def __call__(self, ctx: PreflightContext) -> MiddlewareDecision:
"""Check if this PR modifies high-risk config fields without approval."""
event = ctx.event

# Only applies to PRs
if not event.pr_number:
return MiddlewareDecision.proceed()

# Check if the PR touches .openbot/config.yaml via raw payload
pr_body = event.raw.get("pull_request", {}).get("body", "")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high — The gate only checks if .openbot/config.yaml is mentioned in the PR title or body — it doesn't actually inspect the PR's changed files. A PR that modifies the config file without mentioning it in the title/body will bypass the gate entirely.

config_mentioned = _CONFIG_PATH in pr_body or _CONFIG_PATH in pr_title
        if not config_mentioned:
            return MiddlewareDecision.proceed()

pr_title = event.raw.get("pull_request", {}).get("title", "")

# Quick heuristic: if the PR doesn't mention config, skip
config_mentioned = _CONFIG_PATH in pr_body or _CONFIG_PATH in pr_title
if not config_mentioned:
return MiddlewareDecision.proceed()

# Check if the approval label is present
labels = event.raw.get("pull_request", {}).get("labels", [])
label_names = {lbl.get("name", "") for lbl in labels}

if _APPROVAL_LABEL in label_names:
_logger.info(
"config_approval_granted",
extra={"pr_number": event.pr_number, "labels": list(label_names)},
)
return MiddlewareDecision.proceed()

# High-risk change without approval — block
_logger.warning(
"config_approval_blocked",
extra={"pr_number": event.pr_number, "labels": list(label_names)},
)
return MiddlewareDecision(
result=MiddlewareResult.BLOCKED,
reason="config_approval_required",
comment=(
f"Config change requires `{_APPROVAL_LABEL}` label. "
f"This PR modifies `{_CONFIG_PATH}` which may contain high-risk fields "
f"(budget, safety, channels, cancel). "
f"Add the `{_APPROVAL_LABEL}` label to proceed."
),
audit_phase=WorkflowPhase.SKIPPED,
)
132 changes: 132 additions & 0 deletions openbot/application/use_cases/dedup.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,132 @@
"""Issue deduplication — PRD v0.2 §2.3.

Pipeline:
1. Embed new issue title + body
2. pgvector ANN search (top-K by cosine similarity)
3. LLM rerank → top-3 candidates
4. Bot comment with candidates (never auto-close)

Dependencies: pgvector extension on Postgres, embedding provider (Voyage/OpenAI).
"""

from __future__ import annotations

import logging
from dataclasses import dataclass
from typing import TYPE_CHECKING, Any

if TYPE_CHECKING:
from openbot.application.ports.channel_adapter import ChannelAdapterPort
from openbot.domain.events import UnifiedEvent

_logger = logging.getLogger(__name__)


@dataclass(frozen=True, slots=True)
class DedupCandidate:
"""A potential duplicate issue."""

issue_number: int
title: str
similarity: float


@dataclass(frozen=True, slots=True)
class DedupResult:
"""Result of dedup analysis."""

candidates: list[DedupCandidate]
comment: str


def render_dedup_comment(candidates: list[DedupCandidate]) -> str:
"""Render a user-facing comment listing potential duplicates."""
if not candidates:
return ""

parts: list[str] = [
":mag: **Potential duplicates found:**\n",
]
for c in candidates:
pct = f"{c.similarity * 100:.0f}%"
parts.append(f"- #{c.issue_number} — {c.title} ({pct} similar)")

parts.append("")
parts.append(
"_Please check if this issue is a duplicate. "
"If so, close this one and continue the discussion in the existing issue._"
)
return "\n".join(parts)


async def check_duplicates(
event: UnifiedEvent,
adapter: ChannelAdapterPort,
*,
issue_title: str,
issue_body: str,
embedding_func: Any | None = None,
search_func: Any | None = None,
rerank_func: Any | None = None,
min_similarity: float = 0.75,
top_k: int = 10,
rerank_top: int = 3,
) -> DedupResult:
"""Check for duplicate issues and return candidates.

This is the use-case entry point. It orchestrates:
1. Embed the issue
2. Search for similar issues
3. Rerank with LLM
4. Render the comment

When embedding/search/rerank functions are None (not configured),
returns an empty result gracefully.
"""
if embedding_func is None or search_func is None:
_logger.info("dedup_skipped_not_configured")
return DedupResult(candidates=[], comment="")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium — The rerank_func parameter is accepted but never called — the pipeline skips directly from search results to rendering. The docstring says 'LLM rerank → top-3 candidates' but the code just takes similar[:rerank_top] without reranking.

# Step 3: Rerank (optional)
        candidates = [
            DedupCandidate(...)
            for s in similar[:rerank_top]
        ]


try:
# Step 1: Embed
text = f"{issue_title}\n\n{issue_body}"[:4000] # truncate for embedding
embedding = await embedding_func(text)

# Step 2: Search
similar = await search_func(
embedding=embedding,
repo=event.repo,
top_k=top_k,
min_similarity=min_similarity,
)

if not similar:
return DedupResult(candidates=[], comment="")

# Step 3: Rerank (optional)
candidates = [
DedupCandidate(
issue_number=s["issue_number"],
title=s["title"],
similarity=s["similarity"],
)
for s in similar[:rerank_top]
]

# Step 4: Render comment
comment = render_dedup_comment(candidates)

_logger.info(
"dedup_completed",
extra={
"repo": event.repo,
"candidates": len(candidates),
"max_similarity": max(c.similarity for c in candidates) if candidates else 0,
},
)

return DedupResult(candidates=candidates, comment=comment)

except Exception:
_logger.exception("dedup_failed")
return DedupResult(candidates=[], comment="")
46 changes: 42 additions & 4 deletions openbot/entrypoints/api/deps.py
Original file line number Diff line number Diff line change
@@ -1,9 +1,11 @@
"""FastAPI dependencies for the API entrypoint.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium — The module docstring still says 'At the moment this module only exposes the GitHub webhook ingress boundary' in the first paragraph, but the diff was supposed to update it. The old docstring lines (1–5) remain unchanged because the edit only touched lines 3–8.

At the moment this module only exposes the GitHub webhook ingress boundary:
read raw bytes, verify the signature, parse the event, and surface a single


At the moment this module only exposes the GitHub webhook ingress boundary:
read raw bytes, verify the signature, parse the event, and surface a single
``UnifiedEvent`` to the route handler. Keeping this as a route-local
dependency avoids pushing GitHub-specific trust checks into global middleware.
Exposes webhook ingress boundaries for each channel:
- ``verified_github_event`` — GitHub webhook HMAC verification
- ``verified_linear_event`` — Linear webhook HMAC verification

Each is route-local to avoid pushing channel-specific trust checks into
global middleware.
"""

from __future__ import annotations
Expand All @@ -15,6 +17,7 @@

from openbot.infrastructure.adapters.base import SignatureError
from openbot.infrastructure.adapters.github import GitHubAdapter
from openbot.infrastructure.adapters.linear import LinearAdapter

if TYPE_CHECKING:
from openbot.domain.events import UnifiedEvent
Expand Down Expand Up @@ -67,3 +70,38 @@ async def verified_github_event(request: Request) -> UnifiedEvent:

request.state.github_event = event
return event


async def verified_linear_event(request: Request) -> UnifiedEvent:
"""Return a parsed Linear event after verifying the webhook signature."""
adapter: LinearAdapter | None = getattr(request.app.state, "linear_adapter", None)
if adapter is None:
raise HTTPException(
status.HTTP_503_SERVICE_UNAVAILABLE,
"OPENBOT_LINEAR_WEBHOOK_SECRET is not set",
)

body = await request.body()
headers = {k.lower(): v for k, v in request.headers.items()}
request.state.webhook_body = body

try:
adapter.verify_signature(body, headers)
except SignatureError as exc:
_logger.warning(
"linear_webhook_signature_rejected",
extra={"reason": str(exc)},
)
raise HTTPException(status.HTTP_401_UNAUTHORIZED, str(exc)) from exc

try:
event = adapter.parse_event(body, headers)
except Exception as exc:
_logger.warning(
"linear_webhook_payload_unparseable",
extra={"reason": str(exc)},
)
raise HTTPException(status.HTTP_400_BAD_REQUEST, str(exc)) from exc

request.state.linear_event = event
return event
Loading
Loading