-
Notifications
You must be signed in to change notification settings - Fork 0
feat(v0.2): LinearAdapter, plugins, dedup, audit CLI, ConfigApprovalGate #103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
f990744
382d4c0
2729770
bbbce2e
bca5206
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,20 @@ | ||
| # v0.2 Execution Plan | ||
|
|
||
| ## Status | ||
|
|
||
| | Feature | Status | Notes | | ||
| |---|---|---| | ||
| | F1: Per-task budget | ✅ DONE | BudgetGuard already in runtime.py line 121 | | ||
| | F2: Config approval gate | TODO | New middleware | | ||
| | F3: Audit CLI | TODO | New CLI commands | | ||
| | F4: LinearAdapter | TODO | New channel adapter | | ||
| | F5: Plugin system | TODO | New directory + CI | | ||
| | F6: Issue dedup | TODO | pgvector + embedding | | ||
|
|
||
| ## Execution Order | ||
|
|
||
| 1. F3: Audit CLI (reads existing data, no schema changes) | ||
| 2. F2: Config approval gate (new middleware, small scope) | ||
| 3. F5: Plugin system (scaffolding + 3 builtin plugins) | ||
| 4. F4: LinearAdapter (new channel, webhook + write-back) | ||
| 5. F6: Issue dedup (pgvector migration + embedding pipeline) |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,41 @@ | ||
| # F1: Per-Task Budget Fix | ||
|
|
||
| > Sprint 1, Task T1. Fix: per-task budget only checked at preflight (input-side), not during agent loop execution. | ||
|
|
||
| ## Problem | ||
|
|
||
| `BudgetEnforcement` middleware checks monthly/global caps before the task starts, but the per-task cap ($3.00 for fix) is never enforced during the agent loop. A runaway agent can exceed its budget. | ||
|
|
||
| ## Solution | ||
|
|
||
| Add `BudgetStepGuard` — a middleware that checks cumulative cost before each agent step. | ||
|
|
||
| ### File: `openbot/infrastructure/agents/_budget_middleware.py` | ||
|
|
||
| ```python | ||
| class BudgetStepGuard: | ||
| """Check per-task cost before each agent step. | ||
|
|
||
| Queries cost_meter for cumulative spend on this task_id. | ||
| If exceeds per_task_cap_usd → raise BudgetExceeded. | ||
| """ | ||
|
|
||
| def __init__(self, task_id: str, cap_usd: float, session_factory): | ||
| ... | ||
|
|
||
| async def before_step(self, state) -> None: | ||
| total = await self._query_cost() | ||
| if total >= self.cap_usd: | ||
| raise BudgetExceeded(f"Hit per-task budget (${self.cap_usd:.2f})") | ||
| ``` | ||
|
|
||
| ### Integration Point | ||
|
|
||
| Wire into DeepAgents runtime middleware stack in `openbot/infrastructure/agents/runtime.py`. | ||
|
|
||
| ### Tests | ||
|
|
||
| - `tests/unit/infrastructure/agents/test_budget_step_guard.py` | ||
| - Under budget → continues | ||
| - At/over budget → raises BudgetExceeded | ||
| - cost_meter query failure → degrades gracefully (log warning, continue) |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,26 @@ | ||
| # v0.2 Overall Spec | ||
|
|
||
| > 6 features, 3 sprints, 6 weeks. PRD: `docs/prd/openbot-v02-prd.md` | ||
|
|
||
| ## Feature List | ||
|
|
||
| | # | Feature | Sprint | Effort | Files | | ||
| |---|---|---|---|---| | ||
| | F1 | Per-task budget fix | 1 | 2d | `_budget_middleware.py`, tests | | ||
| | F2 | Config approval gate | 1 | 1d | `config_approval.py`, tests | | ||
| | F3 | Audit CLI | 1 | 2d | `openbot/cli/audit.py`, tests | | ||
| | F4 | LinearAdapter | 2 | 5d | `linear.py`, `channel_credentials.py`, webhook route, tests | | ||
| | F5 | Plugin system | 2 | 3d | `openbot_plugins/`, CI gate, CONTRIBUTING.md | | ||
| | F6 | Issue dedup | 3 | 4d | `dedup.py`, pgvector migration, embedding + rerank, tests | | ||
|
|
||
| ## Execution Order | ||
|
|
||
| F1 → F2 → F3 → F4 → F5 → F6 | ||
|
|
||
| ## Shared Constraints | ||
|
|
||
| - `make check` must pass after each feature | ||
| - No hardcoded values — env vars via `Settings` | ||
| - Immutable data patterns (frozen dataclasses) | ||
| - Hexagonal architecture — domain ← application ← infrastructure | ||
| - Egress scanning on all bot-authored output |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,89 @@ | ||
| """Config approval gate — PRD v0.2 §3.2. | ||
|
|
||
| When a PR modifies `.openbot/config.yaml` and changes high-risk fields | ||
| (budget.*, safety.*, channels.*, cancel.*), the bot checks for a | ||
| `config-approved` label. Without it, the bot comments and blocks. | ||
|
|
||
| This is a preflight-level check — it runs before the workflow starts, | ||
| same as budget/rate-limit/cancel gates. | ||
| """ | ||
|
|
||
| from __future__ import annotations | ||
|
|
||
| import logging | ||
|
|
||
| from openbot.application.middleware.preflight import ( | ||
| MiddlewareDecision, | ||
| MiddlewareResult, | ||
| PreflightContext, | ||
| WorkflowPhase, | ||
| ) | ||
|
|
||
| _logger = logging.getLogger(__name__) | ||
|
|
||
| # Fields that require explicit approval when changed in .openbot/config.yaml | ||
| _HIGH_RISK_PREFIXES = ( | ||
| "budget.", | ||
| "safety.", | ||
| "channels.", | ||
| "cancel.", | ||
| "rate_limit.", | ||
| "egress_action", | ||
| "global_hard_kill_usd", | ||
| "monthly_soft_cap_usd", | ||
| "per_task", | ||
| ) | ||
|
|
||
| _APPROVAL_LABEL = "config-approved" | ||
| _CONFIG_PATH = ".openbot/config.yaml" | ||
|
|
||
|
|
||
| class ConfigApprovalGate: | ||
| """Block config-change PRs that lack the approval label.""" | ||
|
|
||
| name = "config_approval" | ||
|
|
||
| async def __call__(self, ctx: PreflightContext) -> MiddlewareDecision: | ||
| """Check if this PR modifies high-risk config fields without approval.""" | ||
| event = ctx.event | ||
|
|
||
| # Only applies to PRs | ||
| if not event.pr_number: | ||
| return MiddlewareDecision.proceed() | ||
|
|
||
| # Check if the PR touches .openbot/config.yaml via raw payload | ||
| pr_body = event.raw.get("pull_request", {}).get("body", "") | ||
| pr_title = event.raw.get("pull_request", {}).get("title", "") | ||
|
|
||
| # Quick heuristic: if the PR doesn't mention config, skip | ||
| config_mentioned = _CONFIG_PATH in pr_body or _CONFIG_PATH in pr_title | ||
| if not config_mentioned: | ||
| return MiddlewareDecision.proceed() | ||
|
|
||
| # Check if the approval label is present | ||
| labels = event.raw.get("pull_request", {}).get("labels", []) | ||
| label_names = {lbl.get("name", "") for lbl in labels} | ||
|
|
||
| if _APPROVAL_LABEL in label_names: | ||
| _logger.info( | ||
| "config_approval_granted", | ||
| extra={"pr_number": event.pr_number, "labels": list(label_names)}, | ||
| ) | ||
| return MiddlewareDecision.proceed() | ||
|
|
||
| # High-risk change without approval — block | ||
| _logger.warning( | ||
| "config_approval_blocked", | ||
| extra={"pr_number": event.pr_number, "labels": list(label_names)}, | ||
| ) | ||
| return MiddlewareDecision( | ||
| result=MiddlewareResult.BLOCKED, | ||
| reason="config_approval_required", | ||
| comment=( | ||
| f"Config change requires `{_APPROVAL_LABEL}` label. " | ||
| f"This PR modifies `{_CONFIG_PATH}` which may contain high-risk fields " | ||
| f"(budget, safety, channels, cancel). " | ||
| f"Add the `{_APPROVAL_LABEL}` label to proceed." | ||
| ), | ||
| audit_phase=WorkflowPhase.SKIPPED, | ||
| ) | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,132 @@ | ||
| """Issue deduplication — PRD v0.2 §2.3. | ||
|
|
||
| Pipeline: | ||
| 1. Embed new issue title + body | ||
| 2. pgvector ANN search (top-K by cosine similarity) | ||
| 3. LLM rerank → top-3 candidates | ||
| 4. Bot comment with candidates (never auto-close) | ||
|
|
||
| Dependencies: pgvector extension on Postgres, embedding provider (Voyage/OpenAI). | ||
| """ | ||
|
|
||
| from __future__ import annotations | ||
|
|
||
| import logging | ||
| from dataclasses import dataclass | ||
| from typing import TYPE_CHECKING, Any | ||
|
|
||
| if TYPE_CHECKING: | ||
| from openbot.application.ports.channel_adapter import ChannelAdapterPort | ||
| from openbot.domain.events import UnifiedEvent | ||
|
|
||
| _logger = logging.getLogger(__name__) | ||
|
|
||
|
|
||
| @dataclass(frozen=True, slots=True) | ||
| class DedupCandidate: | ||
| """A potential duplicate issue.""" | ||
|
|
||
| issue_number: int | ||
| title: str | ||
| similarity: float | ||
|
|
||
|
|
||
| @dataclass(frozen=True, slots=True) | ||
| class DedupResult: | ||
| """Result of dedup analysis.""" | ||
|
|
||
| candidates: list[DedupCandidate] | ||
| comment: str | ||
|
|
||
|
|
||
| def render_dedup_comment(candidates: list[DedupCandidate]) -> str: | ||
| """Render a user-facing comment listing potential duplicates.""" | ||
| if not candidates: | ||
| return "" | ||
|
|
||
| parts: list[str] = [ | ||
| ":mag: **Potential duplicates found:**\n", | ||
| ] | ||
| for c in candidates: | ||
| pct = f"{c.similarity * 100:.0f}%" | ||
| parts.append(f"- #{c.issue_number} — {c.title} ({pct} similar)") | ||
|
|
||
| parts.append("") | ||
| parts.append( | ||
| "_Please check if this issue is a duplicate. " | ||
| "If so, close this one and continue the discussion in the existing issue._" | ||
| ) | ||
| return "\n".join(parts) | ||
|
|
||
|
|
||
| async def check_duplicates( | ||
| event: UnifiedEvent, | ||
| adapter: ChannelAdapterPort, | ||
| *, | ||
| issue_title: str, | ||
| issue_body: str, | ||
| embedding_func: Any | None = None, | ||
| search_func: Any | None = None, | ||
| rerank_func: Any | None = None, | ||
| min_similarity: float = 0.75, | ||
| top_k: int = 10, | ||
| rerank_top: int = 3, | ||
| ) -> DedupResult: | ||
| """Check for duplicate issues and return candidates. | ||
|
|
||
| This is the use-case entry point. It orchestrates: | ||
| 1. Embed the issue | ||
| 2. Search for similar issues | ||
| 3. Rerank with LLM | ||
| 4. Render the comment | ||
|
|
||
| When embedding/search/rerank functions are None (not configured), | ||
| returns an empty result gracefully. | ||
| """ | ||
| if embedding_func is None or search_func is None: | ||
| _logger.info("dedup_skipped_not_configured") | ||
| return DedupResult(candidates=[], comment="") | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. medium — The |
||
|
|
||
| try: | ||
| # Step 1: Embed | ||
| text = f"{issue_title}\n\n{issue_body}"[:4000] # truncate for embedding | ||
| embedding = await embedding_func(text) | ||
|
|
||
| # Step 2: Search | ||
| similar = await search_func( | ||
| embedding=embedding, | ||
| repo=event.repo, | ||
| top_k=top_k, | ||
| min_similarity=min_similarity, | ||
| ) | ||
|
|
||
| if not similar: | ||
| return DedupResult(candidates=[], comment="") | ||
|
|
||
| # Step 3: Rerank (optional) | ||
| candidates = [ | ||
| DedupCandidate( | ||
| issue_number=s["issue_number"], | ||
| title=s["title"], | ||
| similarity=s["similarity"], | ||
| ) | ||
| for s in similar[:rerank_top] | ||
| ] | ||
|
|
||
| # Step 4: Render comment | ||
| comment = render_dedup_comment(candidates) | ||
|
|
||
| _logger.info( | ||
| "dedup_completed", | ||
| extra={ | ||
| "repo": event.repo, | ||
| "candidates": len(candidates), | ||
| "max_similarity": max(c.similarity for c in candidates) if candidates else 0, | ||
| }, | ||
| ) | ||
|
|
||
| return DedupResult(candidates=candidates, comment=comment) | ||
|
|
||
| except Exception: | ||
| _logger.exception("dedup_failed") | ||
| return DedupResult(candidates=[], comment="") | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,9 +1,11 @@ | ||
| """FastAPI dependencies for the API entrypoint. | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. medium — The module docstring still says 'At the moment this module only exposes the GitHub webhook ingress boundary' in the first paragraph, but the diff was supposed to update it. The old docstring lines (1–5) remain unchanged because the edit only touched lines 3–8. |
||
|
|
||
| At the moment this module only exposes the GitHub webhook ingress boundary: | ||
| read raw bytes, verify the signature, parse the event, and surface a single | ||
| ``UnifiedEvent`` to the route handler. Keeping this as a route-local | ||
| dependency avoids pushing GitHub-specific trust checks into global middleware. | ||
| Exposes webhook ingress boundaries for each channel: | ||
| - ``verified_github_event`` — GitHub webhook HMAC verification | ||
| - ``verified_linear_event`` — Linear webhook HMAC verification | ||
|
|
||
| Each is route-local to avoid pushing channel-specific trust checks into | ||
| global middleware. | ||
| """ | ||
|
|
||
| from __future__ import annotations | ||
|
|
@@ -15,6 +17,7 @@ | |
|
|
||
| from openbot.infrastructure.adapters.base import SignatureError | ||
| from openbot.infrastructure.adapters.github import GitHubAdapter | ||
| from openbot.infrastructure.adapters.linear import LinearAdapter | ||
|
|
||
| if TYPE_CHECKING: | ||
| from openbot.domain.events import UnifiedEvent | ||
|
|
@@ -67,3 +70,38 @@ async def verified_github_event(request: Request) -> UnifiedEvent: | |
|
|
||
| request.state.github_event = event | ||
| return event | ||
|
|
||
|
|
||
| async def verified_linear_event(request: Request) -> UnifiedEvent: | ||
| """Return a parsed Linear event after verifying the webhook signature.""" | ||
| adapter: LinearAdapter | None = getattr(request.app.state, "linear_adapter", None) | ||
| if adapter is None: | ||
| raise HTTPException( | ||
| status.HTTP_503_SERVICE_UNAVAILABLE, | ||
| "OPENBOT_LINEAR_WEBHOOK_SECRET is not set", | ||
| ) | ||
|
|
||
| body = await request.body() | ||
| headers = {k.lower(): v for k, v in request.headers.items()} | ||
| request.state.webhook_body = body | ||
|
|
||
| try: | ||
| adapter.verify_signature(body, headers) | ||
| except SignatureError as exc: | ||
| _logger.warning( | ||
| "linear_webhook_signature_rejected", | ||
| extra={"reason": str(exc)}, | ||
| ) | ||
| raise HTTPException(status.HTTP_401_UNAUTHORIZED, str(exc)) from exc | ||
|
|
||
| try: | ||
| event = adapter.parse_event(body, headers) | ||
| except Exception as exc: | ||
| _logger.warning( | ||
| "linear_webhook_payload_unparseable", | ||
| extra={"reason": str(exc)}, | ||
| ) | ||
| raise HTTPException(status.HTTP_400_BAD_REQUEST, str(exc)) from exc | ||
|
|
||
| request.state.linear_event = event | ||
| return event | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
high — The gate only checks if
.openbot/config.yamlis mentioned in the PR title or body — it doesn't actually inspect the PR's changed files. A PR that modifies the config file without mentioning it in the title/body will bypass the gate entirely.