Merged
Changes from 9 commits
22 commits
de76536
chore(reviewer-rigor): BP audit + release SKILL YAML fix + version 4.2.0
claude May 28, 2026
ae2bdda
feat(schema): add call_tree category + CALL- id prefix for reviewer c…
claude May 28, 2026
22bdcae
Merge branch 'feat/reviewer-rigor-housekeeping' into feat/reviewer-rigor
claude May 28, 2026
e6975ee
feat(reviewer-rigor): call-tree inspection methodology + ephemeral-ID…
claude May 28, 2026
b76bf72
fix(reviewer-rigor): address Copilot review on PR #41
claude May 29, 2026
f10308d
feat(git-and-github): PR bodies lead with "Why this PR exists" rationale
claude May 29, 2026
ead92c6
fix(report-pipeline): derive severity on-the-fly + schema 3.1.0 addit…
claude May 29, 2026
a631c98
fix(report-pipeline): renderer on-the-fly severity + permalink @{u} +…
claude May 29, 2026
aa89655
style(consolidate): drop unused build_severity_stats import (ruff F401)
claude May 29, 2026
c9f0ccc
feat(review-pr): Pass C v1.1 doc heuristics + regression tests (4.5.0)
claude May 29, 2026
d7c93c8
Merge branch 'main' into feat/report-pipeline-severity
claude Jun 3, 2026
c107c11
fix(report-pipeline): address Copilot review on PR #42
claude Jun 3, 2026
ec9ad84
fix(report-pipeline): address second Copilot pass on PR #42
claude Jun 3, 2026
e33231d
Merge branch 'feat/report-pipeline-severity' into feat/pr-why-template
claude Jun 3, 2026
67c1ce7
Merge remote-tracking branch 'origin/main' into feat/pr-why-template
claude Jun 3, 2026
e24a823
style(test): black-format test_pr_body_template.py
claude Jun 3, 2026
81e54fc
Merge branch 'feat/pr-why-template' into feat/passc-v1.1
claude Jun 3, 2026
2cf5ce8
style(test): ruff E741 + black on test_review_pr_passc.py
claude Jun 3, 2026
5447c29
Merge remote-tracking branch 'origin/main' into feat/passc-v1.1
claude Jun 3, 2026
b941251
fix(review-pr): correct Pass C informational-finding severity floats …
claude Jun 3, 2026
f4dfd62
docs(changelog): correct PR-body-unparseable finding band INFO -> LOW
claude Jun 3, 2026
a9cfcf3
docs(review-pr): correct band-math wording in Pass C scope note + tes…
claude Jun 3, 2026
2 changes: 1 addition & 1 deletion .claude-plugin/plugin.json
@@ -1,6 +1,6 @@
{
"name": "claudius",
-"version": "4.1.7",
+"version": "4.5.0",
"description": "Collection of specialized development agents and skills for Claude Code",
"author": {
"name": "lklimek",
41 changes: 41 additions & 0 deletions CHANGELOG.md
@@ -6,6 +6,47 @@ Format follows [Keep a Changelog](https://keepachangelog.com/). This project use

## [Unreleased]

## [4.5.0] - 2026-05-29

### Changed

- review-pr Pass C v1.1: compound PR titles are split on commas/em-dashes and each topic verified independently with a majority-hits rule; the "undocumented change" trigger keeps its ≥50-LOC threshold but now defines "mentioned" precisely (keyword overlap with ≥1 Summary bullet OR a field-ownership-table row); Summary-heading precedence is fixed as `## Summary` > `### Summary` > `## What changed` (first match wins, bullet-list fallback only when none match); Pass C may optionally set `finding_section.verdict` on its `pr_promises` section (PASS/FAIL/NEEDS_REVIEW) and `metadata.report_type: "pr_audit"` on the envelope.
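
As a rough illustration of the title handling described in this entry, the sketch below splits a compound title into topics and applies a majority rule. The diff does not show how Pass C decides that a topic is verified, so `topic_hits` (a keyword lookup in the diff text), the four-character keyword floor, and the strict-majority threshold are all assumptions, and every function name here is hypothetical.

```python
import re

# Split on commas and em-dashes (U+2014). The exact delimiter set is an assumption.
_TOPIC_SPLIT = re.compile(r"\s*(?:,|\u2014)\s*")


def split_title(title: str) -> list:
    """Return the non-empty topics of a compound PR title."""
    return [part.strip() for part in _TOPIC_SPLIT.split(title) if part.strip()]


def topic_hits(topic: str, diff_text: str) -> bool:
    """Hypothetical check: does any keyword (4+ chars) of the topic occur in the diff?"""
    keywords = re.findall(r"[a-z0-9_]{4,}", topic.lower())
    haystack = diff_text.lower()
    return any(word in haystack for word in keywords)


def title_verified(title: str, diff_text: str) -> bool:
    """Majority-hits rule (assumed strict): more than half of the topics must hit."""
    topics = split_title(title)
    if not topics:
        return False
    hits = sum(topic_hits(topic, diff_text) for topic in topics)
    return hits * 2 > len(topics)
```

Checking each topic on its own means one unverifiable clause in a long title does not fail the whole title, which is presumably the motivation for the majority rule.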

### Fixed

- review-pr Pass C body extraction: a PR body wholly wrapped in a single code fence is now unwrapped and dedented before the column-0-anchored Summary/Out-of-scope regexes run, instead of silently matching nothing; if no Summary header and no top-level bullet list survive, Pass C emits one low-confidence INFO "PR body unparseable" finding rather than skipping silently.
- review-pr Pass C clean-pass output: a fully-clean Pass C now emits `findings: []` plus one INFO "PR self-description verified" finding, making a clean pass distinguishable from "Pass C did not run".
- review-pr Pass C code_snippets `language`: cross-references `claudius:report-format` §code_snippets for allowed `language` values instead of hard-coding `"diff"`.
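
The body-extraction fix and the Summary-heading precedence from this release can be sketched as follows. This is a minimal illustration, not the skill's actual logic: the regexes, the function names, and the choice to leave a body untouched when it contains more than one fence are assumptions.

```python
import re
import textwrap

FENCE = "`" * 3  # three backticks, built indirectly so this example can itself be fenced

# Matches a body that consists of exactly one fenced block (optional info string).
_WRAPPED = re.compile(
    r"\A\s*" + FENCE + r"[^\n]*\n(.*?)\n[ \t]*" + FENCE + r"\s*\Z",
    re.DOTALL,
)

# Precedence order taken from the changelog entry: the first pattern that matches wins.
_HEADINGS = (
    r"^## Summary[ \t]*$",
    r"^### Summary[ \t]*$",
    r"^## What changed[ \t]*$",
)


def unwrap_body(body: str) -> str:
    """Unwrap and dedent a body wholly wrapped in a single code fence."""
    match = _WRAPPED.match(body)
    if match is None or FENCE in match.group(1):
        return body  # not wrapped, or more than one fence: leave untouched
    return textwrap.dedent(match.group(1))


def find_summary_heading(body: str):
    """Return the highest-precedence column-0 heading present, or None."""
    for pattern in _HEADINGS:
        match = re.search(pattern, body, re.MULTILINE)
        if match:
            return match.group(0).strip()
    return None
```

Because the heading patterns are anchored at column 0, an indented heading inside a fence matches nothing until the body has been unwrapped and dedented, which is the silent failure the entry describes.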

## [4.3.0] - 2026-05-29

### Added

- Schema 3.1.0 (additive over 3.0.0; both validate): `metadata.report_type: pr_audit`, optional finding `author_type` (`bot`/`human`), optional `finding_section.verdict` (`PASS`/`FAIL`/`NEEDS_REVIEW`).
- Shared `scripts/severity_util.py` — single source of truth for the OWASP severity band table, per-finding severity derivation, and the summary-statistics / category-matrix builder, imported by both the coordinator (`consolidate_reports.py`) and the renderer (`generate_review_report.py`).
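
`scripts/severity_util.py` itself is not part of the visible diff. The sketch below assumes it keeps the behaviour of the private helpers removed from `consolidate_reports.py` (mean of `risk`, `impact`, `scope`, then a band lookup). `count_severities` and `ensure_counts` are hypothetical names that illustrate the "recompute only when absent or all-zero" rule from this release's Fixed entry; skipping findings that lack the three dimensions is also an assumption.

```python
from __future__ import annotations

from collections import Counter
from typing import Any

SEV_LABELS = {5: "CRITICAL", 4: "HIGH", 3: "MEDIUM", 2: "LOW", 1: "INFO"}
_SEVERITY_BANDS = [(0.9, 5), (0.7, 4), (0.4, 3), (0.1, 2)]  # (lower bound, level)


def derive_overall(finding: dict[str, Any]) -> float | None:
    """Mean of risk, impact and scope; None unless all three are real numbers."""
    dims = []
    for key in ("risk", "impact", "scope"):
        value = finding.get(key)
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return None
        dims.append(float(value))
    return sum(dims) / 3.0


def derive_severity_int(overall: float) -> int:
    """Map an overall_severity float to the 1..5 band; below 0.1 is INFO."""
    for threshold, level in _SEVERITY_BANDS:
        if overall >= threshold:
            return level
    return 1


def count_severities(findings: list[dict[str, Any]]) -> dict[str, int]:
    """Hypothetical counts builder; findings without all three dimensions are skipped."""
    counts: Counter = Counter()
    for finding in findings:
        overall = derive_overall(finding)
        if overall is not None:
            counts[SEV_LABELS[derive_severity_int(overall)]] += 1
    return {label: counts.get(label, 0) for label in SEV_LABELS.values()}


def ensure_counts(supplied: dict[str, int] | None, findings: list[dict[str, Any]]) -> dict[str, int]:
    """Keep a non-zero supplied table as-is; recompute when it is absent or all zero."""
    if supplied and any(supplied.values()):
        return supplied
    return count_severities(findings)
```

For example, a finding with `risk` 0.8, `impact` 0.6 and `scope` 1.0 has a mean of about 0.8, which falls in the 0.7 to 0.9 band and therefore maps to HIGH.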

### Fixed

- check-pr-comments findings rendered INFO with zero severity counts — the renderer now derives per-finding severity from `risk`/`impact`/`scope` and recomputes summary statistics on-the-fly when absent or all-zero, so standalone producer reports show real severities and counts across Markdown / HTML / triage / PDF. A non-zero supplied `severity_counts` is never overwritten.
- Permalink commit now derives from `git rev-parse @{u}` (falling back to local `HEAD` only when the branch has no upstream) in `check-pr-comments`, `report-format`, and `grumpy-review`, so generated permalinks resolve on GitHub instead of 404-ing on an unpushed HEAD.
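
The permalink fix can be pictured with a small helper that asks git for the upstream commit first and only then falls back to the local `HEAD`. `git rev-parse @{u}` is the command named in the entry; the helper names and the injectable `run` parameter (added so the logic can be exercised without a repository) are assumptions, not code from the skills.

```python
from __future__ import annotations

import subprocess
from typing import Callable, Sequence


def _git(args: Sequence[str]) -> str | None:
    """Run a git subcommand; return trimmed stdout, or None on any failure."""
    try:
        proc = subprocess.run(
            ["git", *args], capture_output=True, text=True, check=False
        )
    except OSError:  # git is not installed or cannot be executed
        return None
    out = proc.stdout.strip()
    return out if proc.returncode == 0 and out else None


def permalink_commit(run: Callable[[Sequence[str]], str | None] = _git) -> str | None:
    """Prefer the upstream commit; use local HEAD only when no upstream resolves."""
    return run(["rev-parse", "@{u}"]) or run(["rev-parse", "HEAD"])
```

`git rev-parse @{u}` exits non-zero when the current branch has no upstream configured, so the fallback only triggers in that case, and a permalink built from the result points at a commit GitHub already has.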

## [4.2.0] - 2026-05-28

### Added

- **Reviewer call-tree inspection**: `grumpy-review`, `review-pr`, and `check-pr-comments` now perform a deep transitive in-repo caller walk for every function modified by the diff. Reviewer probes the environment for the deepest analysis tool available (ctags, GNU global, ripgrep, tree-sitter, any installed LSP) and falls back to grep-based caller extraction. Findings emit in new `call_tree` category with `CALL-` ID prefix. Walk is capped at depth 5, 200 callers, 60s per function; reviewer ranks modified functions by risk (public API > private; trait/interface impl > leaf; signature-changed > body-only) and walks the top 10 when a PR touches many.
- **Ephemeral-ID coding convention** in `skills/coding-best-practices/SKILL.md`: source code, comments, and committed docs MUST NOT reference transient review-finding IDs (e.g. `CMT-001`, `SEC-014`, `RUST-123`, `CALL-005`). Allow-list documented for permanent IDs (`ADR-NNN`, `RFC-NNN`, `CWE-NNN`, `CVE-YYYY-NNNN`, `OWASP-*`, GHSA, GitHub issue/PR refs, `TODO`/`FIXME`, committed test-spec IDs). Enforced two ways: Bilby (write-side, preloads BP) and `grumpy-review`/`review-pr` (review-side, via new `scripts/lint_ephemeral_ids.py`).
- **Schema**: `schemas/review-report.schema.json` extended with `call_tree` category and `CALL-` ID pattern (additive, no schema version bump).
- **Regression guards**: `tests/test_skill_frontmatter.py` parses every skill/agent frontmatter via PyYAML; `tests/test_bp_load_audit.py` asserts curated agent set loads `coding-best-practices`.
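
The walk limits quoted in the call-tree entry (depth 5, 200 callers, 60 s per function) amount to a bounded breadth-first search over the caller graph. The sketch below assumes a hypothetical `find_callers` callback standing in for whichever tool the reviewer finds (ctags, an LSP, grep); it is not the skill's actual procedure.

```python
from __future__ import annotations

import time
from collections import deque
from typing import Callable, Iterable

MAX_DEPTH = 5        # caps taken from the changelog entry
MAX_CALLERS = 200
MAX_SECONDS = 60.0


def walk_callers(
    root: str,
    find_callers: Callable[[str], Iterable[str]],
    *,
    max_depth: int = MAX_DEPTH,
    max_callers: int = MAX_CALLERS,
    max_seconds: float = MAX_SECONDS,
    clock: Callable[[], float] = time.monotonic,
) -> dict[str, int]:
    """Breadth-first transitive caller walk; returns {caller: depth from root}."""
    deadline = clock() + max_seconds
    seen = {root}
    found: dict[str, int] = {}
    queue = deque([(root, 0)])
    while queue:
        name, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not expand beyond the depth cap
        for caller in find_callers(name):
            if clock() > deadline or len(found) >= max_callers:
                return found  # time or size budget exhausted
            if caller in seen:
                continue  # already visited: guards against recursion cycles
            seen.add(caller)
            found[caller] = depth + 1
            queue.append((caller, depth + 1))
    return found
```

Breadth-first order means that when a budget runs out, the callers already collected are the ones closest to the modified function.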

### Changed

- `coding-best-practices` skill is now loaded by every reviewer/coder agent (`architect-nagatha`, `claudius`, `developer-bilby` (already), `project-reviewer-adams` (already), `qa-engineer-marvin`, `security-engineer-smythe`, `technical-writer-trillian`, `ux-designer-diziet`) via YAML `skills:` frontmatter. Orchestrator skills that spawn coding/reviewing agents now explicitly require spawned agents preload BP (`grumpy-review` §3, `check-pr-comments` §3).

### Fixed

- `skills/release/SKILL.md`: YAML frontmatter `description:` value quoted to fix PyYAML parse error on unquoted colon. `claude plugin validate .` now exits 0 (previously reported this one error).
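
The ephemeral-ID convention added in this release lends itself to a simple line scanner. `scripts/lint_ephemeral_ids.py` is not shown in the diff, so the sketch below is only a guess at its shape: it assumes the transient prefixes are the ones in the schema's finding `id` pattern, and it relies on the allow-listed permanent IDs (`ADR-`, `RFC-`, `CWE-`, `CVE-`) using prefixes outside that set rather than implementing the allow-list explicitly.

```python
import re

# Prefixes assumed to mirror the finding-id pattern in review-report.schema.json.
EPHEMERAL_ID = re.compile(
    r"\b(?:SEC|QA|PROJ|CODE|RUST|PY|GO|FE|DOC|CMT|DEP|PPM|CALL)-\d{3}\b"
)


def find_ephemeral_ids(text: str) -> list:
    """Return (line_number, id) pairs for every transient review ID in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in EPHEMERAL_ID.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits
```

A caller could run this over the added lines of a diff and raise one finding per hit.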

## [4.1.7] - 2026-05-27

### Changed
2 changes: 1 addition & 1 deletion agents/architect-nagatha.md
@@ -2,7 +2,7 @@
name: architect-nagatha
description: "Use for system design, module boundaries, dependency review, architectural trade-offs, technology evaluation, library comparison, or validating plans before presenting to user."
tools: ["Read", "Grep", "Glob", "Bash", "WebSearch", "WebFetch", "mcp__plugin_memcan_brain__search", "mcp__plugin_memcan_brain__search_memories", "mcp__plugin_memcan_brain__search_code", "mcp__plugin_memcan_brain__search_standards", "mcp__plugin_memcan_brain__add_memory", "mcp__plugin_claudius_github__get_file_contents", "mcp__plugin_claudius_github__search_repositories", "mcp__plugin_claudius_github__search_code", "mcp__plugin_claudius_github__pull_request_read", "mcp__plugin_claudius_github__list_pull_requests", "mcp__plugin_claudius_github__get_latest_release", "mcp__plugin_claudius_github__list_releases", "mcp__plugin_claudius_github__get_discussion", "mcp__plugin_claudius_github__get_discussion_comments"]
-skills: ["security-best-practices", "rust-best-practices"]
+skills: ["coding-best-practices", "security-best-practices", "rust-best-practices"]
model: opus
mcpServers: ["plugin_memcan_brain", "github"]
---
2 changes: 1 addition & 1 deletion agents/claudius.md
@@ -1,7 +1,7 @@
---
name: claudius
description: "Personal software development assistant. Leads and coordinates development efforts. Always invoked when user interaction is needed."
-skills: ["git-and-github", "severity", "grand-admiral"]
+skills: ["coding-best-practices", "git-and-github", "severity", "grand-admiral"]
memory: [user, project, local]
model: opus[1m]
mcpServers: ["plugin_memcan_brain", "github"]
2 changes: 1 addition & 1 deletion agents/qa-engineer-marvin.md
@@ -3,7 +3,7 @@ name: qa-engineer-marvin
description: "Use to validate that code matches requirements. Audits test coverage against specs, executes tests, and reports all mismatches."
tools: ["Read", "Write", "Edit", "Grep", "Glob", "Bash", "Task", "mcp__plugin_memcan_brain__search", "mcp__plugin_memcan_brain__search_memories", "mcp__plugin_memcan_brain__search_code", "mcp__plugin_memcan_brain__search_standards", "mcp__plugin_memcan_brain__add_memory", "mcp__plugin_claudius_github__pull_request_read", "mcp__plugin_claudius_github__list_pull_requests", "mcp__plugin_claudius_github__issue_read", "mcp__plugin_claudius_github__list_issues", "mcp__plugin_claudius_github__search_issues", "mcp__plugin_claudius_github__actions_list", "mcp__plugin_claudius_github__actions_get", "mcp__plugin_claudius_github__get_job_logs"]
model: opus
-skills: ["security-best-practices", "severity", "report-format"]
+skills: ["coding-best-practices", "security-best-practices", "severity", "report-format"]
mcpServers: ["plugin_memcan_brain", "github"]
---

2 changes: 1 addition & 1 deletion agents/security-engineer-smythe.md
@@ -2,7 +2,7 @@
name: security-engineer-smythe
description: "Use for security audits, auth/crypto/input validation reviews, dependency scanning, secret detection, or validating plans before presenting to user."
tools: ["Read", "Write", "Grep", "Glob", "Bash", "WebSearch", "WebFetch", "Task", "mcp__plugin_memcan_brain__search", "mcp__plugin_memcan_brain__search_memories", "mcp__plugin_memcan_brain__search_code", "mcp__plugin_memcan_brain__search_standards", "mcp__plugin_memcan_brain__add_memory", "mcp__plugin_claudius_github__list_code_scanning_alerts", "mcp__plugin_claudius_github__get_code_scanning_alert", "mcp__plugin_claudius_github__list_dependabot_alerts", "mcp__plugin_claudius_github__get_dependabot_alert", "mcp__plugin_claudius_github__list_secret_scanning_alerts", "mcp__plugin_claudius_github__get_secret_scanning_alert", "mcp__plugin_claudius_github__list_repository_security_advisories", "mcp__plugin_claudius_github__list_org_repository_security_advisories", "mcp__plugin_claudius_github__list_global_security_advisories", "mcp__plugin_claudius_github__get_global_security_advisory", "mcp__plugin_claudius_github__pull_request_read", "mcp__plugin_claudius_github__list_pull_requests", "mcp__plugin_claudius_github__search_code", "mcp__plugin_claudius_github__search_repositories", "mcp__plugin_claudius_github__get_file_contents", "mcp__plugin_claudius_github__get_commit", "mcp__plugin_claudius_github__list_commits"]
-skills: ["security-best-practices", "severity", "report-format"]
+skills: ["coding-best-practices", "security-best-practices", "severity", "report-format"]
model: opus
mcpServers: ["plugin_memcan_brain", "github"]
---
2 changes: 1 addition & 1 deletion agents/technical-writer-trillian.md
@@ -2,7 +2,7 @@
name: technical-writer-trillian
description: "Use for creating, maintaining, or reviewing documentation — READMEs, API docs, tutorials, guides, changelogs, ADRs."
tools: ["Read", "Write", "Edit", "Grep", "Glob", "Bash", "mcp__plugin_memcan_brain__search", "mcp__plugin_memcan_brain__search_memories", "mcp__plugin_memcan_brain__search_code", "mcp__plugin_memcan_brain__search_standards", "mcp__plugin_memcan_brain__add_memory", "mcp__plugin_claudius_github__pull_request_read", "mcp__plugin_claudius_github__list_pull_requests", "mcp__plugin_claudius_github__issue_read", "mcp__plugin_claudius_github__list_issues", "mcp__plugin_claudius_github__list_releases", "mcp__plugin_claudius_github__get_latest_release", "mcp__plugin_claudius_github__get_release_by_tag", "mcp__plugin_claudius_github__list_tags", "mcp__plugin_claudius_github__get_file_contents", "mcp__plugin_claudius_github__get_discussion", "mcp__plugin_claudius_github__get_discussion_comments"]
-skills: ["report-format"]
+skills: ["coding-best-practices", "report-format"]
model: sonnet
mcpServers: ["plugin_memcan_brain", "github"]
---
2 changes: 1 addition & 1 deletion agents/ux-designer-diziet.md
@@ -2,7 +2,7 @@
name: ux-designer-diziet
description: "Use at project start for requirements, domain analysis, stakeholder mapping, or during design for UI flows, interaction patterns, usability, accessibility, and validating plans before presenting to user."
tools: ["Read", "Write", "Edit", "Grep", "Glob", "WebSearch", "WebFetch", "mcp__plugin_memcan_brain__search", "mcp__plugin_memcan_brain__search_memories", "mcp__plugin_memcan_brain__search_code", "mcp__plugin_memcan_brain__search_standards", "mcp__plugin_memcan_brain__add_memory", "mcp__plugin_claudius_github__pull_request_read", "mcp__plugin_claudius_github__list_pull_requests", "mcp__plugin_claudius_github__issue_read", "mcp__plugin_claudius_github__list_issues", "mcp__plugin_claudius_github__search_issues", "mcp__plugin_claudius_github__list_issue_types", "mcp__plugin_claudius_github__get_label", "mcp__plugin_claudius_github__get_discussion", "mcp__plugin_claudius_github__get_discussion_comments", "mcp__plugin_claudius_github__list_discussions", "mcp__plugin_claudius_github__list_discussion_categories"]
-skills: []
+skills: ["coding-best-practices"]
model: opus
memory: user
mcpServers: ["plugin_memcan_brain", "github"]
25 changes: 18 additions & 7 deletions schemas/review-report.schema.json
@@ -9,7 +9,7 @@
"properties": {
"schema_version": {
"type": "string",
-"enum": ["3.0.0"],
+"enum": ["3.0.0", "3.1.0"],
"description": "Schema version (SemVer). Renderers check this for compatibility."
},
"metadata": {
@@ -43,7 +43,7 @@
},
"report_type": {
"type": "string",
-"enum": ["code_review", "comment_check"],
+"enum": ["code_review", "comment_check", "pr_audit"],
"description": "Type of report. Defaults to code_review if omitted."
},
"pr_number": {
@@ -95,6 +95,7 @@
"security": { "type": "integer", "minimum": 0 },
"project": { "type": "integer", "minimum": 0 },
"code_quality": { "type": "integer", "minimum": 0 },
+"call_tree": { "type": "integer", "minimum": 0 },
"dependencies": { "type": "integer", "minimum": 0 },
"documentation": { "type": "integer", "minimum": 0 },
"pr_comments": { "type": "integer", "minimum": 0 },
@@ -211,7 +212,7 @@
"type": "integer",
"minimum": 1,
"maximum": 5,
-"description": "Numeric severity: 5=CRITICAL, 4=HIGH, 3=MEDIUM, 2=LOW, 1=INFO. Coordinator-derived from overall_severity via the OWASP band table."
+"description": "Numeric severity: 5=CRITICAL, 4=HIGH, 3=MEDIUM, 2=LOW, 1=INFO. Coordinator-derived from overall_severity via the OWASP band table. Optional on producer/standalone reports; renderers derive it on-the-fly from risk/impact/scope when absent."
},
"severity_float": {
"type": "number",
@@ -226,14 +227,19 @@
"properties": {
"id": {
"type": "string",
-"pattern": "^(SEC|QA|PROJ|CODE|RUST|PY|GO|FE|DOC|CMT|DEP|PPM)-\\d{3}$",
-"description": "Finding ID with category prefix (SEC→security, PROJ→project, CODE/RUST/PY/GO/FE→code_quality, DOC→documentation, CMT→pr_comments, DEP→dependencies, PPM→pr_promises)"
+"pattern": "^(SEC|QA|PROJ|CODE|RUST|PY|GO|FE|DOC|CMT|DEP|PPM|CALL)-\\d{3}$",
+"description": "Finding ID with category prefix (SEC→security, PROJ→project, CODE/RUST/PY/GO/FE→code_quality, DOC→documentation, CMT→pr_comments, DEP→dependencies, PPM→pr_promises, CALL→call_tree)"
},
"severity": { "$ref": "#/$defs/severity" },
"risk": { "$ref": "#/$defs/severity_float", "description": "OWASP Likelihood normalized to [0,1]." },
"impact": { "$ref": "#/$defs/severity_float", "description": "OWASP Impact normalized to [0,1]." },
"scope": { "$ref": "#/$defs/severity_float", "description": "PR relevance: 1.0 in-diff, 0.5 indirectly affected, 0.0 pre-existing/unrelated." },
"overall_severity": { "$ref": "#/$defs/severity_float", "description": "Coordinator-computed mean of risk, impact, scope." },
+"author_type": {
+"type": "string",
+"enum": ["bot", "human"],
+"description": "Comment author classification (check-pr-comments)."
+},
"title": { "type": "string", "minLength": 1 },
"tags": {
"type": "array",
@@ -292,13 +298,18 @@
"subtitle": { "type": "string" },
"category": {
"type": "string",
-"enum": ["security", "project", "code_quality", "dependencies", "documentation", "pr_comments", "pr_promises"]
+"enum": ["security", "project", "code_quality", "call_tree", "dependencies", "documentation", "pr_comments", "pr_promises"]
},
"findings": {
"type": "array",
"items": { "$ref": "#/$defs/finding" }
},
-"positives": { "type": "string", "description": "Positive observations for this section" }
+"positives": { "type": "string", "description": "Positive observations for this section" },
+"verdict": {
+"type": "string",
+"enum": ["PASS", "FAIL", "NEEDS_REVIEW"],
+"description": "Section-level audit verdict (review-pr Pass C)."
+}
}
},
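
To make the extended finding `id` pattern concrete, the sketch below checks an ID against the regex from this schema diff and looks up its category from the prefix list in the `description`. The function name is hypothetical, and since the description gives no category for `QA`, the sketch returns `None` for that prefix rather than guessing one.

```python
from __future__ import annotations

import re

# Pattern copied from the updated schema (with the new CALL prefix).
FINDING_ID = re.compile(
    r"^(SEC|QA|PROJ|CODE|RUST|PY|GO|FE|DOC|CMT|DEP|PPM|CALL)-\d{3}$"
)

# Prefix-to-category mapping as listed in the schema description.
PREFIX_TO_CATEGORY = {
    "SEC": "security",
    "PROJ": "project",
    "CODE": "code_quality",
    "RUST": "code_quality",
    "PY": "code_quality",
    "GO": "code_quality",
    "FE": "code_quality",
    "DOC": "documentation",
    "CMT": "pr_comments",
    "DEP": "dependencies",
    "PPM": "pr_promises",
    "CALL": "call_tree",
}


def category_for(finding_id: str) -> str | None:
    """Return the category for a valid ID; raise ValueError for a malformed one."""
    match = FINDING_ID.fullmatch(finding_id)
    if match is None:
        raise ValueError(f"malformed finding id: {finding_id!r}")
    return PREFIX_TO_CATEGORY.get(match.group(1))
```

Using `fullmatch` rather than `match` also rejects an ID with a trailing newline, which a bare `$` anchor would otherwise let through in Python.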
"triage_decision": {
61 changes: 22 additions & 39 deletions scripts/consolidate_reports.py
@@ -37,6 +37,13 @@
 from typing import Any
 from urllib.parse import quote as _url_quote

+from severity_util import (
+    SEV_LABELS,
+    SEV_ORDER,
+    derive_overall,
+    derive_severity_int,
+)
+
 try:
     import jsonschema

@@ -56,19 +63,11 @@
     Path(__file__).resolve().parent.parent / "schemas" / "review-report.schema.json"
 )

-SEV_LABELS: dict[int, str] = {
-    5: "CRITICAL",
-    4: "HIGH",
-    3: "MEDIUM",
-    2: "LOW",
-    1: "INFO",
-}
-SEV_ORDER: list[str] = list(SEV_LABELS.values())  # CRITICAL, HIGH, ... INFO
-
 CATEGORY_PREFIX: dict[str, str] = {
     "security": "SEC-",
     "project": "PROJ-",
     "code_quality": "CODE-",
+    "call_tree": "CALL-",
     "documentation": "DOC-",
     "pr_comments": "CMT-",
     "pr_promises": "PPM-",
@@ -134,6 +133,9 @@ def _read_schema_version() -> str:

 SCHEMA_VERSION = _read_schema_version()

+# Both the current minor and its predecessor validate (additive 3.0 -> 3.1).
+ACCEPTED_SCHEMA_VERSIONS = {"3.0.0", "3.1.0"}
+

 # ---------------------------------------------------------------------------
 # Location parsing
@@ -242,34 +244,11 @@ def _build_permalink(


 # ---------------------------------------------------------------------------
-# OWASP severity derivation
+# OWASP severity derivation (shared with the renderer via severity_util)
 # ---------------------------------------------------------------------------
-def _derive_overall(finding: dict[str, Any]) -> float | None:
-    """Arithmetic mean of risk + impact + scope when all three are numeric floats."""
-    dims = []
-    for key in ("risk", "impact", "scope"):
-        value = finding.get(key)
-        if not isinstance(value, (int, float)) or isinstance(value, bool):
-            return None
-        dims.append(float(value))
-    return sum(dims) / 3.0
-
-
-# Band table mirrors the plan §Standard adopted.
-_SEVERITY_BANDS: list[tuple[float, int]] = [
-    (0.9, 5),
-    (0.7, 4),
-    (0.4, 3),
-    (0.1, 2),
-]
-
-
-def _derive_severity_int(overall: float) -> int:
-    """Map an overall_severity float to the 1..5 integer severity band."""
-    for threshold, level in _SEVERITY_BANDS:
-        if overall >= threshold:
-            return level
-    return 1
+# Re-exported under the legacy private names the test-suite imports.
+_derive_overall = derive_overall
+_derive_severity_int = derive_severity_int


 # ---------------------------------------------------------------------------
@@ -719,15 +698,19 @@ def cmd_prepare(args: argparse.Namespace) -> int:
     # hard cutover: v1/v2 must be rejected with a pointer at the schema.
     if isinstance(data, dict):
         declared = data.get("schema_version")
-        if isinstance(declared, str) and declared and declared != SCHEMA_VERSION:
+        if (
+            isinstance(declared, str)
+            and declared
+            and declared not in ACCEPTED_SCHEMA_VERSIONS
+        ):
             log.error(
-                "Input %s declares schema_version=%r; only %r is accepted. "
+                "Input %s declares schema_version=%r; only %s are accepted. "
                 "v1/v2 reports are no longer supported — re-run the "
                 "producer against the current commit to regenerate. See "
                 "schemas/review-report.schema.json v%s.",
                 path_str,
                 declared,
-                SCHEMA_VERSION,
+                sorted(ACCEPTED_SCHEMA_VERSIONS),
                 SCHEMA_VERSION,
             )
             return 2