Skip to content

fix bench restore output handling#529

Open
Poyraxx wants to merge 4 commits into
frappe:developfrom
Poyraxx:codex/fix-bench-restore-output
Open

fix bench restore output handling#529
Poyraxx wants to merge 4 commits into
frappe:developfrom
Poyraxx:codex/fix-bench-restore-output

Conversation

@Poyraxx
Copy link
Copy Markdown

@Poyraxx Poyraxx commented Jun 1, 2026

What changed

  • added a JSON output parser that can recover the actual payload from noisy bench stdout
  • used that parser where site-level bench commands return JSON-like output
  • enabled --verbose during restore so long-running backup restores show more progress
  • added regression tests for noisy installed-app output and restore verbosity

Why

Restores could fail when custom apps printed to stdout before the JSON payload, because the agent expected perfectly clean JSON. Heavy restore jobs also provided too little progress visibility.

Impact

  • restore and app cleanup flows are more resilient to extra stdout logs
  • users see more detailed restore output during long backup restores

Validation

  • python -m unittest agent.tests.test_site

@Poyraxx Poyraxx changed the title [codex] fix bench restore output handling fix bench restore output handling Jun 1, 2026
@greptile-apps
Copy link
Copy Markdown

greptile-apps Bot commented Jun 1, 2026

Greptile Summary

This PR introduces a parse_json_output helper that recovers a JSON payload from bench command output that may be prefixed or followed by log noise. It replaces four bare json.loads calls in site.py with calls to the new helper, each guarded by a type-specific validator, and adds --verbose to the restore command for better progress visibility.

  • parse_json_output in agent/utils.py scans for all [/{ positions with JSONDecoder.raw_decode, classifies matches as trailing (nothing after) or dirty (content follows), raises on ambiguity, and prefers a unique trailing match — all callers pass validators specific enough to reject accidental log-line matches.
  • agent/site.py migrates uninstall_unavailable_apps, get_analytics, describe_database_table, and apps_as_json to the new helper, with validators that require expected top-level keys/types.
  • Seven new tests in agent/tests/test_site.py cover noisy-log extraction, validator-driven disambiguation, ambiguity detection, the --verbose restore flag, and end-to-end flows for all four updated site methods.

Confidence Score: 5/5

Safe to merge; all changed code paths are exercised by the new tests and the logic is correct for every realistic bench output shape.

The new parse_json_output function behaves correctly for all current callers: validators are specific enough to reject log-line noise, ambiguity detection raises rather than guessing wrong, and the trailing-preference logic is consistent with how bench commands actually emit their output. Tests cover the key edge cases including a multi-JSON noisy output with validator filtering and the ambiguous-match raise path. The one design note — that omitting the validator while bench appends JSON-formatted status lines can return the wrong fragment — is captured inline and does not affect any current caller.

No files require special attention.

Important Files Changed

Filename Overview
agent/utils.py Adds parse_json_output utility that scans command output for JSON fragments, preferring a clean-trailing match and using a validator to disambiguate; logic is correct for all current call sites.
agent/site.py Replaces four bare json.loads calls with parse_json_output plus specific validators, and adds --verbose to the restore command; all validators are tight enough to reject accidental log-line matches.
agent/tests/test_site.py Adds seven tests covering noisy-log extraction, validator-driven disambiguation, ambiguity detection, verbose-restore flag, and end-to-end flows for uninstall_unavailable_apps, apps_as_json, and get_analytics.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[parse_json_output called] --> B{json.loads succeeds AND validator passes?}
    B -- Yes --> C[Return value directly]
    B -- No --> D[Scan output for JSON start chars]
    D --> E{raw_decode + validator passes?}
    E -- No --> F{More chars to scan?}
    F -- Yes --> D
    F -- No --> G{Any candidate found?}
    E -- Yes --> H{Nothing after match?}
    H -- Yes --> I{trailing_candidate already set?}
    I -- Yes --> J[Raise Ambiguous JSON]
    I -- No --> K[Set trailing_candidate, continue scan]
    K --> F
    H -- No --> L{dirty_candidate already set?}
    L -- Yes --> M[Raise Ambiguous JSON]
    L -- No --> N[Set dirty_candidate, continue scan]
    N --> F
    G -- trailing found --> O[Return trailing_candidate]
    G -- dirty only --> P[Return dirty_candidate]
    G -- none found --> Q[Raise original_error]
Loading

Reviews (4): Last reviewed commit: "narrow installed app payload validation" | Re-trigger Greptile

Comment thread agent/utils.py Outdated
@Poyraxx Poyraxx marked this pull request as ready for review June 1, 2026 18:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant