feat(ci): daily flake report agent #16417

Draft
nihal467 wants to merge 3 commits into develop from feat/weekly-flake-report

nihal467 (Member) commented Jun 2, 2026

Approach

An agentic workflow (.github/workflows/flake-report.md, built on gh-aw) runs daily and:

  1. Pulls the last 7 days of playwright.yaml workflow runs via paginated gh api.
  2. Downloads each run's playwright-results-shard-* artifacts and parses test-results.json.
  3. Normalises error signatures (UUIDs, timestamps, dynamic IDs stripped) and clusters by (test_id, signature).
  4. Keeps only clusters meeting the escalation threshold: ≥3 distinct PRs OR ≥5 occurrences. Single-PR failures are dropped as contributor-specific.
  5. Classifies each cluster (Flaky / Infrastructure / Test Data / Product Bug / Dependency / Unknown).
  6. Opens one tracking issue with per-cluster occurrence counts, daily sparkline (last 7 days), affected PRs, and sample trace links.
  7. Opens one Draft PR fixing all auto-fix-eligible clusters (Flaky / Infra / Test Data) using strict Playwright patterns: waitForResponse, waitForURL, web-first assertions, role-based selectors (getByRole / getByLabel / getByText), per-worker fixtures. Never adds sleeps, skips tests, weakens assertions, or adds data-testid / testid attributes to application code. Never touches src/ or healthcare/clinical logic.
  8. Verifies open reports from the previous 7 days by checking whether their clusters still appear; comments ✅ Resolved / 📉 Reduced / 🔁 Still present and closes issues whose clusters are all resolved.
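
Steps 3 and 4 above can be sketched as pure functions. This is a minimal TypeScript sketch, not the agent's actual implementation: the `Failure` and `Cluster` shapes, the function names, and the regular expressions are illustrative assumptions. It also assumes that the single-PR rule takes precedence over the ≥5-occurrences rule, which the description does not state explicitly.

```typescript
// Hypothetical shape of one parsed failure; field names are assumptions.
interface Failure {
  testId: string;
  error: string;
  pr: number; // PR number the failing run belonged to
}

interface Cluster {
  testId: string;
  signature: string;
  occurrences: number;
  prs: Set<number>;
}

// Replace volatile fragments so repeated failures share one signature.
function normaliseSignature(error: string): string {
  return error
    .replace(/[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/gi, "<uuid>")
    .replace(/\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})?/g, "<timestamp>")
    .replace(/\b\d{4,}\b/g, "<id>") // standalone numbers of 4+ digits
    .replace(/\s+/g, " ")
    .trim();
}

// Group failures by (test_id, normalised signature).
function clusterFailures(failures: Failure[]): Cluster[] {
  const clusters = new Map<string, Cluster>();
  for (const f of failures) {
    const signature = normaliseSignature(f.error);
    const key = JSON.stringify([f.testId, signature]);
    let c = clusters.get(key);
    if (!c) {
      c = { testId: f.testId, signature, occurrences: 0, prs: new Set<number>() };
      clusters.set(key, c);
    }
    c.occurrences += 1;
    c.prs.add(f.pr);
  }
  return Array.from(clusters.values());
}

// Escalation threshold: >=3 distinct PRs OR >=5 occurrences.
function meetsThreshold(c: Cluster): boolean {
  // Assumption: single-PR clusters are dropped even with >=5 occurrences.
  if (c.prs.size === 1) return false;
  return c.prs.size >= 3 || c.occurrences >= 5;
}
```

Under these assumptions, `clusterFailures(failures).filter(meetsThreshold)` yields the clusters that would move on to classification in step 5.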

Dedupe: before opening a new issue or PR, the agent checks for an existing open [flake-report] issue/Draft PR from the last 3 days covering the same clusters and comments on the existing artifact instead.
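
The dedupe check can be illustrated with a small lookup. This is a hedged sketch only: the `OpenReport` shape, the idea that each report records its cluster keys, and the reading of "covering the same clusters" as "contains every new cluster key" are assumptions, not details confirmed by the PR.

```typescript
// Hypothetical summary of an open issue or Draft PR; not a GitHub API type.
interface OpenReport {
  number: number;
  title: string;
  createdAt: string; // ISO 8601 timestamp
  clusterKeys: string[]; // assumed to be recoverable from the report body
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Return an existing open report from the last `maxAgeDays` days that
// already covers every new cluster, or undefined if none does.
function findExistingReport(
  reports: OpenReport[],
  newKeys: string[],
  now: Date,
  maxAgeDays = 3,
): OpenReport | undefined {
  const cutoff = now.getTime() - maxAgeDays * DAY_MS;
  return reports.find(
    (r) =>
      r.title.startsWith("[flake-report]") &&
      Date.parse(r.createdAt) >= cutoff &&
      newKeys.every((k) => r.clusterKeys.includes(k)),
  );
}
```

When a match is found, the agent would comment on that report instead of opening a new one; when the result is `undefined`, a new issue or Draft PR is opened.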

Working

| When | What happens |
| --- | --- |
| Every day (fuzzy schedule) | Workflow triggers, agent runs Phases 1–5 |
| Continuously | Maintainers review the Draft PR(s), merge or close |
| Next day | Verification updates previous reports; new clusters open new artifacts |
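
The next-day verification pass (step 8 in the list above) maps each previously reported cluster to one of three statuses. The sketch below is illustrative: the description does not define "Reduced", so it is assumed here to mean fewer occurrences in the current window than when the cluster was reported, and the function names are hypothetical.

```typescript
type Status = "Resolved" | "Reduced" | "Still present";

// Compare a cluster's occurrence count at report time with the current window.
function verifyCluster(previousCount: number, currentCount: number): Status {
  if (currentCount === 0) return "Resolved";
  // Assumption: "Reduced" means strictly fewer occurrences than before.
  if (currentCount < previousCount) return "Reduced";
  return "Still present";
}

// An issue is closed only when every cluster it tracks is resolved.
function shouldCloseIssue(statuses: Status[]): boolean {
  return statuses.length > 0 && statuses.every((s) => s === "Resolved");
}
```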

Cadence note: running daily during the shakedown period. The analysis window stays at 7 days (rolling) for stable signal. Cadence will move to weekly once the workflow is proven reliable.

If a day has zero qualifying clusters, no issue or PR is created — silent on quiet days. If >15 clusters appear (regression spike), the issue is tagged needs-triage and no PR is opened.
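
The quiet-day and regression-spike rules amount to a small decision function. The following TypeScript sketch uses hypothetical names (`Plan`, `planActions`) and simplifies one point: in the actual workflow a Draft PR is opened only for auto-fix-eligible clusters, which this sketch does not model.

```typescript
// Hypothetical description of what the agent should open on a given day.
interface Plan {
  openIssue: boolean;
  openPr: boolean;
  labels: string[];
}

function planActions(qualifyingClusters: number, spikeLimit = 15): Plan {
  // Quiet day: nothing is created.
  if (qualifyingClusters === 0) {
    return { openIssue: false, openPr: false, labels: [] };
  }
  // Regression spike: issue only, tagged for human triage.
  if (qualifyingClusters > spikeLimit) {
    return { openIssue: true, openPr: false, labels: ["needs-triage"] };
  }
  // Normal day: tracking issue plus Draft PR.
  return { openIssue: true, openPr: true, labels: [] };
}
```

Note that exactly 15 clusters still takes the normal path, since the rule applies only when the count exceeds 15.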

Changes

  • .github/workflows/flake-report.md — new agentic workflow
  • .github/workflows/flake-report.lock.yml — auto-generated by gh aw compile
  • .github/workflows/playwright.yaml — artifact retention-days: 7 → 30 so the 7-day analysis window is reliably full
  • .github/dependabot.yml — ignore github/gh-aw-actions/** (version-locked to gh-aw compiler)
  • .github/aw/actions-lock.json — pinned SHAs for new actions referenced by the lock file

Merge Checklist

  • Linting Complete
  • Any other necessary step

Adds a Monday agentic workflow that analyses the previous 7 days of
Playwright CI runs, clusters recurring failures, and opens a single
Draft PR fixing flakes that meet the escalation threshold.

Bumps Playwright artifact retention 7 -> 30 days so the 7-day window
is reliably full.
Copilot AI review requested due to automatic review settings June 2, 2026 12:12
coderabbitai Bot (Contributor) commented Jun 2, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: c93d38cc-960e-4e98-a3af-a1c8e91a99ff

github-actions Bot commented Jun 2, 2026

⚠️ Merge Checklist Incomplete

Thank you for your contribution! To help us review your PR efficiently, please complete the merge checklist in your PR description.

Your PR will be reviewed once you have marked the appropriate checklist items.

To update the checklist:

  • Change - [ ] to - [x] for completed items
  • Only check items that are relevant to your PR
  • Leave items unchecked if they don't apply

The checklist helps ensure code quality, testing coverage, and documentation are properly addressed.

Copilot AI (Contributor) left a comment

Pull request overview

Adds a weekly agentic GitHub workflow (built on gh-aw) that analyses the previous 7 days of Playwright CI runs, clusters recurring test failures, opens a tracking issue, and opens a single Draft PR with auto-fixes for flake/infra/test-data clusters. Bumps Playwright artifact retention to 30 days so the 7-day analysis window is reliably populated, and pins the new gh-aw–managed actions.

Changes:

  • New weekly-flake-report.md agentic workflow + auto-generated .lock.yml.
  • playwright.yaml: artifact retention-days 7 → 30.
  • dependabot.yml ignores github/gh-aw-actions/**; actions-lock.json adds pinned SHAs for the new actions.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Show a summary per file
| File | Description |
| --- | --- |
| .github/workflows/weekly-flake-report.md | New gh-aw prompt defining the 5-phase weekly flake-report agent. |
| .github/workflows/weekly-flake-report.lock.yml | Auto-generated compiled workflow (do not edit). |
| .github/workflows/playwright.yaml | Increases test-results artifact retention to 30 days. |
| .github/dependabot.yml | Reformatted; ignores gh-aw-managed actions from Dependabot. |
| .github/aw/actions-lock.json | Pins SHAs for actions/github-script@v9.0.0 and github/gh-aw-actions/setup@v0.77.5. |

opens a single Draft PR fixing all flakes that meet the escalation
threshold (≥3 distinct PRs OR ≥5 occurrences).
on:
  schedule: weekly on monday
Comment on lines +80 to +83
gh api --paginate \
"repos/ohcnetwork/care_fe/actions/workflows/playwright.yaml/runs?per_page=100&created=>=$SINCE&status=completed" \
--jq '.workflow_runs[] | {id, run_number, head_branch, event, conclusion, created_at, pull_requests: [.pull_requests[].number], head_sha}' \
> /tmp/gh-aw/agent/runs.ndjson
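
The `$SINCE` variable in the command above is not defined in this excerpt. Assuming it holds a UTC date seven days back (matching the 7-day analysis window), a TypeScript sketch of computing it and building the same request path might look like this. The helper names are hypothetical, and percent-encoding the `>=` operator is a defensive choice here, not something the PR itself does.

```typescript
const DAY_IN_MS = 24 * 60 * 60 * 1000;

// Return the UTC date `days` before `now`, formatted as YYYY-MM-DD.
function sinceDate(now: Date, days = 7): string {
  return new Date(now.getTime() - days * DAY_IN_MS).toISOString().slice(0, 10);
}

// Build the workflow-runs path used in the gh api call above,
// with the `created` filter value percent-encoded.
function buildRunsPath(since: string): string {
  const created = encodeURIComponent(">=" + since);
  return (
    "repos/ohcnetwork/care_fe/actions/workflows/playwright.yaml/runs" +
    "?per_page=100&created=" + created + "&status=completed"
  );
}
```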
nihal467 added 2 commits June 2, 2026 17:59
Project convention: never add test-id / data-testid attributes to
application source. Update the agent prompt to use role-based
selectors (getByRole/getByLabel/getByText) and to leave any
cluster that would require a test id for human triage.
…based IDs

- Cadence: weekly -> daily (analysis window stays at 7 days for stable signal)
- Rename file/branch/title prefixes from weekly-flake-report -> flake-report
- IDs: ISO week (YYYY-WW) -> date (YYYY-MM-DD)
- Add dedupe: agent checks for existing open flake-report issue/PR before opening another
- Quiet days no longer create empty issues
- Verification looks at last 7 days of open reports, not just last week
Copilot AI review requested due to automatic review settings June 2, 2026 12:32
cloudflare-workers-and-pages Bot commented Jun 2, 2026

Deploying care-preview with Cloudflare Pages

Latest commit: 5aaf3bf
Status: ✅  Deploy successful!
Preview URL: https://e2ca0230.care-preview-a7w.pages.dev
Branch Preview URL: https://feat-weekly-flake-report.care-preview-a7w.pages.dev

nihal467 changed the title from "feat(ci): weekly flake report agent" to "feat(ci): daily flake report agent" Jun 2, 2026
github-actions Bot commented Jun 2, 2026

🎭 Playwright Test Results

Status: ❌ Failed
Test Shards: 3

| Metric | Count |
| --- | --- |
| Total Tests | 317 |
| ✅ Passed | 316 |
| ❌ Failed | 1 |
| ⏭️ Skipped | 0 |

📊 Detailed results are available in the playwright-final-report artifact.

Run: #9323

Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Comment on lines +7 to +9
on:
  schedule: daily
  workflow_dispatch:
Comment on lines +84 to +85
gh api --paginate \
"repos/ohcnetwork/care_fe/actions/workflows/playwright.yaml/runs?per_page=100&created=>=$SINCE&status=completed" \
github-actions Bot added the stale label Jun 10, 2026