Skip to content

fix(accounting): dropped records fail the run; JSON read-layer fails fast (#234, #235)#243

Open
LukasWodka wants to merge 1 commit into
developfrom
fix/failure-accounting-and-consistency
Open

fix(accounting): dropped records fail the run; JSON read-layer fails fast (#234, #235)#243
LukasWodka wants to merge 1 commit into
developfrom
fix/failure-accounting-and-consistency

Conversation

@LukasWodka

Copy link
Copy Markdown
Collaborator

Summary

PR2 of the coercion-cluster fix. Closes #234 and resolves the consistency requirement of #235.

Both issues are facets of the same silent-data-loss class #99/#223/#230 have been closing: a record gets dropped, but the run still exits 0 and the K8s Job is marked Succeeded.

A surprising finding that reshaped the fix

The empirical end-to-end behaviour contradicts #235's one-line framing. Through the real ingest() path, both CSV and JSON already fail-fast at the DataValidator gate on a bad cell:

CSV  one-bad-cell: RAISED ValueError: Data Validator Validator failed
JSON one-bad-cell: RAISED ValueError: Data Validator Validator failed   ← not a silent skip

The "JSON silently skips" behaviour only occurs at the read_data level (bypassing the gate) or for records the gate misses. So the genuine, dominant bug is #234's silent skips, and #235's consistency requirement is met by making the read-layer fail-fast like the gate already does.

#234 — close the silent-skip channels

  • A record dropped by process_record (blank/invalid unique_id, or a processing error) is now appended to failed_records, so run_ingestion exits non-zero instead of returning [] and reporting clean success.
  • IngestionSummary.has_failures and the summary's total_failures now include skipped_records — fixing the self-contradiction where a run printed "0 failure(s)" while the success-rate text said "most records failed".
  • An invalid intent now fails fast (raises before any lock/DB work). It's a run-wide config error, not a per-row skip; previously it silently dropped every record and still exited 0.

#235 — make CSV and JSON consistent

  • JSONIngestor._iter_validated_records now raises on a type-invalid / non-object record instead of silently skipping it — consistent with the CSV cast and the DataValidator gate. A bad record aborts with a clear non-zero exit instead of vanishing with only a log line nobody durably sees.

Verification

  • tests/test_ingestor_base.py: split the old "skips on bad intent" test into test_ingest_fails_fast_on_invalid_intent + test_ingest_counts_dropped_record_as_failure, and added test_ingest_keeps_good_records_and_counts_dropped (the canonical mixed-run scenario: good rows ingest, dropped rows force non-zero exit, summary trips has_failures). Added a skipped_records case to the has_failures matrix.
  • tests/test_json_ingestor.py: the two "skips bad record" tests rewritten to assert the fail-fast raise.
  • Full suite: 965 passed, 1 xfailed (pre-existing semseg feat: wire segmentation mask sidecars through the declarative path #136), coverage 97.2% (gate 95%).

Deferred (intentional)

Per-row tolerance — keep the good rows, drop+count the bad ones, the "preferred direction" noted in #235 — is not in this PR. It requires the DataValidator gate to stop blocking on per-cell type errors (it currently aborts, and ~40 validator tests assert that). That's a change to the gate's role, best done as part of the validator/registry refactor rather than bundled into this correctness fix. This PR delivers the hard requirement (consistent behaviour + no partial run reported as success); the per-row UX improvement is a clean follow-up on top.

Relationship to PR #242

Independent of #242 (the coercion source-of-truth). The two touch different functions (base.py accounting + json_ingestor read-loop here; coercion module + casts there) and merge cleanly.

🤖 Generated with Claude Code

…fast (#234, #235)

Records dropped by the skip paths never reached failed_records, so
run_ingestion exited 0 and the K8s Job was marked Succeeded even when rows
were silently lost (#234). And JSON's read layer silently skipped a bad
record while CSV aborts, so the two formats behaved inconsistently (#235).

#234 — close the silent-skip channels:
- A record dropped by process_record (blank/invalid unique_id, or a
  processing error) is now appended to failed_records, so the run exits
  non-zero instead of reporting clean success.
- IngestionSummary.has_failures and the summary's total_failures now include
  skipped_records — fixing the contradiction where a run printed
  "0 failure(s)" while the success-rate text said "most records failed".
- An invalid intent now fails fast (raises before any lock/DB work). It's a
  run-wide config error, not a per-row skip; previously it silently dropped
  every record and still exited 0.

#235 — make CSV and JSON consistent:
- Through the real ingest() path BOTH formats already fail-fast at the
  DataValidator gate on a bad cell; only JSON's read-layer diverged by
  silently skipping records the gate missed. _iter_validated_records now
  raises (fail-fast) like the CSV cast and the gate, so a bad record aborts
  with a clear non-zero exit instead of vanishing.

Per-row tolerance (keep good rows, drop+count bad ones — the issue's
"preferred direction") is intentionally deferred: it requires the
DataValidator gate to stop blocking on per-cell errors, a change to the
gate's role best done with the validator/registry refactor rather than
bundled into this correctness fix.

Closes #234.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@LukasWodka

Copy link
Copy Markdown
Collaborator Author

👋 Heads-up — Code review queue is at 11 / 8

Above the WIP limit. The team convention is to review existing PRs before opening new work.

Open PRs currently in Code review (oldest first):

Pull from review before opening new work. (This is a nudge from the kanban WIP check, not a block.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant