test(e2e): drop stale semantic_segmentation xfail — mask sidecar is wired (#136)#277
Open
LukasWodka wants to merge 1 commit into
…ired (#136)

The declarative path stages the per-row mask sidecar end-to-end: cli/run -> map_file_transfer -> semantic_segmentation transfer factory -> mask_transfer. mask_id is preserved through process_record (#212) and popped before the DB insert; the mask lands in DEST_PATH alongside its image. The e2e case had been XPASSing, which the suite's own convention says signals the fix landed and the xfail can be removed.

Verified end-to-end against real MySQL: 3 images + 3 masks staged (byte-identical copies), 3 rows inserted, mask_id absent from the table. A masks-removed negative control fails the run, confirming mask staging is load-bearing. Full e2e suite green (24 passed).

#136 was already closed (fix shipped via #212 + the P3c transfer registry); this removes the leftover test scaffolding.

Also refreshes e2e/README.md, whose "known gaps (xfail)" table was stale for all three listed modalities (#135 / #137 were un-xfailed earlier).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
👋 Heads-up: the Code review queue is at 13 / 8, above the WIP limit. The team convention is to review existing PRs before opening new work. Open PRs currently in Code review (oldest first):
Pull from review before opening new work. (This is a nudge from the kanban WIP check, not a block.)
What
Removes the last `xfail` marker in the e2e ingestion suite, the `semantic_segmentation` case (#136), turning it into a normal passing case, and refreshes `e2e/README.md`'s now-stale "known gaps" section.
Why
The e2e harness marks each known gap `xfail(strict=False)` against its tracking ticket; the file's own convention is that when the fix lands the test XPASSes and the mark can be dropped. The semseg case has been XPASSing: the declarative mask-sidecar wiring shipped a while ago (`mask_id` preserved through `process_record`, #212, plus the P3c transfer registry), and #136 is already closed. This PR just removes the leftover scaffolding.
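The convention can be sketched with pytest's `xfail` marker. This is a minimal illustration rather than the suite's actual code: the case lists, the reason string, and the `run_ingestion` stub are hypothetical names chosen for the example.

```python
import pytest


def run_ingestion(modality: str) -> int:
    """Hypothetical stand-in for the real e2e runner; returns an exit code."""
    return 0


# While a gap is open, the case carries a non-strict xfail tied to its ticket.
# With strict=False an unexpected pass is reported as XPASS instead of failing
# the suite, which is the signal that the mark can be dropped.
CASES_BEFORE = [
    pytest.param(
        "semantic_segmentation",
        marks=pytest.mark.xfail(strict=False, reason="#136"),
    ),
]

# Once the fix has landed, the mark is removed and the case is a plain parameter.
CASES_AFTER = ["semantic_segmentation"]


@pytest.mark.parametrize("modality", CASES_AFTER)
def test_ingest(modality: str) -> None:
    assert run_ingestion(modality) == 0
```

Had the mark used `strict=True`, the XPASS would have failed the suite as soon as the fix landed; the non-strict form trades that enforcement for a quieter signal, which is why a stale mark like this one can linger.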
Verification (real MySQL, declarative `cli/run` path)
- `semantic_segmentation` now reports `PASSED` (not xfail/xpass).
- Staged end-to-end via the declarative path (`cli/run` → `map_file_transfer` → `semantic_segmentation` transfer factory → `mask_transfer`):
  - 3 images + 3 masks staged in `DEST_PATH`; the masks are byte-identical (md5) to source, i.e. `mask_transfer` genuinely copied them (it is the only writer of `DEST_PATH/*_mask.png`).
  - 3 rows inserted with `label`/`filename`/`extension`; `mask_id` correctly absent as a DB column (popped before insert in `_process_batch`).
- Negative control: with `masks/` emptied, the run fails (rc=1, 0 rows, no masks staged), proving mask staging is load-bearing, not coincidental.
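For reviewers who want to repeat the byte-identity check, a minimal sketch follows. It assumes the staged mask keeps the source file's name and that source masks match `*_mask.png`; the helper names `md5_of` and `mismatched_masks` are hypothetical and are not part of the repository.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path) -> str:
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mismatched_masks(src_dir: Path, dest_dir: Path) -> list[str]:
    """Names of source masks that are missing from dest_dir or differ in content.

    Assumption: the staged copy has the same file name as the source mask.
    """
    bad = []
    for src in sorted(src_dir.glob("*_mask.png")):
        dest = dest_dir / src.name
        if not dest.is_file() or md5_of(src) != md5_of(dest):
            bad.append(src.name)
    return bad
```

An empty list from `mismatched_masks(masks_dir, dest_path)` corresponds to the "byte-identical" result above. Note that an emptied `masks/` directory also yields an empty list here, so the negative control has to rely on the run's exit code and row count rather than on this comparison.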
Notes
- `e2e/README.md`'s "Known gaps (currently `xfail`)" table was stale for all three listed modalities: `object_detection` (fix: validator/UX gaps (reserved id, OD difficult=2, TS leading-NaN, image defaults) #135) and `masked_language_modeling` (fix: MLM template missing tokenizer.json (can't ingest itself) #137) were un-xfailed in earlier PRs without the README being updated. Rewrote it to reflect that the suite now has zero xfail cases, and kept the convention note for future gaps.
- #136 is already closed (fix shipped via #212 + the P3c transfer registry), so no auto-close keyword here; this is the test-cleanup follow-up.
🤖 Generated with Claude Code