Release v0.3.10 (ingestion hardening + path-traversal fix + single-label preflight) #259
Merged
Release bundles the fixes merged to develop since 0.3.9 (#230–#254):
- Ingestion accounting: dropped records fail the run; JSON read-layer fails fast (#230, #234, #235)
- Coercion: single source of truth for NA policy + int64 range (#236, #237)
- Security: block path traversal via manifest filename/mask_id (#239)
- UX papercuts: delimiter hint, NUL truncation, table-name message, Config numeric coercion (#238)
- Schema/DB: real MySQL column types, CHAR(N) mapping, drop instance_segmentation from enum, empty-CSV fast fail (#240, #241, #249, #250)
- DataValidator: accept single-dict / filter non-dict JSON (#232, #233)
- Single-label classification caught at preflight + friendly backend reason, with the full bugbot-hardened LabelDiversityValidator (#251, #252)
- CLI: schema descriptions surfaced in validation errors (#254)
- Reporting: ConsoleRenderer extraction; packaging split (#248, #246)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
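To illustrate the DataValidator item above (accept a single dict, filter non-dict JSON), here is a minimal sketch. The function name `normalise_records` and its return shape are hypothetical and are not taken from the tracebloc_ingestor code base; the real validator may count or report dropped items differently.

```python
import json
from typing import Any


def normalise_records(payload: Any) -> tuple[list[dict], int]:
    """Return (records, dropped_count) for a decoded JSON payload.

    Hypothetical helper: a single object becomes a one-record list, and
    non-object items inside a list are filtered out and counted.
    """
    if isinstance(payload, dict):
        return [payload], 0
    if not isinstance(payload, list):
        raise ValueError(
            f"expected a JSON object or array, got {type(payload).__name__}"
        )
    records = [item for item in payload if isinstance(item, dict)]
    return records, len(payload) - len(records)


# One valid record is kept; the bare number 42 is filtered and counted.
records, dropped = normalise_records(json.loads('[{"id": 1}, 42]'))
```

Returning the dropped count rather than discarding it silently lets the caller decide whether a non-zero count should fail the run, which is the behaviour the ingestion-accounting item describes.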
Collaborator
👋 Heads-up: Code review queue is at 16 / 8, above the WIP limit. The team convention is to review existing PRs before opening new work. Open PRs currently in Code review (oldest first):
Pull from review before opening new work. (This is a nudge from the kanban WIP check, not a block.)
aptracebloc
previously approved these changes
Jun 15, 2026
aptracebloc
approved these changes
Jun 15, 2026
This was referenced Jun 16, 2026
Summary
Release v0.3.10. Bumps `__version__` 0.3.9 → 0.3.10. Bundles the fixes merged to `develop` since v0.3.9.
Security
- `filename`/`mask_id` (security/bug: path traversal via unsanitised 'filename' column in file_transfer (read + write outside SRC_PATH/DEST_PATH) #239 → fix(security): block path traversal via manifest filename/mask_id (#239) #244)
Ingestion correctness & accounting
- `get_table_schema` (fix: report real MySQL column types from get_table_schema (dialect-type reflection) #241)
- `CHAR(N)` maps to a SQLAlchemy type (fix(database): CHAR(N) must map to a SQLAlchemy type in _get_sqlalchemy_type #249)
Single-label classification preflight (#251 → #252)
- `LabelDiversityValidator` catches unlearnable single-class datasets at preflight and surfaces the backend reason, instead of a misleading "Backend failed to prepare the dataset" after rows land in MySQL.
- `build_csv_na_values` NA sentinels, string-dtype pin, full (unstripped) schema, and whitespace-insensitive column resolution.
UX / CLI / packaging
- `ConsoleRenderer` extraction (refactor(P2): extract ingestion-summary renderer into reporting.ConsoleRenderer #248); runtime vs dev requirements split, numpy declared (refactor(packaging): split runtime vs dev requirements; declare numpy #246)
- Drop `instance_segmentation` from the schema enum + category congruence guard (fix: remove instance_segmentation zombie category, add dispatch-site congruence guard #240)
Test plan
- `pytest` green on `release/v0.3.10` (local: 1043 passed, 1 xfailed)
- `python setup.py sdist bdist_wheel` builds cleanly
- `pip install dist/tracebloc_ingestor-0.3.10-*.whl` then `python -c "import tracebloc_ingestor; print(tracebloc_ingestor.__version__)"` → `0.3.10`
🤖 Generated with Claude Code
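The single-label preflight described in the summary can be pictured with a small sketch. This is not the `LabelDiversityValidator` implementation: the function name `check_label_diversity`, the `min_classes` parameter, and the whitespace/empty-cell normalisation are all assumptions made for illustration.

```python
from collections import Counter
from typing import Hashable, Iterable


def check_label_diversity(
    labels: Iterable[Hashable], min_classes: int = 2
) -> Counter:
    """Raise ValueError when fewer than min_classes distinct labels exist.

    Hypothetical sketch: values are stringified and stripped, and empty
    cells are ignored, so "cat" and " cat " count as the same class.
    """
    cleaned = (str(value).strip() for value in labels if value is not None)
    counts = Counter(value for value in cleaned if value)
    if len(counts) < min_classes:
        raise ValueError(
            f"found {len(counts)} distinct label(s) {sorted(counts)}; "
            f"classification needs at least {min_classes}"
        )
    return counts
```

Running a check like this before any rows are written means a single-class dataset is rejected locally with a specific message, rather than failing later in the backend.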
Note
High Risk
Changes touch security-sensitive path joining, core ingest/validate/DB/API paths, and packaging/CI install graphs across many modalities—high blast radius despite strong test coverage.
Overview
v0.3.10 bumps the package version and bundles correctness, security, and UX fixes with a large expansion of tests and CI wiring.
Security & file handling: Manifest `filename`/`mask_id` values are joined via `_safe_join` so reads/writes cannot escape `SRC_PATH`/`DEST_PATH` (#239).
Ingestion behavior: Dropped or invalid records and empty CSVs fail the run instead of silent success; JSON read/validate paths align with CSV (fail-fast, single-dict JSON, filter non-objects). Shared NA coercion and int64 overflow checks keep validator and ingest layers in agreement. `get_table_schema` maps reflected MySQL dialect types correctly; `CHAR(N)` is supported in DDL. Mid-batch DB failures send only inserted rows to the API; `skipped_records` count toward `has_failures`.
Preflight: New `LabelDiversityValidator` rejects single-class classification datasets locally; `prepare_dataset` stashes `last_prepare_error` for clearer errors (#251). `instance_segmentation` is removed from the schema; `test_category_congruence` guards full dispatch wiring.
Templates & packaging: All template scripts delegate to exported `run_ingestion`; summary UI moves to `ConsoleRenderer`. Runtime deps stay in `requirements.txt` (adds explicit numpy); test/lint tools move to `requirements-dev.txt` with `setup.py` filtering comment lines. CI installs `requirements-dev.txt`.
Tests: New e2e characterization harness over bundled modalities; expanded unit/e2e coverage for the above; CLI schema errors prefer rule descriptions over raw JSON Schema mechanics (#254).
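For readers unfamiliar with the path-traversal fix, the general technique behind a helper like `_safe_join` can be sketched as follows. The actual `_safe_join` in this PR is not shown here, so the function below (named `safe_join`, with a hypothetical signature) is an assumption about the approach, not a copy of it.

```python
from pathlib import Path


def safe_join(base_dir: str, untrusted_name: str) -> Path:
    """Join untrusted_name onto base_dir, refusing results outside base_dir.

    Hypothetical sketch of the containment check; requires Python 3.9+
    for Path.is_relative_to.
    """
    base = Path(base_dir).resolve()
    # resolve() collapses ".." segments and follows symlinks. An absolute
    # untrusted_name replaces the base in the join, and the containment
    # check below rejects that case too.
    candidate = (base / untrusted_name).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"path escapes {base}: {untrusted_name!r}")
    return candidate
```

Under this sketch, a manifest value such as `images/a.png` resolves inside the base directory, while `../secrets.txt` or `/etc/passwd` raises `ValueError` before any read or write happens. Note that an empty name resolves to the base directory itself, so a caller that expects a file would need an additional check.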
Reviewed by Cursor Bugbot for commit aabfa20. Bugbot is set up for automated code reviews on this repo.