feat(dataset): add CreateML format support to DetectionDataset #2284
Open
madhavcodez wants to merge 4 commits into
Codecov Report
❌ Patch coverage is …
Additional details and impacted files
@@            Coverage Diff            @@
##           develop   #2284   +/-   ##
=======================================
  Coverage       79%     79%
=======================================
  Files           66      67     +1
  Lines         8569    8638    +69
=======================================
+ Hits          6806    6866    +60
- Misses        1763    1772     +9
Add DetectionDataset.from_createml and as_createml plus a new formats/createml.py module (load/save helpers), mirroring the existing COCO, YOLO, and Pascal VOC format support. Boxes use CreateML's pixel-space centre + width/height and are converted to/from xyxy; class names are inferred from the labels present in the file. Image paths are validated against the images directory, matching the COCO loader's path-traversal protection. Adds unit tests for the helpers, loader, exporter, integer/float round-trip, global class-id consistency, and the path-safety guards.
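The conversion described in this commit message can be illustrated with a minimal, standard-library-only sketch. It assumes the usual CreateML object-detection layout (one entry per image, each annotation carrying a `label` and a `coordinates` dict with centre `x`, `y`, `width`, `height` in pixels). The function names `centre_to_xyxy` and `infer_classes` are hypothetical and are not the helpers added by this PR.

```python
from typing import Any


def centre_to_xyxy(
    x: float, y: float, width: float, height: float
) -> tuple[float, float, float, float]:
    """Convert a pixel-space centre + size box to (x_min, y_min, x_max, y_max)."""
    half_w, half_h = width / 2, height / 2
    return (x - half_w, y - half_h, x + half_w, y + half_h)


def infer_classes(payload: list[dict[str, Any]]) -> list[str]:
    """Collect every label in the file and sort them so the ids are stable."""
    return sorted(
        {ann["label"] for entry in payload for ann in entry.get("annotations", [])}
    )


# A tiny CreateML-style payload (illustrative data only).
payload = [
    {
        "image": "a.jpg",
        "annotations": [
            {"label": "dog", "coordinates": {"x": 50, "y": 40, "width": 20, "height": 10}},
            {"label": "cat", "coordinates": {"x": 10, "y": 10, "width": 4, "height": 6}},
        ],
    }
]

classes = infer_classes(payload)
# Sorted, zero-based ids: the same label maps to the same id for every image.
class_to_id = {name: index for index, name in enumerate(classes)}

for entry in payload:
    for ann in entry["annotations"]:
        c = ann["coordinates"]
        box = centre_to_xyxy(c["x"], c["y"], c["width"], c["height"])
        print(entry["image"], class_to_id[ann["label"]], box)
```

Building the label set over the whole file before assigning ids is what keeps class ids consistent across images in this sketch.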
Cast the JSON payload read via read_json_file to list[CreateMLDict] and the data passed to save_json_file to dict[str, Any] (both helpers are annotated for dict only), and iterate xyxy/class_id arrays directly so the class_id None-guard narrows the loop variable for mypy.
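A rough sketch of the two typing patterns this commit message mentions, under stated assumptions: `read_json` below is a stand-in for a reader annotated as returning a dict only (it is not supervision's `read_json_file`), the fields of `CreateMLDict` are guessed for illustration, and `labels_for` is a hypothetical function showing how an early `None` guard lets a type checker narrow an optional value before it is iterated.

```python
import json
from typing import Any, Optional, TypedDict, cast


class CreateMLDict(TypedDict):
    # Assumed shape of one CreateML entry; the PR's definition may differ.
    image: str
    annotations: list[dict[str, Any]]


def read_json(text: str) -> dict[str, Any]:
    """Stand-in reader whose annotation promises a dict, even for list payloads."""
    return json.loads(text)


# cast() changes only the static type; nothing is checked or converted at runtime.
raw = read_json('[{"image": "a.jpg", "annotations": []}]')
entries = cast(list[CreateMLDict], raw)


def labels_for(class_id: Optional[list[int]], classes: list[str]) -> list[str]:
    """Map class ids to names; the guard narrows Optional[list[int]] to list[int]."""
    if class_id is None:
        return []
    # After the guard, iterating class_id directly type-checks without a cast.
    return [classes[index] for index in class_id]


print(entries[0]["image"], labels_for([1, 0], ["cat", "dog"]))
```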
…ateml # Conflicts: # docs/changelog.md
Description
Adds CreateML object-detection format support to `DetectionDataset`, with `from_createml()` for loading and `as_createml()` for exporting. `supervision` already supports COCO, YOLO, and Pascal VOC; this fills the remaining common format with a symmetric loader/exporter that mirrors those implementations.
Type of Change
Motivation and Context
CreateML is a widely used object-detection annotation format (Apple Create ML, and one of Roboflow's dataset export options). `DetectionDataset` can already round-trip COCO, YOLO, and Pascal VOC, but not CreateML, so users exporting in that format have to convert manually before loading into supervision. This adds first-class support following the existing `from_<format>` / `as_<format>` convention.
No existing tracking issue; opening as a feature addition, and happy to file one if the maintainers prefer.
Changes Made
- `src/supervision/dataset/formats/createml.py`: new module with `load_createml_annotations`, `save_createml_annotations`, and the helpers `createml_annotations_to_detections` / `detections_to_createml_annotations`. Boxes use CreateML's pixel-space centre + width/height and are converted to/from `xyxy`. Class names are inferred from the labels present in the file and assigned sorted, zero-based ids. Image paths are validated against the images directory (rejecting `..` traversal, absolute paths, the directory itself, and directory targets), matching the COCO loader's protection.
- `src/supervision/dataset/core.py`: `DetectionDataset.from_createml()` and `DetectionDataset.as_createml()`, plus the format import. Method docstrings render automatically in the API docs.
- `tests/dataset/formats/test_createml.py`: unit tests for the conversion helpers, loader, exporter, save→load round-trip (integer and float coordinates), global class-id consistency across images, and the path-safety guards.
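The path-safety guards listed above could look roughly like the following sketch. It uses only `pathlib` and is an assumption about the general approach, not the PR's or the COCO loader's actual code; `resolve_image_path` is a hypothetical name.

```python
import tempfile
from pathlib import Path


def resolve_image_path(images_dir: Path, image_name: str) -> Path:
    """Resolve image_name inside images_dir, rejecting unsafe targets."""
    base = images_dir.resolve()
    candidate = Path(image_name)
    if candidate.is_absolute():
        raise ValueError(f"absolute image path not allowed: {image_name!r}")
    resolved = (base / candidate).resolve()
    if resolved == base:
        raise ValueError(f"image path is the images directory itself: {image_name!r}")
    if base not in resolved.parents:
        # Catches '..' traversal, because the check runs on the resolved path.
        raise ValueError(f"image path escapes the images directory: {image_name!r}")
    if resolved.is_dir():
        raise ValueError(f"image path points at a directory: {image_name!r}")
    return resolved


# Small demonstration against a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    print(resolve_image_path(root, "a.jpg").name)
    for bad in ("../outside.jpg", ".", "sub"):
        try:
            resolve_image_path(root, bad)
        except ValueError as error:
            print("rejected:", error)
```

Comparing resolved paths instead of inspecting the raw string means inputs such as `sub/../../x.jpg` are rejected by the same check as a plain `..`.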
Testing
Local run: `pytest tests/dataset/` passes (including the new `test_createml.py`), and `ruff check` / `ruff format --check` are clean on the changed files.
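For illustration, the round-trip property covered by the new tests (integer and float coordinates surviving xyxy → CreateML → xyxy) can be sketched as a self-contained check. The helper names are hypothetical and this is not the test code from `test_createml.py`.

```python
import math


def xyxy_to_createml(x_min: float, y_min: float, x_max: float, y_max: float) -> dict[str, float]:
    """Convert corner coordinates to CreateML's pixel-space centre + size."""
    width, height = x_max - x_min, y_max - y_min
    return {"x": x_min + width / 2, "y": y_min + height / 2, "width": width, "height": height}


def createml_to_xyxy(coords: dict[str, float]) -> tuple[float, float, float, float]:
    """Convert centre + size back to (x_min, y_min, x_max, y_max)."""
    half_w, half_h = coords["width"] / 2, coords["height"] / 2
    return (
        coords["x"] - half_w,
        coords["y"] - half_h,
        coords["x"] + half_w,
        coords["y"] + half_h,
    )


def test_round_trip() -> None:
    # One integer box and one float box, mirroring the two cases named in the PR.
    for box in [(10, 20, 30, 60), (0.5, 1.25, 10.75, 8.5)]:
        restored = createml_to_xyxy(xyxy_to_createml(*box))
        assert all(math.isclose(a, b) for a, b in zip(box, restored))


test_round_trip()
```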