feat(utils): add prefetch to get_video_frames_generator #2273
Codecov Report
❌ Patch coverage is …
Additional details and impacted files

Coverage Diff

| | develop | #2273 | +/- |
|---|---|---|---|
| Coverage | 78% | 78% | |
| Files | 66 | 66 | |
| Lines | 8412 | 8451 | +39 |
| Hits | 6580 | 6614 | +34 |
| Misses | 1832 | 1837 | +5 |
Pull request overview
Adds an opt-in prefetch parameter to get_video_frames_generator to overlap frame decoding with CPU-bound consumers by decoding frames on a background thread and buffering them in a bounded Queue. This targets the FPS bottleneck raised in #1411 while keeping the default (prefetch=0) behavior unchanged.
Changes:
- Extend get_video_frames_generator(..., prefetch: int = 0) with a threaded prefetch path when prefetch > 0.
- Add internal _prefetched_frames_generator that drives the existing synchronous generator from a daemon thread and yields frames from a queue.
- Add a regression test ensuring prefetched output matches the synchronous frame sequence exactly.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/supervision/utils/video.py | Adds prefetch option and a new internal queued background-reader generator. |
| tests/utils/test_video.py | Adds a test asserting prefetched iteration matches the synchronous generator frame-for-frame. |

    def reader() -> None:
        try:
            for frame in get_video_frames_generator(
                source_path=source_path,
                stride=stride,
                start=start,
                end=end,
                iterative_seek=iterative_seek,
                prefetch=0,
            ):
                frame_queue.put(frame)
        finally:
            frame_queue.put(None)

Good catch. The reader now catches Exception, parks it as a sentinel on the queue, and the outer generator re-raises it before yielding the next frame. Added a regression test that points the generator at a missing file and asserts the error reaches the consumer.
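
For readers following the thread, the error-forwarding idea described in this reply can be sketched roughly as below. This is not the PR's code: `prefetched`, `make_source`, `_DONE`, and `failing_source` are hypothetical names, and a plain generator stands in for the video reader. Early-exit handling is left out to keep the sketch focused on the exception path.

```python
import queue
import threading
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")
_DONE = object()  # hypothetical end-of-stream marker, distinct from any item


def prefetched(make_source: Callable[[], Iterable[T]], prefetch: int) -> Iterator[T]:
    """Yield items from make_source(), produced on a background thread."""
    items: queue.Queue = queue.Queue(maxsize=prefetch)

    def reader() -> None:
        try:
            for item in make_source():
                items.put(item)
        except Exception as exc:
            # Forward the failure to the consumer instead of losing it
            # inside the background thread.
            items.put(exc)
        else:
            items.put(_DONE)

    threading.Thread(target=reader, daemon=True).start()
    while True:
        item = items.get()
        if item is _DONE:
            return
        if isinstance(item, Exception):
            raise item  # re-raise in the consumer's thread
        yield item


def failing_source() -> Iterator[int]:
    """Stand-in for a reader that fails part-way (assumption for the demo)."""
    yield 1
    yield 2
    raise FileNotFoundError("missing.mp4")
```

Iterating `prefetched(failing_source, prefetch=2)` yields the items produced before the failure and then raises the `FileNotFoundError` in the consuming thread, which is the behaviour the regression test in the reply is said to assert.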

    thread = threading.Thread(target=reader, daemon=True)
    thread.start()
    while True:
        frame = frame_queue.get()
        if frame is None:
            break
        yield frame

Fixed. Added a threading.Event that the outer generator sets in a finally block when the consumer stops or raises. The reader checks it both before each frame and inside the put() loop, and uses a 0.1s put timeout so a full queue does not leak the thread. Added a regression test that breaks out after three frames and then iterates the same file again to confirm nothing blocks.
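
A minimal sketch of the cancellation scheme described in this reply, under stated assumptions: the names `prefetched`, `offer`, `endless`, and the thread name `prefetch-reader` are hypothetical, a plain generator replaces the video reader, and error forwarding is omitted for brevity. Only the 0.1 s put timeout and the Event set in a `finally` block come from the reply itself.

```python
import queue
import threading
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")
_DONE = object()  # hypothetical end-of-stream marker


def prefetched(
    make_source: Callable[[], Iterable[T]],
    prefetch: int,
    put_timeout: float = 0.1,
) -> Iterator[T]:
    """Prefetch items on a thread that stops once the consumer goes away."""
    items: queue.Queue = queue.Queue(maxsize=prefetch)
    stop = threading.Event()

    def offer(item: object) -> bool:
        """Retry put() until it succeeds or the stop event is set."""
        while not stop.is_set():
            try:
                items.put(item, timeout=put_timeout)
                return True
            except queue.Full:
                continue  # queue still full: re-check the stop event
        return False

    def reader() -> None:
        for item in make_source():
            # Check before each item and again inside the put loop.
            if stop.is_set() or not offer(item):
                return
        offer(_DONE)

    thread = threading.Thread(target=reader, daemon=True, name="prefetch-reader")
    thread.start()
    try:
        while True:
            item = items.get()
            if item is _DONE:
                return
            yield item
    finally:
        # Runs on exhaustion, on close() after an early break, and when the
        # consumer raises while the generator is suspended at the yield.
        stop.set()


def endless() -> Iterator[int]:
    """Stand-in source that never finishes (assumption for the demo)."""
    n = 0
    while True:
        yield n
        n += 1
```

With this shape, taking three items from `prefetched(endless, prefetch=2)` and then closing the generator lets the reader thread exit within roughly one put timeout, rather than blocking forever on a full queue.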
Adds an opt-in prefetch: int = 0 parameter. When > 0, frames are decoded in a background thread and buffered in a bounded queue, letting a CPU-bound consumer overlap with decode I/O. Default 0 keeps the original synchronous behaviour unchanged. The threaded path drives the existing sync generator on a daemon thread and pumps frames through a Queue(maxsize=prefetch). No new dependencies. Closes roboflow#1411.
Force-pushed from bd3f788 to ba1f44f.
Friendly ping. @Borda, would you have a moment to take a look? Happy to address any feedback.
Closes #1411.
@LinasKo asked for a worked threading example that produces a real FPS improvement for the get_video_frames_generator path. This PR adds an opt-in prefetch: int = 0 argument: when > 0, frames are decoded in a background thread and buffered in a bounded queue, so a CPU-bound consumer can overlap with decode I/O.

Default stays 0 (synchronous, behaviour unchanged). The threaded path is a thin wrapper that drives the existing sync generator on one daemon=True thread and pushes frames through a Queue(maxsize=prefetch). No new dependencies, ~30 added lines in video.py. Pattern matches the reader-thread already in process_video further down the same file.

Benchmark
150-frame 1080p h.264 video, fixed CPU consumer simulated with time.sleep:

Decode alone on this video is ~10 ms/frame, so the speed-up is largest when the per-frame consumer cost is comparable to the decode cost. Heavier consumers asymptote to no benefit, which is the right behaviour.
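
The shape of that claim can be illustrated with an idealized model: without prefetch each frame costs decode plus consume, while with perfect overlap it costs the larger of the two. The sketch below assumes decoding truly runs in parallel with the consumer and ignores queue and thread overhead. `ideal_speedup` is a hypothetical helper, the consumer costs are illustrative values, and the output is a model prediction, not the PR's measured benchmark.

```python
def ideal_speedup(decode_ms: float, consume_ms: float) -> float:
    """Speed-up of fully overlapped decode/consume over running them in sequence."""
    sequential = decode_ms + consume_ms      # no prefetch: costs add up
    overlapped = max(decode_ms, consume_ms)  # ideal prefetch: slower stage dominates
    return sequential / overlapped


DECODE_MS = 10.0  # per-frame decode cost quoted in the PR description

# Illustrative consumer costs in ms per frame (assumed, not from the PR).
for consume_ms in (1.0, 10.0, 50.0, 200.0):
    speedup = ideal_speedup(DECODE_MS, consume_ms)
    print(f"consumer {consume_ms:6.1f} ms -> ideal speed-up {speedup:.2f}x")
```

In this model the speed-up peaks at 2x when the consumer cost equals the decode cost and falls towards 1x as the consumer gets heavier, which matches the qualitative behaviour described above.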
Test
test_get_video_frames_generator_prefetch_matches_sync runs the generator twice on the same dummy video with prefetch=0 and prefetch=4 and asserts the two outputs are frame-for-frame identical. Full pytest src/ tests/ is green (1859 passed). Pre-commit clean.