Add Google Gemini 3.5 VLM support by SkalskiP · Pull Request #2265 · roboflow/supervision

SkalskiP · 2026-05-21T09:24:35Z

Summary

Adds GOOGLE_GEMINI_3_5 to the VLM (and deprecated LMM) enum, reusing
the existing Gemini 2.5 response parser since the output format is identical.
Registers the new model in all lookup dicts (RESULT_TYPES,
REQUIRED_ARGUMENTS, ALLOWED_ARGUMENTS) and the from_vlm / from_lmm
dispatch logic.
Adds parametrized tests verifying VLM.GOOGLE_GEMINI_3_5 produces the same
detections as VLM.GOOGLE_GEMINI_2_5 for identical inputs.
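
The registration pattern described in the summary can be sketched roughly as follows. This is a minimal standalone illustration, not the library's actual code: the names `RESULT_TYPES`, `REQUIRED_ARGUMENTS`, `ALLOWED_ARGUMENTS`, and `from_vlm` come from the summary, while the table contents, the enum values, `validate_vlm_parameters`, `parse_gemini_2_5`, and `PARSERS` are assumptions made up for the example.

```python
from enum import Enum


class VLM(Enum):
    # Hypothetical stand-in for the library enum; the string values are assumed.
    GOOGLE_GEMINI_2_5 = "gemini_2_5"
    GOOGLE_GEMINI_3_5 = "gemini_3_5"


# Lookup tables keyed by enum member. The contents are illustrative only.
RESULT_TYPES = {VLM.GOOGLE_GEMINI_2_5: str, VLM.GOOGLE_GEMINI_3_5: str}
REQUIRED_ARGUMENTS = {
    VLM.GOOGLE_GEMINI_2_5: ["resolution_wh"],
    VLM.GOOGLE_GEMINI_3_5: ["resolution_wh"],
}
ALLOWED_ARGUMENTS = {
    VLM.GOOGLE_GEMINI_2_5: ["resolution_wh", "classes"],
    VLM.GOOGLE_GEMINI_3_5: ["resolution_wh", "classes"],
}


def validate_vlm_parameters(vlm: VLM, result: object, kwargs: dict) -> None:
    """Hypothetical validator: checks result type and keyword arguments."""
    if not isinstance(result, RESULT_TYPES[vlm]):
        raise ValueError(f"Unexpected result type for {vlm.name}")
    missing = [name for name in REQUIRED_ARGUMENTS[vlm] if name not in kwargs]
    unexpected = [name for name in kwargs if name not in ALLOWED_ARGUMENTS[vlm]]
    if missing or unexpected:
        raise ValueError(f"missing={missing}, unexpected={unexpected}")


def parse_gemini_2_5(result: str, **kwargs):
    """Placeholder parser: returns its inputs so the routing is observable."""
    return ("gemini_2_5_parser", result, kwargs)


# Both enum members point at the same parser, which is the reuse the PR describes.
PARSERS = {
    VLM.GOOGLE_GEMINI_2_5: parse_gemini_2_5,
    VLM.GOOGLE_GEMINI_3_5: parse_gemini_2_5,
}


def from_vlm(vlm: VLM, result: str, **kwargs):
    """Validate the inputs, then dispatch to the parser registered for `vlm`."""
    validate_vlm_parameters(vlm, result, kwargs)
    return PARSERS[vlm](result, **kwargs)
```

With a layout like this, adding a model whose output format matches an existing one only needs a new enum member and one entry per table. The trade-off is that every table has to be updated together, since a missing entry would only surface as a `KeyError` at call time.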


codecov · 2026-05-21T09:27:24Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 78%. Comparing base (cb25906) to head (8ad8602).

Additional details and impacted files

@@           Coverage Diff           @@
##           develop   #2265   +/-   ##
=======================================
  Coverage       78%     78%           
=======================================
  Files           66      66           
  Lines         8406    8408    +2     
=======================================
+ Hits          6524    6534   +10     
+ Misses        1882    1874    -8



Copilot

Pull request overview

Adds support for Google Gemini 3.5 as an additional Vision-Language Model option in Supervision’s Detections.from_vlm / deprecated from_lmm pathways by treating it as format-compatible with the existing Gemini 2.5 parser.

Changes:

Extend VLM (and deprecated LMM) enums and VLM validation lookup tables to include GOOGLE_GEMINI_3_5.
Update Detections.from_vlm / from_lmm dispatch to route Gemini 3.5 through the existing Gemini 2.5 parsing logic.
Add a parametrized regression test asserting Gemini 3.5 matches Gemini 2.5 outputs for identical inputs.
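
A parity check of the kind described in the last item might look like the sketch below. It does not use the library: `parse_gemini_boxes` is a simplified, hypothetical parser that assumes the model returns a JSON list of objects with a `box_2d` field (`[y_min, x_min, y_max, x_max]`, normalized to 0-1000) and a `label` field, and `PARSERS`, `CASES`, and `check_parity` are names invented for the example.

```python
import json


def parse_gemini_boxes(result: str, resolution_wh: tuple) -> list:
    """Simplified stand-in parser (assumed format, see lead-in).

    Returns a list of (label, (x_min, y_min, x_max, y_max)) in pixels.
    """
    width, height = resolution_wh
    detections = []
    for item in json.loads(result):
        y_min, x_min, y_max, x_max = item["box_2d"]
        # Scale normalized 0-1000 coordinates to pixel coordinates.
        xyxy = (
            x_min * width / 1000,
            y_min * height / 1000,
            x_max * width / 1000,
            y_max * height / 1000,
        )
        detections.append((item["label"], xyxy))
    return detections


# Both model names route to the same parser, mirroring the reuse in the PR.
PARSERS = {
    "GOOGLE_GEMINI_2_5": parse_gemini_boxes,
    "GOOGLE_GEMINI_3_5": parse_gemini_boxes,
}

# Hypothetical inputs: an empty response and a single-box response.
CASES = [
    ("[]", (640, 480)),
    ('[{"box_2d": [100, 200, 500, 600], "label": "cat"}]', (1000, 1000)),
]


def check_parity() -> None:
    """Assert both model names yield identical detections for each case."""
    for result, resolution_wh in CASES:
        old = PARSERS["GOOGLE_GEMINI_2_5"](result, resolution_wh)
        new = PARSERS["GOOGLE_GEMINI_3_5"](result, resolution_wh)
        assert new == old


check_parity()
```

Because both names share one parser, a check like this mostly guards the dispatch wiring rather than the parsing itself. It would start to fail only if the two paths diverged later.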

Review notes (scores):

Code quality: 4/5
Testing: 3/5
Documentation: 3/5

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `src/supervision/detection/vlm.py` | Adds `GOOGLE_GEMINI_3_5` to enums and registers it in VLM parameter/result validation tables. |
| `src/supervision/detection/core.py` | Extends `from_lmm` mapping and `from_vlm` dispatch so Gemini 3.5 uses the Gemini 2.5 parser path. |
| `tests/detection/test_vlm.py` | Adds parity test intended to ensure Gemini 3.5 produces the same detections as Gemini 2.5. |

    Attributes:
        PALIGEMMA: Google's PaliGemma vision-language model.
        FLORENCE_2: Microsoft's Florence-2 vision-language model.
        QWEN_2_5_VL: Qwen2.5-VL open vision-language model from Alibaba.
        QWEN_3_VL: Qwen3-VL open vision-language model from Alibaba.
        GOOGLE_GEMINI_2_0: Google Gemini 2.0 vision-language model.
        GOOGLE_GEMINI_2_5: Google Gemini 2.5 vision-language model.
+        GOOGLE_GEMINI_3_5: Google Gemini 3.5 vision-language model.
        MOONDREAM: The Moondream vision-language model.


    Attributes:
        PALIGEMMA: Google's PaliGemma vision-language model.
        FLORENCE_2: Microsoft's Florence-2 vision-language model.
        QWEN_2_5_VL: Qwen2.5-VL open vision-language model from Alibaba.
        QWEN_3_VL: Qwen3-VL open vision-language model from Alibaba.
        GOOGLE_GEMINI_2_0: Google Gemini 2.0 vision-language model.
        GOOGLE_GEMINI_2_5: Google Gemini 2.5 vision-language model.
+        GOOGLE_GEMINI_3_5: Google Gemini 3.5 vision-language model.
        MOONDREAM: The Moondream vision-language model.
    """


+        resolution_wh=resolution_wh,
+        classes=classes,
+    )
+    detections_3_5 = Detections.from_vlm(
+        vlm=VLM.GOOGLE_GEMINI_3_5,
+        result=result,
+        resolution_wh=resolution_wh,
+        classes=classes,


Add Google Gemini 3.5 VLM support, reusing the existing Gemini 2.5 response parser.

097ef71

Fix ruff PT006 lint by using tuple for pytest.mark.parametrize argument.

e99ca54

SkalskiP force-pushed the add-gemini-3.5-vlm-support branch from f450b17 to e99ca54 on May 21, 2026 09:39

fix(pre_commit): 🎨 auto format pre-commit hooks

4bd6cf7

Borda requested a review from Copilot May 22, 2026 19:26

Borda reviewed May 22, 2026


Comment thread tests/detection/test_vlm.py

Apply suggestions from code review

338b0ec

Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>

Copilot started reviewing on behalf of Borda May 22, 2026 19:27

Borda and others added 2 commits May 22, 2026 21:29

Update test cases in test_vlm.py for JSON inputs

1dd1def

fix(pre_commit): 🎨 auto format pre-commit hooks

4a56570

Borda reviewed May 22, 2026


Comment thread tests/detection/test_vlm.py Outdated

Merge branch 'develop' into add-gemini-3.5-vlm-support

8ad8602

Borda approved these changes May 22, 2026


Borda added the enhancement label ("New feature or request") May 22, 2026

Copilot AI reviewed May 22, 2026


SkalskiP wants to merge 7 commits into develop from add-gemini-3.5-vlm-support

3 participants
