fix(plugins-google): thought_signature dropped for version-less Gemini model aliases by ngoanpv · Pull Request #6011 · livekit/agents

ngoanpv · 2026-06-08T17:05:52Z

Problem

Multi-turn function calling fails with 400 INVALID_ARGUMENT: "Function call is missing a thought_signature in functionCall parts" when the LLM model is a version-less Gemini alias such as gemini-flash-latest or gemini-flash-lite-latest (these resolve to Gemini 3 server-side). It works on the first turn and breaks on the first follow-up that echoes a function call back.

Root cause

_requires_thought_signatures(model) gates BOTH storing (_parse_part) and resending (_run) of thought_signature, but only matches literal gemini-2.5* / gemini-3* strings. Version-less aliases return False, so the signature is never stored or resent — and Gemini 3 requires it echoed back.
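A minimal sketch of the failure mode, assuming the old gate was a plain prefix check. `old_requires_thought_signatures` is a hypothetical stand-in, not the helper from llm.py, which may differ in detail:

```python
def old_requires_thought_signatures(model: str) -> bool:
    """Hypothetical stand-in for the pre-fix model-name heuristic."""
    # Assumption: only literal gemini-2.5* / gemini-3* names are recognised.
    return model.startswith(("gemini-2.5", "gemini-3"))


for name in (
    "gemini-2.5-flash",
    "gemini-3.5-flash",
    "gemini-flash-latest",
    "gemini-flash-lite-latest",
):
    # The two versioned names print True; the two -latest aliases print False.
    print(f"{name}: {old_requires_thought_signatures(name)}")
```

Under that assumption the aliases fall through the gate, so nothing is stored on turn 1 and nothing can be resent on turn 2.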

Fix

Make signature handling response-driven instead of model-name-driven:

store a thought_signature whenever the API returns one (_parse_part);
resend whatever was stored, whenever present (_run).

This is correct for every signature-emitting model and alias, and a no-op for models that never emit signatures (nothing stored → nothing resent). Runtime signature handling no longer depends on the _requires_thought_signatures model-name heuristic. The helper itself is retained, with its alias handling corrected so that gemini-flash-latest / gemini-flash-lite-latest also return True, because the existing unit tests in tests/test_google_thought_signatures.py import and assert on it.
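The response-driven idea can be sketched roughly as follows. This is an illustration only: response parts are modelled as plain dicts, and `thought_signatures`, `store_signature`, and `attach_signature` are hypothetical names, not the plugin's `_parse_part` / `_run` code:

```python
from typing import Any, Optional

# Hypothetical in-memory store: call_id -> opaque signature bytes.
thought_signatures: dict[str, bytes] = {}


def store_signature(call_id: str, part: dict[str, Any]) -> None:
    """Store a signature whenever the response part carries one."""
    signature: Optional[bytes] = part.get("thought_signature")
    if signature:  # no model-name check: the response decides
        thought_signatures[call_id] = signature


def attach_signature(call_id: str, part: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the outgoing part with the stored signature, if any."""
    signature = thought_signatures.get(call_id)
    if signature is None:
        return dict(part)  # nothing stored, so nothing is resent
    return {**part, "thought_signature": signature}


# Turn 1: a signature-emitting model returns a function call with a signature.
store_signature(
    "call_1",
    {"function_call": {"name": "save_answer"}, "thought_signature": b"sig"},
)
# A model that never emits signatures leaves the store untouched.
store_signature("call_2", {"function_call": {"name": "save_answer"}})

# Turn 2: echo both calls back.
echoed_1 = attach_signature("call_1", {"function_call": {"name": "save_answer"}})
echoed_2 = attach_signature("call_2", {"function_call": {"name": "save_answer"}})
```

Only `echoed_1` carries a `thought_signature` key, which matches the "no-op for models that never emit them" behaviour described above.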

Model coverage note

The current latest flash models are gemini-3.5-flash and gemini-3.1-flash-lite. Both already match the gemini-3 detection, so they were storing and resending signatures before this change, and the response-driven fix keeps them correct. The dropped-signature 400 was specific to the version-less -latest aliases: at runtime they no longer depend on the name heuristic, and the retained helper now recognizes them as well so the unit tests stay accurate.

Test

Verified live against the real Gemini API, driving the plugin's llm.LLM through a 2-turn function-calling exchange with a single save_answer(answer: str) tool (turn 1 forces the tool call; turn 2 echoes the FunctionCall + FunctionCallOutput back, reusing the harvested call_id).

`gemini-flash-lite-latest` (the version-less alias bug)

Before (unpatched): turn 1 stores nothing (LLM._thought_signatures is empty), turn 2 raises:

APIStatusError status_code=400 INVALID_ARGUMENT
"Function call is missing a thought_signature in functionCall parts. This is required
for tools to work correctly... function call `default_api:save_answer`, position 2."

After (patched): turn 1 stores the signature (LLM._thought_signatures keys = ['<call_id>']), turn 2 succeeds with a normal assistant response — no 400. Same script, same model, only the plugin changed.

`gemini-3.5-flash` (latest flash)

Also verified live for gemini-3.5-flash. Because this model matches the gemini-3 detection in both the original and patched code, the runtime gate is True either way, so it never hit the dropped-signature 400. With the patched plugin, multi-turn function calling succeeds: turn 1 stores the signature (LLM._thought_signatures keys = ['<call_id>']), turn 2 returns a normal assistant response — no 400. The response-driven fix keeps gemini-3.5-flash (and gemini-3.1-flash-lite) correct.

Unit tests

The existing unit cases in tests/test_google_thought_signatures.py still pass unchanged. Added _requires_thought_signatures → True cases for the current latest models gemini-3.5-flash and gemini-3.1-flash-lite, plus the version-less aliases gemini-flash-latest and gemini-flash-lite-latest (38 passed).

Follow-up fix — model-detection gaps in `thinking_config` and flash defaults

The same fragile string matching affected two sibling helpers in llm.py, surfaced during review:

_is_gemini_3_model / _is_gemini_3_flash_model only matched literal gemini-3* / gemini-3-flash*. They returned False for the version-less aliases gemini-flash-latest / gemini-flash-lite-latest (which resolve to Gemini 3.x flash server-side) and mis-handled gemini-3.5-flash.
Consequence 1: the thinking_config block called _is_gemini_3_model(model); for the aliases it took the "Gemini 2.5 and earlier" branch and raised ValueError("does not support thinking_level"), even though those aliases support thinking_level.
Consequence 2: _is_gemini_3_flash_model("gemini-3.5-flash") returned False, so 3.5-flash missed the "minimal" flash thinking default and fell back to "low".
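Both consequences can be reproduced with a small sketch, assuming the old helpers were literal prefix matches. The `old_*` helpers, `check_thinking_level`, and `default_thinking_level` are hypothetical reductions of the logic described above, not code from llm.py:

```python
def old_is_gemini_3_model(model: str) -> bool:
    # Assumption: literal gemini-3* prefix match only.
    return model.startswith("gemini-3")


def old_is_gemini_3_flash_model(model: str) -> bool:
    # Assumption: literal gemini-3-flash* prefix match only.
    return model.startswith("gemini-3-flash")


def check_thinking_level(model: str, thinking_level: str) -> str:
    """Hypothetical reduction of the thinking_config branch."""
    if old_is_gemini_3_model(model):
        return thinking_level  # Gemini 3 branch accepts thinking_level
    # "Gemini 2.5 and earlier" branch
    raise ValueError(f"{model} does not support thinking_level")


def default_thinking_level(model: str) -> str:
    """Hypothetical reduction of the flash default selection."""
    return "minimal" if old_is_gemini_3_flash_model(model) else "low"


# Consequence 1: the alias is rejected even though it resolves to Gemini 3.
try:
    check_thinking_level("gemini-flash-latest", "low")
except ValueError as exc:
    print(exc)

# Consequence 2: the 3.5 point release misses the flash default.
print(default_thinking_level("gemini-3.5-flash"))  # prints "low"
```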

Fix

A shared _GEMINI_3_FLASH_ALIASES constant plus alias/3.x-aware helpers:

_is_gemini_3_model now matches any gemini-3 substring (covers gemini-3.5-flash, gemini-3.1-flash-lite) and the version-less aliases.
_is_gemini_3_flash_model is True for any 3.x flash model and both aliases.
The thinking_config block is unchanged — correcting the helpers makes the aliases take the Gemini-3 branch (no ValueError) and gives gemini-3.5-flash the "minimal" flash default.

Tests

Expanded the parametrized cases in tests/test_google_thought_signatures.py for all three helpers to cover the aliases, gemini-3.5-flash, and gemini-3.1-flash-lite (e.g. _is_gemini_3_model("gemini-flash-latest") → True, _is_gemini_3_flash_model("gemini-3.5-flash") → True, plus _is_gemini_3_flash_model("gemini-3-pro-preview") → False to keep pro models out of the flash path).


CLAassistant · 2026-06-08T17:05:59Z

All committers have signed the CLA.


ngoanpv · 2026-06-08T17:49:58Z

The latest commit (ea985d9) resolves the flagged model-detection findings in llm.py:

_is_gemini_3_model / _is_gemini_3_flash_model now recognize the version-less aliases gemini-flash-latest / gemini-flash-lite-latest (via a shared _GEMINI_3_FLASH_ALIASES constant) and the gemini-3.5-flash / gemini-3.1-flash-lite point releases.
As a result, the thinking_config block no longer raises ValueError("does not support thinking_level") for those aliases — they now correctly take the Gemini-3 branch — and gemini-3.5-flash gets the "minimal" flash thinking default instead of falling back to "low".

Expanded the parametrized cases in tests/test_google_thought_signatures.py for all three helpers to cover the aliases, gemini-3.5-flash, and gemini-3.1-flash-lite.

devin-ai-integration

Devin Review found 1 new potential issue.

View 3 additional findings in Devin Review.

devin-ai-integration · 2026-06-08T17:53:24Z

🚩 _thought_signatures dict grows unboundedly across chat sessions

The _thought_signatures dict at llm.py:249 lives on the LLM instance and accumulates entries from every _parse_part call across all LLMStream instances. There is no eviction mechanism. For long-running agents with many multi-turn function-calling interactions, this dict could grow unboundedly. This is a pre-existing issue (not introduced by this PR) but worth noting since the PR's removal of the model-name guard technically makes it possible for any model that emits thought_signatures to contribute entries (in practice, only the same set of models as before).
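For illustration only, one way such growth could be capped is an oldest-first bounded store. This is not part of the PR; `BoundedSignatureStore` and its `max_entries` default are hypothetical:

```python
from collections import OrderedDict
from typing import Optional


class BoundedSignatureStore:
    """Hypothetical capped call_id -> signature store (oldest entry evicted first)."""

    def __init__(self, max_entries: int = 1024) -> None:
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self._max_entries = max_entries
        self._entries: OrderedDict[str, bytes] = OrderedDict()

    def put(self, call_id: str, signature: bytes) -> None:
        self._entries[call_id] = signature
        self._entries.move_to_end(call_id)  # re-storing refreshes the entry
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # drop the oldest entry

    def get(self, call_id: str) -> Optional[bytes]:
        return self._entries.get(call_id)

    def __len__(self) -> int:
        return len(self._entries)


store = BoundedSignatureStore(max_entries=2)
store.put("call_a", b"1")
store.put("call_b", b"2")
store.put("call_c", b"3")  # evicts call_a
```

The trade-off is that a very old function call echoed back after eviction would lose its signature, so any cap would need to be sized well above the longest expected conversation.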

(Refers to line 249)



fix(plugins-google): preserve thought_signature for version-less Gemini aliases

a1b018e


fix(plugins-google): narrow thought_signature type for mypy (keep response-driven store)

91746e2


ngoanpv added 2 commits June 9, 2026 00:34

test(plugins-google): cover gemini-3.5-flash + 3.1-flash-lite for thought_signature handling

dc1e35f

fix(plugins-google): detect version-less aliases + gemini-3.5-flash in model helpers (thinking_config + flash defaults)

ea985d9

devin-ai-integration Bot reviewed Jun 8, 2026


ci: re-trigger flaky room-server tests (test_room LiveKit server startup timeout; unrelated to this change)

8562923

ngoanpv wants to merge 5 commits into livekit:main from ngoanpv:fix/google-thought-signature-version-less-aliases


2 participants
