feat: capture Gemini cache and thinking tokens in instrument_google_genai #1961
Open
JonathanTsen wants to merge 3 commits into
…enai Patches the upstream OTel GoogleGenAiSdkInstrumentor to also extract cached_content_token_count, thoughts_token_count and tool_use_prompt_token_count from response.usage_metadata, and computes operation.cost via genai-prices when available. Closes the gap where direct google.genai users (not going through pydantic-ai) were missing cache and thinking metrics in their spans.
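The field-to-attribute mapping described in this commit can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the helper name `extra_usage_attributes` is hypothetical, and a `SimpleNamespace` stands in for a real `response.usage_metadata` object. Only the field and attribute names come from the PR.

```python
from types import SimpleNamespace

# Gemini usage_metadata field -> span attribute, as named in this PR.
USAGE_FIELD_TO_ATTRIBUTE = {
    "cached_content_token_count": "gen_ai.usage.cache_read.input_tokens",
    "thoughts_token_count": "gen_ai.usage.details.thoughts_tokens",
    "tool_use_prompt_token_count": "gen_ai.usage.details.tool_use_prompt_tokens",
}


def extra_usage_attributes(usage_metadata):
    """Hypothetical helper: collect only the counts the response carries."""
    attributes = {}
    if usage_metadata is None:
        # Early return: the response has no usage_metadata at all.
        return attributes
    for field, attribute in USAGE_FIELD_TO_ATTRIBUTE.items():
        value = getattr(usage_metadata, field, None)
        if value:  # skip missing, None and 0 counts
            attributes[attribute] = value
    return attributes


# Stand-in for response.usage_metadata (assumed shape, illustrative numbers).
usage = SimpleNamespace(
    prompt_token_count=1200,
    cached_content_token_count=1000,
    thoughts_token_count=64,
    tool_use_prompt_token_count=None,
)
attrs = extra_usage_attributes(usage)
print(attrs)

# prompt_token_count already includes cached tokens, so the uncached part
# is a difference, not a sum.
uncached_prompt_tokens = usage.prompt_token_count - usage.cached_content_token_count
print(uncached_prompt_tokens)
```

Because `prompt_token_count` already includes the cached tokens, the cache count is reported as its own attribute rather than added to the input total.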
Codecov Report: ✅ All modified and coverable lines are covered by tests.
Adds a test that exercises the early-return branch when the Gemini response has no usage_metadata, and marks the defensive `usage_data.model is not None` check (unreachable in practice: the google extractor raises LookupError for unknown models rather than returning model=None) with `# pragma: no branch`. Restores 100% coverage broken by the previous commit.
References the official Gemini API pages for context caching, thinking tokens, function calling, and pricing, plus the python-genai field description that documents prompt_token_count already including cached tokens.
Contributor
1 issue found across 1 file (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="docs/integrations/llms/google-genai.md">
<violation number="1" location="docs/integrations/llms/google-genai.md:59">
P2: Documentation for `operation.cost` should clarify that the attribute is only present when `genai-prices` is installed and the model pricing is known, since the implementation silently skips cost calculation otherwise.</violation>
</file>
- `gen_ai.usage.cache_read.input_tokens` — tokens served from [context cache](https://ai.google.dev/gemini-api/docs/caching) (cache hit)
- `gen_ai.usage.details.thoughts_tokens` — [reasoning tokens](https://ai.google.dev/gemini-api/docs/thinking) (Gemini 2.5 / 3.x)
- `gen_ai.usage.details.tool_use_prompt_tokens` — tokens used for [tool definitions](https://ai.google.dev/gemini-api/docs/function-calling)
- `operation.cost` — calculated price in USD using the [official Gemini pricing tables](https://ai.google.dev/gemini-api/docs/pricing) via [`genai-prices`](https://pypi.org/project/genai-prices/)
Contributor
P2: Documentation for operation.cost should clarify that the attribute is only present when genai-prices is installed and the model pricing is known, since the implementation silently skips cost calculation otherwise.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At docs/integrations/llms/google-genai.md, line 59:
<comment>Documentation for `operation.cost` should clarify that the attribute is only present when `genai-prices` is installed and the model pricing is known, since the implementation silently skips cost calculation otherwise.</comment>
<file context>
@@ -53,10 +53,13 @@ following attributes may appear depending on the response:
+- `gen_ai.usage.cache_read.input_tokens` — tokens served from [context cache](https://ai.google.dev/gemini-api/docs/caching) (cache hit)
+- `gen_ai.usage.details.thoughts_tokens` — [reasoning tokens](https://ai.google.dev/gemini-api/docs/thinking) (Gemini 2.5 / 3.x)
+- `gen_ai.usage.details.tool_use_prompt_tokens` — tokens used for [tool definitions](https://ai.google.dev/gemini-api/docs/function-calling)
+- `operation.cost` — calculated price in USD using the [official Gemini pricing tables](https://ai.google.dev/gemini-api/docs/pricing) via [`genai-prices`](https://pypi.org/project/genai-prices/)
Note that, unlike Anthropic, the Gemini API's `prompt_token_count` already includes
</file context>
Suggested change
- `operation.cost` — calculated price in USD using the [official Gemini pricing tables](https://ai.google.dev/gemini-api/docs/pricing) via [`genai-prices`](https://pypi.org/project/genai-prices/)
- `operation.cost` — calculated price in USD using the [official Gemini pricing tables](https://ai.google.dev/gemini-api/docs/pricing) via [`genai-prices`](https://pypi.org/project/genai-prices/) (only present when the package is installed and model pricing is known)
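To illustrate why the attribute can be absent, here is a hedged sketch of the "silent skip" behaviour the review comment describes. The helper `set_cost_if_available` and the three stand-in callables are hypothetical. The real integration calls `genai-prices`, whose API is deliberately not reproduced here, and the exact exception types it handles are an assumption.

```python
def set_cost_if_available(attributes, compute_cost):
    """Hypothetical helper: add operation.cost only when pricing succeeds."""
    try:
        cost = compute_cost()
    except (ImportError, LookupError):
        # Assumed failure modes: the pricing package is not installed, or
        # the model has no known pricing. Skip silently in both cases.
        return attributes
    attributes["operation.cost"] = float(cost)
    return attributes


def priced():
    # Stand-in for a successful price lookup (illustrative value).
    return 0.00042


def unknown_model():
    raise LookupError("no pricing for this model")


def package_missing():
    raise ImportError("genai-prices is not installed")


with_cost = set_cost_if_available({}, priced)
no_pricing = set_cost_if_available({}, unknown_model)
no_package = set_cost_if_available({}, package_missing)
print(with_cost, no_pricing, no_package)
```

Anything reading the span should therefore treat `operation.cost` as optional, for example with `attributes.get("operation.cost")`.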
Summary
- Patches `_GenerateContentInstrumentationHelper` to extract `cached_content_token_count`, `thoughts_token_count` and `tool_use_prompt_token_count` from `response.usage_metadata` and emit them as `gen_ai.usage.cache_read.input_tokens`, `gen_ai.usage.details.thoughts_tokens` and `gen_ai.usage.details.tool_use_prompt_tokens`.
- Computes `operation.cost` via `genai-prices` when available (silent failure if the package is missing or the model is unknown).
- Closes the gap where direct `google.genai` users (not going through `pydantic-ai`) were missing cache and thinking metrics in their spans.

Implementation notes
- […] (`_maybe_update_token_counts`, `create_final_attributes`), following the same pattern already used elsewhere in `logfire/_internal/integrations/google_genai.py`. Wrapped in `try/except` at module load so a future upstream rename keeps the base instrumentor working.
- […] `opentelemetry-instrumentation-google-genai` 0.7b0; current pinning in `pyproject.toml` is `>= 0.4b0`.
- […] `None`/`0`, so partial chunks don't overwrite a final value.
- The Gemini API's `prompt_token_count` already includes cached tokens, so we expose `cache_read` separately rather than summing.
- `genai-prices` is invoked with `response.model_dump(by_alias=True)` because the extractor expects camelCase JSON keys (`usageMetadata`, `modelVersion`).

Test plan
- `uv run pytest tests/otel_integrations/test_google_genai.py`: 8 passing (3 new + 5 existing; existing VCR snapshots updated to include the new attributes that real Gemini responses already carry).
- `make lint`
- `make typecheck`
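The streaming note in the implementation notes (skipping `None`/`0` so partial chunks don't overwrite a final value) can be sketched as below. This is an assumption-laden illustration: `update_counts` is a hypothetical name, and plain dicts stand in for the per-chunk `usage_metadata` objects.

```python
def update_counts(totals, chunk_counts):
    """Hypothetical helper: keep the last meaningful value for each field."""
    for name, value in chunk_counts.items():
        if value:  # None and 0 never overwrite an earlier count
            totals[name] = value
    return totals


# Illustrative stream: counts appear in some chunks and are empty in others.
stream = [
    {"thoughts_token_count": None, "cached_content_token_count": 1000},
    {"thoughts_token_count": 64, "cached_content_token_count": 0},
    {"thoughts_token_count": None, "cached_content_token_count": None},
]

totals = {}
for chunk_counts in stream:
    update_counts(totals, chunk_counts)
print(totals)
```

With a plain assignment instead of the guard, the last chunk would reset both counts to `None`.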