Update OpenTelemetry GenAI conventions to latest#7589

Draft: jeffhandley wants to merge 1 commit into dotnet:main from jeffhandley:otel-genai-reasoning-tool-duration

jeffhandley (Member) commented Jun 26, 2026

Draft pending the first release of open-telemetry/semantic-conventions-genai. Both conventions are at Development stability and unreleased; merge once they ship in a tagged release.

What this PR implements

| Area | Convention | Upstream | Compensating change |
| --- | --- | --- | --- |
| gen-ai | `gen_ai.request.reasoning.level` | semantic-conventions-genai#258 | Emit on chat spans from `ChatOptions.Reasoning.Effort` in `OpenTelemetryChatClient` (normalized lowercase token; exact provider wire string may differ, e.g. OpenAI `xhigh`). |
| gen-ai/agent | `gen_ai.execute_tool.duration` | #201 + #322 | New histogram recorded in `FunctionInvocationProcessor` (`gen_ai.tool.name`, `gen_ai.tool.type`, `error.type`). Meter shared via `GetService(typeof(Meter))` on the chat + realtime OTel clients. |
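
The reasoning-level mapping described above can be sketched roughly as follows. This is an illustrative Python sketch, not the C# code in this PR: the `ReasoningEffort` enum here is a hypothetical stand-in, the `extra_high` token is an assumed spelling (the PR only says the token is normalized and lowercase), and omitting the attribute when no effort is set is also an assumption.

```python
from enum import Enum
from typing import Dict, Optional

# Attribute name taken from the PR description.
REASONING_LEVEL_KEY = "gen_ai.request.reasoning.level"


class ReasoningEffort(Enum):
    """Hypothetical stand-in for the effort setting on the chat options."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    EXTRA_HIGH = 4  # assumed member; a provider may call this e.g. "xhigh" on the wire


def reasoning_level_tag(effort: Optional[ReasoningEffort]) -> Dict[str, str]:
    """Return the span attribute for the requested effort, or nothing if unset.

    The token is derived from the enum member name rather than from any
    provider-specific wire string, so it stays stable across providers.
    """
    if effort is None:
        # Assumption: no attribute is emitted when the caller did not set an effort.
        return {}
    return {REASONING_LEVEL_KEY: effort.name.lower()}
```

Deriving the token from the enum member rather than from the provider request keeps the telemetry value independent of how each provider spells its levels, which is the distinction the table draws with the OpenAI `xhigh` example.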

Validation: build clean (net8.0/net9.0/net10.0, 0 warnings); 677 AI + 103 realtime tests pass. No public API surface change (constants/helpers internal); doc-comment version reference left at v1.41 (GenAI repo has no release to bump to yet).

Upstream scan tracking

Legend: 🔴 implemented here · ✅ already aligned · 🟡 watch/deferred · 🟢 not applicable (no client / docs-only)

Merged upstream changes (Unreleased) -- applicability to dotnet/extensions

| Upstream PR | Area | Change | Applicability | Status |
| --- | --- | --- | --- | --- |
| #258 | gen-ai | gen_ai.request.reasoning.level | 🔴 | Implemented |
| #201 | gen-ai/agent | execute_tool.duration + invoke_agent.duration | 🔴 | execute_tool.duration implemented; invoke_agent.duration N/A (no invoke_agent span) |
| #322 | gen-ai/agent | align invoke_agent.internal/execute_tool attrs+metrics | 🔴 | Informs execute_tool.duration tag set (implemented) |
| #217 | gen-ai | top_k double->int, split retrieval.top_k | ✅ | `ChatOptions.TopK` is already `int?` |
| #257 | gen-ai | limit system_instructions to text parts | ✅ | Emits Instructions as a single text part |
| #219 | gen-ai | conversation.id no fallback UUIDs | ✅ | Uses real conversation id only |
| #142 | gen-ai | document modality in message schema | 🟡 | Watch (additive message serialization) |
| #162 | gen-ai | conversation.compacted + CompactionPart | 🟡 | Deferred (no compaction API) |
| #179 | gen-ai | prompt.version / prompt.variable | 🟡 | Deferred (no prompt-template API) |
| #211 | gen-ai | billed vs consumed token counts | 🟡 | Watch (clarification) |
| #214 | gen-ai | provider.name cond-required on operation.duration | 🟡 | Watch (we always set provider.name) |
| #212 | gen-ai | generalize provider.name description | 🟢 | Docs-only |
| #216 | gen-ai | span covers duration incl. retries | 🟢 | Docs-only |
| #99 | gen-ai | add moonshot_ai provider value | 🟢 | provider.name is passthrough |
| #126 | gen-ai/agent | workflow.duration metric | 🟢 | No invoke_workflow span emitted here |
| #97 | gen-ai/agent | plan operation | 🟢 | No plan/agent span |
| #107 | gen-ai/agent | agent.name sampling-relevant | 🟢 | No agent spans |
| #242 | gen-ai/agent | limit agent.id to stable ids | 🟢 | agent.id not emitted |
| #289 | gen-ai/agent | remove provider.name from internal agent spans | 🟢 | No internal agent span |
| #321 | gen-ai/agent | invoke_agent.duration in-proc clarification | 🟢 | No invoke_agent span |
| #140 | gen-ai | memory operation spans/attrs | 🟢 | No memory instrumentation |
| #136 | mcp | tool.call.arguments/result opt-in | 🟢 | No MCP instrumentation |
| #220 | mcp | context propagation (SEP-414) | 🟢 | No MCP instrumentation |
| #330 | tooling | complex-attr JSON serialization templates | 🟢 | Upstream authoring only |

In-flight upstream changes (open PRs) -- applicability if merged

Filter: open PRs proposing convention changes (excludes currently-open pure dependency/CI/chore PRs: #112, #290, #328, #340; since the prior scan, chore #342 merged and #282 closed unmerged).

| Upstream PR | Area | Change | Applicability | Status |
| --- | --- | --- | --- | --- |
| #98 | gen-ai/agent | model A2A handoff as execute_tool span | 🟡 | Watch (we own execute_tool) |
| #215 | gen-ai | clarify client.operation.duration scope | 🟡 | Watch (we emit it) |
| #197 | gen-ai | modality/cache/phase token-usage breakdowns | 🟡 | Watch (token usage attrs) |
| #96 | gen-ai | token.cache / token.reasoning metric attrs | 🟡 | Watch (token usage) |
| #144 | gen-ai | input-messages BlobPart optional + stripped_reason | 🟡 | Watch (message content) |
| #143 | gen-ai | byte_size on multimodal content parts | 🟡 | Watch (message content) |
| #341 | gen-ai/agent | rename workflow.duration -> invoke_workflow.duration | 🟢 | N/A (not emitted here) |
| #164 | gen-ai | server.inter_token_latency metric | 🟢 | N/A (server-side) |
| #336, #350, #270, #267, #252, #250, #238, #291, #202, #165, #351, #325 | gen-ai/agent | agent entity / identity / finish_reason / invocation / server / authorization / threat / content-size + agentic reference scenarios | 🟢 | N/A (M.E.AI doesn't own agent/invoke_agent spans) |
| #185, #184 | gen-ai/eval | evaluation operation/span + response.id | 🟢 | N/A (no eval instrumentation) |
| #195, #188, #190, #262 | gen-ai | a2a protocol, workflow node, context-selection event, guardrail/security span | 🟢 | N/A (not instrumented here) |
| #283, #324 | anthropic/aws-bedrock | provider reference scenarios | 🟢 | N/A (other SDK repos) |

Tracking state

Upstream-Repo: open-telemetry/semantic-conventions-genai
Upstream-Scan-Ref: e153ed94728993acd0ee6a958559032ca8b20afe  # since 377226a: only chore #342 (agents.md), no convention change
Upstream-Scan-Date: 2026-06-26T16:06:34Z
Upstream-Release: none            # Unreleased; Towncrier fragments under changelog.d/
Core-Semconv-Dependency: v1.42.0
DotnetExtensions-Implemented-Version: v1.41

…elemetry

- Emit gen_ai.request.reasoning.level on chat spans from ChatOptions.Reasoning.Effort
  in OpenTelemetryChatClient, mapping ReasoningEffort to a normalized lowercase token.
- Add the gen_ai.execute_tool.duration histogram: expose the Meter via GetService on
  OpenTelemetryChatClient and OpenTelemetryRealtimeClientSession; FunctionInvokingChatClient
  and FunctionInvokingRealtimeClientSession retrieve it and build the histogram;
  FunctionInvocationProcessor records the duration with gen_ai.tool.name, gen_ai.tool.type,
  and error.type.
- Add OpenTelemetryConsts.GenAI.Request.ReasoningLevel and GenAI.ExecuteTool.Duration
  constants plus the CreateGenAIExecuteToolDurationHistogram helper.
- Augment OpenTelemetryChatClientTests for reasoning.level and add the
  ExecuteToolDurationMetricRecorded test.
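
As a rough illustration of the recording step in the bullets above, the following Python sketch times a tool invocation and records it with the three tags the commit names. It is not the C# implementation: `InMemoryHistogram` and `invoke_tool_with_duration` are hypothetical names, the `"function"` tool type and the seconds unit are assumptions, and using the exception class name for `error.type` is one plausible choice rather than a statement of what the PR does.

```python
import time
from typing import Any, Callable, Dict, List, Tuple


class InMemoryHistogram:
    """Hypothetical stand-in for a metrics histogram; it only stores measurements."""

    def __init__(self, name: str, unit: str) -> None:
        self.name = name
        self.unit = unit
        self.measurements: List[Tuple[float, Dict[str, str]]] = []

    def record(self, value: float, tags: Dict[str, str]) -> None:
        self.measurements.append((value, dict(tags)))


def invoke_tool_with_duration(
    histogram: InMemoryHistogram,
    tool_name: str,
    tool: Callable[..., Any],
    *args: Any,
    tool_type: str = "function",  # assumed default tool type
    clock: Callable[[], float] = time.perf_counter,
) -> Any:
    """Run the tool and record its duration whether it succeeds or fails."""
    tags = {"gen_ai.tool.name": tool_name, "gen_ai.tool.type": tool_type}
    start = clock()
    try:
        return tool(*args)
    except Exception as exc:
        # error.type is only attached when the invocation fails.
        tags["error.type"] = type(exc).__name__
        raise
    finally:
        # Runs on both paths, so failed invocations are measured too.
        histogram.record(clock() - start, tags)


# Usage: one successful call produces one measurement without error.type.
histogram = InMemoryHistogram("gen_ai.execute_tool.duration", "s")
result = invoke_tool_with_duration(histogram, "add", lambda a, b: a + b, 2, 3)
```

Recording in the `finally` branch is what lets a single code path cover success and failure, with `error.type` distinguishing the two in the resulting histogram.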

Implements open-telemetry/semantic-conventions-genai#258 (reasoning.level) and
open-telemetry/semantic-conventions-genai#201 / #322 (execute_tool.duration), both
unreleased and at Development stability.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

dotnet-comment-bot (Collaborator) commented

‼️ Found issues ‼️

| Project | Coverage Type | Expected | Actual |
| --- | --- | --- | --- |
| Microsoft.Extensions.Diagnostics.Testing | Line | 99 | 98.65 🔻 |
| Microsoft.Extensions.Telemetry | Line | 93 | 91.95 🔻 |
| Microsoft.Extensions.AI | Line | 89 | 88.56 🔻 |
| Microsoft.Extensions.AI | Branch | 89 | 88.65 🔻 |
| Microsoft.Extensions.AI.OpenAI | Line | 75 | 62.89 🔻 |
| Microsoft.Extensions.AI.OpenAI | Branch | 75 | 50.41 🔻 |
| Microsoft.Extensions.DataIngestion.MarkItDown | Line | 75 | 4.46 🔻 |
| Microsoft.Extensions.DataIngestion.MarkItDown | Branch | 75 | 0 🔻 |
| Microsoft.Extensions.Diagnostics.ResourceMonitoring | Line | 99 | 96.03 🔻 |
| Microsoft.Extensions.Diagnostics.ResourceMonitoring | Branch | 99 | 94.39 🔻 |
| Microsoft.Extensions.Diagnostics.ResourceMonitoring.Kubernetes | Line | 99 | 97.73 🔻 |
| Microsoft.Extensions.ServiceDiscovery.Dns | Line | 75 | 69.93 🔻 |
| Microsoft.Extensions.ServiceDiscovery.Abstractions | Line | 75 | 42.11 🔻 |
| Microsoft.Extensions.ServiceDiscovery.Abstractions | Branch | 75 | 42.86 🔻 |
| Microsoft.Extensions.ServiceDiscovery | Line | 75 | 67.96 🔻 |
| Microsoft.Extensions.ServiceDiscovery | Branch | 75 | 71.43 🔻 |
| Microsoft.Extensions.ServiceDiscovery.Yarp | Line | 75 | 73.85 🔻 |
| Microsoft.Extensions.ServiceDiscovery.Yarp | Branch | 75 | 70 🔻 |
| Microsoft.Extensions.VectorData.Abstractions | Line | 75 | 37.39 🔻 |
| Microsoft.Extensions.VectorData.Abstractions | Branch | 75 | 22.73 🔻 |

🎉 Good job! The coverage increased 🎉
Update MinCodeCoverage in the project files.

| Project | Expected | Actual |
| --- | --- | --- |
| Microsoft.Gen.BuildMetadata | 97 | 100 |
| Microsoft.Gen.MetadataExtractor | 57 | 73 |
| Microsoft.Gen.MetricsReports | 67 | 69 |
| Microsoft.Extensions.AI.Abstractions | 82 | 85 |
| Microsoft.Extensions.AI.Evaluation.NLP | 0 | 78 |
| Microsoft.Extensions.Caching.Hybrid | 82 | 84 |
| Microsoft.Extensions.DataIngestion | 75 | 89 |
| Microsoft.Extensions.DataIngestion.Markdig | 75 | 90 |
| Microsoft.Extensions.Http.Resilience | 97 | 100 |

Full code coverage report: https://dev.azure.com/dnceng-public/public/_build/results?buildId=1482921&view=codecoverage-tab

jeffhandley added the area-ai and automation labels Jun 26, 2026
jeffhandley changed the title from "Add gen_ai.request.reasoning.level and gen_ai.execute_tool.duration (DRAFT)" to "[automation] Update open-telemetry/semantic-conventions-genai to v1.42.0" Jun 26, 2026
jeffhandley changed the title from "[automation] Update open-telemetry/semantic-conventions-genai to v1.42.0" to "Update open-telemetry/semantic-conventions-genai to v1.42.0" Jun 26, 2026
jeffhandley changed the title from "Update open-telemetry/semantic-conventions-genai to v1.42.0" to "Update OpenTelemetry GenAI conventions to latest" Jun 26, 2026