
Support tool callbacks in MCP sampling #2998

Draft
EronWright wants to merge 2 commits into docker:main from EronWright:sampling-tools

@EronWright commented Jun 4, 2026

Summary

Closes the tool callbacks functional gap in MCP sampling support — a follow-up to #2815, addressing one of the remaining items from #2809.

When an MCP server includes a tools array in a sampling/createMessage request, the host now drives its model with those tools and returns any tool_use blocks back to the server as ToolUseContent. The server remains responsible for executing the tool and continuing the loop in a follow-up sampling request.

sequenceDiagram
      participant H as cagent
      participant S as MCP Server
      participant L as LLM

      activate H
      H->>+S: tools/call {name, arguments}

      note over S: needs LLM inference

      S->>+H: sampling/createMessage<br/>{messages, tools: [...]}
      H->>+L: chat completion
      L-->>-H: ToolUseContent<br/>stopReason: "toolUse"
      H-->>-S: CreateMessageResult<br/>{tool_use, stopReason: "toolUse"}

      note over S: executes tool locally

      S->>+H: sampling/createMessage<br/>{messages + tool_use + tool_result, tools: [...]}
      H->>+L: chat completion
      L-->>-H: TextContent<br/>stopReason: "endTurn"
      H-->>-S: CreateMessageResult<br/>{text, stopReason: "endTurn"}

      S-->>-H: tool result
      deactivate H
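
A minimal sketch of the server-driven loop in the diagram, written against simplified stand-in types rather than the real go-sdk API. `Block`, `Message`, `Result`, `fakeHost`, and `runLoop` are all hypothetical names; `fakeHost` plays the host and LLM together, and placing `tool_result` blocks in a user-role message is an assumption about the wire format.

```go
package main

import "fmt"

// Block is a simplified, hypothetical stand-in for an MCP content block.
type Block struct {
	Type   string // "text", "tool_use", or "tool_result"
	Text   string
	ToolID string
	Name   string
	Args   map[string]float64
}

// Message is a simplified sampling message.
type Message struct {
	Role    string
	Content []Block
}

// Result mirrors the CreateMessageResult shape shown in the diagram.
type Result struct {
	Content    []Block
	StopReason string
}

// fakeHost plays the host and model together: it asks for the "add" tool
// once, then answers in text as soon as the history carries a tool_result.
func fakeHost(msgs []Message) Result {
	for _, m := range msgs {
		for _, b := range m.Content {
			if b.Type == "tool_result" {
				text := Block{Type: "text", Text: "The sum is " + b.Text}
				return Result{Content: []Block{text}, StopReason: "endTurn"}
			}
		}
	}
	call := Block{Type: "tool_use", ToolID: "call-1", Name: "add",
		Args: map[string]float64{"a": 2, "b": 3}}
	return Result{Content: []Block{call}, StopReason: "toolUse"}
}

// runLoop is the server side: sample, execute any requested tool locally,
// and sample again with the tool_use and tool_result appended.
func runLoop(host func([]Message) Result, prompt string, maxRounds int) (string, error) {
	msgs := []Message{{Role: "user", Content: []Block{{Type: "text", Text: prompt}}}}
	for round := 0; round < maxRounds; round++ {
		res := host(msgs)
		if res.StopReason != "toolUse" {
			var out string
			for _, b := range res.Content {
				if b.Type == "text" {
					out += b.Text
				}
			}
			return out, nil
		}
		// Echo the assistant turn, then answer each tool_use with a tool_result.
		msgs = append(msgs, Message{Role: "assistant", Content: res.Content})
		var results []Block
		for _, b := range res.Content {
			if b.Type != "tool_use" {
				continue
			}
			if b.Name != "add" {
				return "", fmt.Errorf("unknown tool %q", b.Name)
			}
			sum := fmt.Sprint(b.Args["a"] + b.Args["b"])
			results = append(results, Block{Type: "tool_result", ToolID: b.ToolID, Text: sum})
		}
		msgs = append(msgs, Message{Role: "user", Content: results})
	}
	return "", fmt.Errorf("no final answer after %d rounds", maxRounds)
}

func main() {
	out, err := runLoop(fakeHost, "What is 2 + 3?", 4)
	fmt.Println(out, err)
}
```

The round cap in `runLoop` is a design choice of this sketch, not something the PR prescribes; it only illustrates that the server, which owns the loop, is also the side that has to bound it.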

What's new

  • New SamplingWithToolsHandler type and SampleableWithTools interface. Both are additive and sit parallel to the existing SamplingHandler / Sampleable, with no breaking changes to the basic sampling path merged in #2815 (feat(mcp): add sampling/createMessage support).
  • MCP toolset wires both handler types. At Initialize, exactly one of the SDK's mutually exclusive ClientOptions.CreateMessage* fields is populated — prefer with-tools when registered, fall back to basic.
  • Capability handshake advertises sampling.tools so servers know the host can receive tool-enabled requests.
  • Runtime handler (pkg/runtime/sampling.go):
    • Converts V2 multi-block messages: text, image/audio, tool_use → assistant ToolCalls, tool_result → MessageRoleTool rows (parallel tool_results expand to multiple chat.Message rows).
    • Converts []*mcp.Tool → []tools.Tool with a no-op handler (the server, not the host, executes).
    • Drives model.CreateChatCompletionStream, aggregates streamed tool calls.
    • Builds result Content with TextContent + ToolUseContent blocks; stopReason: "toolUse" when tool calls are present.
  • New limits: maxSamplingTools=64, maxSamplingToolCalls=32.
  • End-to-end test (e2e/sampling_test.go): mounts an in-process gomcp.NewServer on an httptest server via StreamableHTTPHandler. The server exposes one tool (ask_with_calculator) whose handler drives a real sampling-with-tools loop against the connecting cagent. The Gemini side is recorded once and replayed on subsequent runs, so the test runs offline in CI.
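
The stream-aggregation, stop-reason, and limit bullets above can be sketched as follows. This is an illustration under stated assumptions, not the code in pkg/runtime/sampling.go: `Delta`, `ToolCall`, `aggregate`, and `stopReason` are hypothetical names, deltas are assumed to be keyed by a per-call index, and returning an error when the limit is exceeded (rather than truncating) is a guess at the policy. Only the constant name and value come from the description.

```go
package main

import "fmt"

// maxSamplingToolCalls mirrors the limit named in the PR description.
const maxSamplingToolCalls = 32

// Delta is a hypothetical streamed fragment of one tool call.
type Delta struct {
	Index    int    // which tool call this fragment belongs to
	ID       string // usually present only on the first fragment
	Name     string // usually present only on the first fragment
	ArgsPart string // partial JSON arguments
}

// ToolCall is the fully assembled call returned to the server.
type ToolCall struct {
	ID, Name, Args string
}

// aggregate merges fragments by index, preserving first-seen order, and
// rejects responses that carry more calls than the limit allows.
func aggregate(deltas []Delta) ([]ToolCall, error) {
	calls := map[int]*ToolCall{}
	var order []int
	for _, d := range deltas {
		c, ok := calls[d.Index]
		if !ok {
			if len(order) >= maxSamplingToolCalls {
				return nil, fmt.Errorf("more than %d tool calls in one response", maxSamplingToolCalls)
			}
			c = &ToolCall{}
			calls[d.Index] = c
			order = append(order, d.Index)
		}
		if d.ID != "" {
			c.ID = d.ID
		}
		if d.Name != "" {
			c.Name = d.Name
		}
		c.Args += d.ArgsPart
	}
	out := make([]ToolCall, 0, len(order))
	for _, i := range order {
		out = append(out, *calls[i])
	}
	return out, nil
}

// stopReason reports "toolUse" whenever the model asked for at least one tool.
func stopReason(calls []ToolCall) string {
	if len(calls) > 0 {
		return "toolUse"
	}
	return "endTurn"
}

func main() {
	calls, err := aggregate([]Delta{
		{Index: 0, ID: "c1", Name: "add", ArgsPart: `{"a":`},
		{Index: 0, ArgsPart: `2}`},
	})
	fmt.Println(calls, err, stopReason(calls))
}
```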

Out of scope (separate gaps from #2809)

  • Human-in-the-loop approval UI
  • Model-preference hints

Test plan

  • Unit tests pass: `go test ./pkg/runtime/... ./pkg/tools/...`
  • `go build ./...` clean
  • `go vet ./...` clean
  • `task lint` (0 offenses)
  • `gofmt` clean on all changed files
  • End-to-end via `TestExec_Gemini_SamplingWithTools` (cassette replays offline; verified once against live Gemini):
    • Handshake includes `sampling.tools` capability (implicit — `ServerSession.CreateMessageWithTools` would refuse otherwise)
    • LLM receives the server-supplied tools
    • Response contains `ToolUseContent` with `stopReason: "toolUse"` when the model emits a tool_use
    • Follow-up sampling request with `tool_result` blocks is converted correctly and the loop terminates with `endTurn`

Adds a parallel SamplingWithToolsHandler alongside the existing
SamplingHandler so MCP servers can include a tools array in
sampling/createMessage requests. The host drives its model with those
tools and returns any tool_use blocks as ToolUseContent; the server
remains responsible for executing the tool and continuing the loop in a
follow-up sampling request.

The initialize handshake now advertises sampling.tools capability, and
the MCP toolset selects the appropriate go-sdk handler (basic vs.
with-tools) based on which handler is registered.
@aheritier added the area/agent, area/tools, area/mcp, and kind/feat labels Jun 4, 2026
Mounts an in-process gomcp.NewServer on an httptest server via
StreamableHTTPHandler. Its one tool, ask_with_calculator, runs a
sampling loop: sends sampling/createMessage with a calculator tool,
gets a tool_use back from the host LLM, "executes" the calculator,
sends a follow-up sampling request carrying the tool_result, and
returns the final text. The Gemini side is recorded once and replayed
on subsequent runs, so the test runs offline in CI.
@EronWright marked this pull request as ready for review June 7, 2026 00:56
@EronWright requested a review from a team as a code owner June 7, 2026 00:56
@aheritier added the area/providers For features/issues/fixes related to LLM providers (Bedrock, LiteLLM, Qwen, custom, etc.) label Jun 7, 2026
@aheritier (Contributor)

/review

@docker-agent left a comment

Assessment: 🟡 NEEDS ATTENTION

Two medium-severity findings in the newly-added sampling-with-tools code. The core stream-aggregation logic, capability handshake, content building, and limits enforcement are all well-structured and correctly tested.

Comment thread pkg/runtime/sampling.go
}

var out []chat.Message
if text.Len() > 0 || len(parts) > 0 || (len(toolCalls) > 0 && role == chat.MessageRoleAssistant) {

[MEDIUM] Silent message drop when tool_use blocks appear in a non-assistant role

samplingV2BlocksToMessages appends tool_use blocks to toolCalls, but only emits them into an output message when role == chat.MessageRoleAssistant:

if text.Len() > 0 || len(parts) > 0 || (len(toolCalls) > 0 && role == chat.MessageRoleAssistant) {

If a V2 message arrives with a non-assistant role (e.g., user) and contains only tool_use blocks (text, parts, and toolResults are all empty), none of the output conditions is satisfied, so out stays nil and the function returns a nil slice with a nil error. The entire message is silently dropped, leaving the resulting chat history shorter than expected with no indication of why.

While the MCP spec places tool_use on assistant turns, malformed MCP servers (or future spec revisions) could trigger this path. Returning an error instead of silently discarding the content would surface the problem immediately.

Suggested fix: add a guard after the loop — if toolCalls is non-empty and role != MessageRoleAssistant, return an explicit error:

if len(toolCalls) > 0 && role != chat.MessageRoleAssistant {
    return nil, fmt.Errorf("tool_use blocks in non-assistant message (role=%s)", role)
}
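
To show where the guard would sit, here is a reduced sketch of the conversion with the check placed after the block loop. It uses hypothetical stand-in types (`Block`, `ChatMessage`, `blocksToMessages`), not the real chat or mcp packages, and the ordering of the emitted rows is an assumption; only the guard condition and error message follow the suggestion above.

```go
package main

import "fmt"

// Block is a simplified stand-in for a V2 sampling content block.
type Block struct {
	Type string // "text", "tool_use", or "tool_result"
	Text string
	ID   string // tool call ID for tool_use and tool_result blocks
}

// ChatMessage is a simplified stand-in for chat.Message.
type ChatMessage struct {
	Role       string
	Content    string
	ToolCalls  []string // IDs of the tool calls requested by the assistant
	ToolCallID string   // set on "tool" rows
}

// blocksToMessages converts one multi-block message into chat rows.
// Each tool_result becomes its own "tool" row, so parallel results expand.
func blocksToMessages(role string, blocks []Block) ([]ChatMessage, error) {
	var text string
	var toolCalls []string
	var results []ChatMessage
	for _, b := range blocks {
		switch b.Type {
		case "text":
			text += b.Text
		case "tool_use":
			toolCalls = append(toolCalls, b.ID)
		case "tool_result":
			results = append(results, ChatMessage{Role: "tool", Content: b.Text, ToolCallID: b.ID})
		default:
			return nil, fmt.Errorf("unsupported block type %q", b.Type)
		}
	}
	// The guard: fail loudly instead of dropping tool_use blocks.
	if len(toolCalls) > 0 && role != "assistant" {
		return nil, fmt.Errorf("tool_use blocks in non-assistant message (role=%s)", role)
	}
	var out []ChatMessage
	if text != "" || len(toolCalls) > 0 {
		out = append(out, ChatMessage{Role: role, Content: text, ToolCalls: toolCalls})
	}
	return append(out, results...), nil
}

func main() {
	_, err := blocksToMessages("user", []Block{{Type: "tool_use", ID: "c1"}})
	fmt.Println(err)
}
```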

Comment thread pkg/tools/mcp/remote.go
PromptListChangedHandler: promptChanged,
}
switch {
case c.samplingWithToolsHandler != nil:

[MEDIUM] Handler selection baked at Initialize time — reconnect may silently lose sampling-with-tools capability

The SDK CreateMessageWithToolsHandler (vs CreateMessageHandler) is chosen once when Initialize runs by reading c.samplingWithToolsHandler at that moment:

switch {
case c.samplingWithToolsHandler != nil:
    opts.CreateMessageWithToolsHandler = c.handleSamplingWithToolsRequest
case c.samplingHandler != nil:
    opts.CreateMessageHandler = c.handleSamplingRequest
}

In the normal startup path this is fine: configureToolsetHandlers (line 221 of loop.go) calls SetSamplingWithToolsHandler before getTools (line 223) triggers Initialize, so the handler is present when the switch executes.

The concern is the reconnect path: if Initialize is invoked again (e.g., after a dropped connection) without configureToolsetHandlers being re-called first, the stored handler fields still hold the right value, so this should be fine in practice. However, if a future refactor re-orders the reconnect sequence or resets the handler fields, the MCP session would silently fall back to no handler — the server's sampling/createMessage requests with tools would receive no response and time out.

Consider using an accessor closure (captured at startup time) instead of reading the field at Initialize, or adding a comment to Initialize noting the ordering dependency so future maintainers know not to call it before the handlers are set.
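
One possible reading of the accessor-closure suggestion, sketched with hypothetical types (`Handler`, `Client`, `samplingAccessor`) rather than the real toolset or SDK: the callback registered with the SDK looks the handler up on each request and returns an explicit error when none is set, so a mis-ordered reconnect fails fast instead of timing out.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Handler is a hypothetical, simplified sampling callback.
type Handler func(req string) (string, error)

// Client is a stand-in for the MCP toolset client.
type Client struct {
	mu        sync.RWMutex
	withTools Handler
}

// SetSamplingWithToolsHandler stores the handler; it may run before or
// after the accessor has been handed to the SDK.
func (c *Client) SetSamplingWithToolsHandler(h Handler) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.withTools = h
}

// samplingAccessor returns a closure that resolves the handler at request
// time, so the result no longer depends on when Initialize ran.
func (c *Client) samplingAccessor() Handler {
	return func(req string) (string, error) {
		c.mu.RLock()
		h := c.withTools
		c.mu.RUnlock()
		if h == nil {
			return "", errors.New("sampling with tools requested but no handler is registered")
		}
		return h(req)
	}
}

func main() {
	c := &Client{}
	handle := c.samplingAccessor() // registered before any handler exists

	_, err := handle("ping")
	fmt.Println("before:", err)

	c.SetSamplingWithToolsHandler(func(req string) (string, error) {
		return "echo: " + req, nil
	})
	out, _ := handle("ping")
	fmt.Println("after:", out)
}
```

A trade-off of this shape is that the with-tools callback would always be registered, so the sampling.tools capability would presumably be advertised even when no handler is set yet. If that is undesirable, the ordering comment on Initialize is the lighter-weight option.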

@aheritier removed the area/providers For features/issues/fixes related to LLM providers (Bedrock, LiteLLM, Qwen, custom, etc.) label Jun 8, 2026
@EronWright marked this pull request as draft June 8, 2026 15:03

Labels

area/agent For work that has to do with the general agent loop/agentic features of the app area/mcp MCP protocol, MCP tool servers, integration area/tools For features/issues/fixes related to the usage of built-in and MCP tools kind/feat PR adds a new feature (maps to feat: commit prefix)

3 participants