
refactor(tool-search): consolidate MCP metadata tag and harden deferred-tool setup#3370

Merged
WillemJiang merged 1 commit into
bytedance:mainfrom
ShenAC-SAC:refactor/tool-search-cleanup
Jun 5, 2026

@ShenAC-SAC
Collaborator

Summary

Follow-up to #3342, which removed the ContextVar from deferred MCP tool loading (build-time closures + per-thread graph state). This is a maintainability + robustness pass over that module. It does not change the deferral mechanism or search ranking.

Why

tool_search is a cost-critical hot path: it keeps MCP tool schemas out of the per-turn tools array until the model promotes them, so it directly affects token usage. Code here should be held to a high bar, so this PR tightens a few rough edges left after #3342.

What changed

Single source of truth for the MCP tag

  • New leaf module deerflow/tools/mcp_metadata.py owns MCP_TOOL_METADATA_KEY, tag_mcp_tool(), and a public is_mcp_tool().
  • The "deerflow_mcp" string was hardcoded in both the loader (tools.py) and the reader (tool_search.py), and agent.py imported the private _is_mcp_tool across modules. A drift in one place would silently disable deferral (falling back to binding full MCP schemas). The key/tagger/predicate now live in one leaf module that any caller — including the loader — can import without an import cycle.
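
A minimal sketch of what such a leaf module could look like. Only the three public names and the "deerflow_mcp" string come from this PR; the function signatures, the `True` tag value, and the `FakeTool` stand-in (a tool object carrying an optional `metadata` dict) are assumptions for illustration, not the merged code.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional

# Tag string as quoted in the PR description; the loader and every reader share it.
MCP_TOOL_METADATA_KEY = "deerflow_mcp"


@dataclass
class FakeTool:
    """Hypothetical stand-in for a tool object with an optional metadata dict."""

    name: str
    metadata: Optional[Dict[str, Any]] = None


def tag_mcp_tool(tool: FakeTool) -> FakeTool:
    """Mark a tool as MCP-sourced, preserving any metadata already present."""
    tool.metadata = {**(tool.metadata or {}), MCP_TOOL_METADATA_KEY: True}
    return tool


def is_mcp_tool(tool: object) -> bool:
    """Return True only for tools that carry the shared MCP tag."""
    metadata = getattr(tool, "metadata", None)
    if not isinstance(metadata, dict):
        return False
    return bool(metadata.get(MCP_TOOL_METADATA_KEY))
```

Because the module imports nothing from the rest of the package, both the loader and the search tool can depend on it without creating a cycle, and a rename of the key happens in exactly one place.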

search() is now total over model input (robustness)

  • Search queries are model-generated, so they must never crash the tool.
  • Extracted _compile_catalog_regex (the compile-with-literal-fallback shared by the regex path and the + scorer).
  • Empty/whitespace queries return [] instead of letting the empty regex match every tool (which would flood the model with arbitrary tools).
  • A bare + ("+", " + ", "+ ") returns [] instead of raising IndexError. An empty result routes through the tool's existing "No tools found" path, so the model gets a clean signal and re-queries, with no exception and no error round-trip.
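
The guards above can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged implementation: the name `_compile_catalog_regex` comes from the PR, but its signature, the `catalog` shape (tool name mapped to description), and the exact semantics of the `+` form (first term required in the name, remaining terms used for scoring) are hypothetical.

```python
import re
from typing import Dict, List


def _compile_catalog_regex(pattern: str) -> "re.Pattern[str]":
    """Compile as a regex; fall back to a literal match if the pattern is invalid."""
    try:
        return re.compile(pattern, re.IGNORECASE)
    except (re.error, OverflowError):
        # Model-generated input such as "(unclosed" is treated as plain text.
        return re.compile(re.escape(pattern), re.IGNORECASE)


def search(query: str, catalog: Dict[str, str]) -> List[str]:
    """Return matching tool names; malformed queries yield [] instead of raising."""
    query = query.strip()
    if not query:
        return []  # an empty regex would otherwise match every tool

    if query.startswith("+"):
        terms = query[1:].split()
        if not terms:
            return []  # bare "+": indexing terms[0] would raise IndexError
        required = _compile_catalog_regex(terms[0])
        extras = [_compile_catalog_regex(term) for term in terms[1:]]
        scored = []
        for name, description in catalog.items():
            if not required.search(name):
                continue
            text = f"{name} {description}"
            score = sum(1 for rx in extras if rx.search(text))
            scored.append((score, name))
        # Highest score first; ties broken by name for a stable order.
        scored.sort(key=lambda item: (-item[0], item[1]))
        return [name for _, name in scored]

    rx = _compile_catalog_regex(query)
    return [name for name, desc in catalog.items() if rx.search(f"{name} {desc}")]
```

Returning an empty list rather than an error keeps the failure on the same path as a legitimate miss, so the caller needs no special handling for bad input.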

Readability

  • DeferredToolSetup documents its empty-vs-populated invariant.
  • build_deferred_tool_setup comments its two distinct empty-return branches.
  • _assemble_deferred gets a return-type annotation, a clearer local name, and an explicit append.

Testing

  • Full backend suite green: 3890 passed, 18 skipped.
  • Added catalog tests for empty and bare-+ queries.

Follow-up plan

This PR deliberately stops at maintainability + robustness and does not change search ranking, because ranking is a behavior change that should be data-driven:

  1. Measure first. Instrument the deferred-tool path via the existing token-usage tracking: per-turn tools-array token cost, how much is hidden pre-promotion, and how large the promoted set grows over a long thread.
  2. Then tune ranking precision (name vs description weighting; avoid description-frequency inflation) — only with a test pinning the desired order, since an over-eager promotion adds a permanent per-turn schema cost for the rest of a thread.
  3. Consider promotion eviction. Promotions only grow within a stable catalog; a long exploratory thread accumulates schemas monotonically. Whether to evict, and with what policy, should be decided from the measurements in step 1.
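
To make the cost concern in steps 2 and 3 concrete, here is a back-of-the-envelope sketch of how promoted schemas accumulate without eviction. The function name, the assumption that a promotion is billed from the following turn onward, and the token figures in the usage note are all hypothetical; real numbers would come from the instrumentation in step 1.

```python
from typing import Dict


def cumulative_schema_tokens(promotions: Dict[int, int], total_turns: int) -> int:
    """Total schema tokens paid over a thread when nothing is ever evicted.

    promotions maps a 1-based turn number to the schema token cost promoted
    during that turn; a promoted schema is billed on every later turn.
    """
    total = 0
    bound = 0  # tokens of promoted schemas currently in the tools array
    for turn in range(1, total_turns + 1):
        total += bound                    # pay for everything promoted so far
        bound += promotions.get(turn, 0)  # new promotions are billed from the next turn
    return total
```

For example, promoting a 300-token schema on turn 2 and a 500-token schema on turn 5 of a 10-turn thread gives `cumulative_schema_tokens({2: 300, 5: 500}, 10) == 4900`: the first schema is paid 8 times and the second 5 times, which is why a single over-eager promotion is more expensive than its one-off size suggests.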

…ed-tool setup

Follow-up to bytedance#3342 (deferred MCP tool loading). Maintainability cleanup plus
hardening of malformed/empty tool_search queries; no change to the deferral
mechanism or search ranking.

- Add deerflow/tools/mcp_metadata.py as the single source of truth for the
  "deerflow_mcp" tag (MCP_TOOL_METADATA_KEY + tag_mcp_tool + public
  is_mcp_tool). Removes the duplicated magic string and the private,
  cross-module _is_mcp_tool import.
- tool_search.search: never raise on model-generated input. Extract
  _compile_catalog_regex (shared compile-with-literal-fallback); return empty
  for empty/whitespace queries and a bare "+" instead of matching everything
  or raising IndexError.
- DeferredToolSetup: document the empty-vs-populated invariant.
- build_deferred_tool_setup: comment the two distinct empty-return branches.
- _assemble_deferred: add return type, rename local to deferred_setup, build
  the final list with an explicit append.
- Tests: use tag_mcp_tool instead of per-file tag helpers; cover empty and
  bare-"+" queries.
@github-actions github-actions Bot added needs-validation Touches front/back contract surface; needs real-path validation risk:high High risk: backend API, agents, sandbox, auth, deps, CI size/M PR changes 100-300 lines area:backend Gateway / runtime / core backend under backend/ area:agents Agents, subagents, graph wiring, prompts, langgraph.json and removed size/M PR changes 100-300 lines risk:high High risk: backend API, agents, sandbox, auth, deps, CI needs-validation Touches front/back contract surface; needs real-path validation labels Jun 3, 2026
@ShenAC-SAC ShenAC-SAC requested a review from WillemJiang June 3, 2026 15:01
@WillemJiang WillemJiang requested a review from Copilot June 3, 2026 23:02
Contributor

Copilot AI left a comment


Pull request overview

This PR refactors DeerFlow’s deferred MCP tool discovery path to reduce drift risk and improve robustness in tool_search, a cost-critical hot path that keeps MCP tool schemas out of the model context until explicitly promoted.

Changes:

  • Introduces a single, shared MCP metadata tagging API (tag_mcp_tool / is_mcp_tool) and updates loader/tests/agent assembly to use it.
  • Hardens deferred tool catalog search to be resilient to invalid regex input and to return no matches for empty/whitespace and bare + queries (instead of accidental broad matches or exceptions), with new targeted tests.
  • Improves readability/typing in deferred-tool setup/assembly (DeferredToolSetup invariant docs; clearer _assemble_deferred construction).

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| backend/tests/test_deferred_tool_crosscontext.py | Updates MCP tagging in cross-context deferred-tool tests to use the new shared tagger. |
| backend/tests/test_deferred_setup.py | Switches MCP tag/predicate tests to the public mcp_metadata helpers. |
| backend/tests/test_deferred_promotion_integration.py | Updates integration test tagging to use tag_mcp_tool. |
| backend/tests/test_deferred_catalog.py | Adds coverage for empty/whitespace and bare + queries in catalog search. |
| backend/packages/harness/deerflow/tools/tools.py | Centralizes MCP tagging via tag_mcp_tool during MCP tool loading. |
| backend/packages/harness/deerflow/tools/mcp_metadata.py | New leaf module defining the MCP metadata key, tagger, and predicate. |
| backend/packages/harness/deerflow/tools/builtins/tool_search.py | Uses shared is_mcp_tool; adds regex compile helper; prevents empty/bare + query pitfalls; improves setup docs. |
| backend/packages/harness/deerflow/agents/lead_agent/agent.py | Uses shared is_mcp_tool, adds typing for _assemble_deferred, and clarifies tool list assembly. |

@ShenAC-SAC ShenAC-SAC added reviewing A maintainer is reviewing this PR and removed area:backend Gateway / runtime / core backend under backend/ area:agents Agents, subagents, graph wiring, prompts, langgraph.json labels Jun 4, 2026
@WillemJiang WillemJiang merged commit 2bbc787 into bytedance:main Jun 5, 2026
13 checks passed
