refactor(tool-search): consolidate MCP metadata tag and harden deferred-tool setup#3370
Merged
WillemJiang merged 1 commit intoJun 5, 2026
Merged
Conversation
…ed-tool setup Follow-up to bytedance#3342 (deferred MCP tool loading). Maintainability cleanup plus hardening of malformed/empty tool_search queries; no change to the deferral mechanism or search ranking. - Add deerflow/tools/mcp_metadata.py as the single source of truth for the "deerflow_mcp" tag (MCP_TOOL_METADATA_KEY + tag_mcp_tool + public is_mcp_tool). Removes the duplicated magic string and the private, cross-module _is_mcp_tool import. - tool_search.search: never raise on model-generated input. Extract _compile_catalog_regex (shared compile-with-literal-fallback); return empty for empty/whitespace queries and a bare "+" instead of matching everything or raising IndexError. - DeferredToolSetup: document the empty-vs-populated invariant. - build_deferred_tool_setup: comment the two distinct empty-return branches. - _assemble_deferred: add return type, rename local to deferred_setup, build the final list with an explicit append. - Tests: use tag_mcp_tool instead of per-file tag helpers; cover empty and bare-"+" queries.
Contributor
There was a problem hiding this comment.
Pull request overview
This PR refactors DeerFlow’s deferred MCP tool discovery path to reduce drift risk and improve robustness in tool_search, a cost-critical hot path that keeps MCP tool schemas out of the model context until explicitly promoted.
Changes:
- Introduces a single, shared MCP metadata tagging API (
tag_mcp_tool/is_mcp_tool) and updates loader/tests/agent assembly to use it. - Hardens deferred tool catalog search to be resilient to invalid regex input and to return no matches for empty/whitespace and bare
+queries (instead of accidental broad matches or exceptions), with new targeted tests. - Improves readability/typing in deferred-tool setup/assembly (
DeferredToolSetupinvariant docs; clearer_assemble_deferredconstruction).
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| backend/tests/test_deferred_tool_crosscontext.py | Updates MCP tagging in cross-context deferred-tool tests to use the new shared tagger. |
| backend/tests/test_deferred_setup.py | Switches MCP tag/predicate tests to the public mcp_metadata helpers. |
| backend/tests/test_deferred_promotion_integration.py | Updates integration test tagging to use tag_mcp_tool. |
| backend/tests/test_deferred_catalog.py | Adds coverage for empty/whitespace and bare + queries in catalog search. |
| backend/packages/harness/deerflow/tools/tools.py | Centralizes MCP tagging via tag_mcp_tool during MCP tool loading. |
| backend/packages/harness/deerflow/tools/mcp_metadata.py | New leaf module defining the MCP metadata key, tagger, and predicate. |
| backend/packages/harness/deerflow/tools/builtins/tool_search.py | Uses shared is_mcp_tool; adds regex compile helper; prevents empty/bare + query pitfalls; improves setup docs. |
| backend/packages/harness/deerflow/agents/lead_agent/agent.py | Uses shared is_mcp_tool, adds typing for _assemble_deferred, and clarifies tool list assembly. |
WillemJiang
approved these changes
Jun 5, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Follow-up to #3342, which removed the ContextVar from deferred MCP tool loading (build-time closures + per-thread graph state). This is a maintainability + robustness pass over that module. It does not change the deferral mechanism or search ranking.
Why
tool_searchis a cost-critical hot path: it keeps MCP tool schemas out of the per-turntoolsarray until the model promotes them, so it directly affects token usage. Code here should be held to a high bar, so this PR tightens a few rough edges left after #3342.What changed
Single source of truth for the MCP tag
deerflow/tools/mcp_metadata.pyownsMCP_TOOL_METADATA_KEY,tag_mcp_tool(), and a publicis_mcp_tool()."deerflow_mcp"string was hardcoded in both the loader (tools.py) and the reader (tool_search.py), andagent.pyimported the private_is_mcp_toolacross modules. A drift in one place would silently disable deferral (falling back to binding full MCP schemas). The key/tagger/predicate now live in one leaf module that any caller — including the loader — can import without an import cycle.search()is now total over model input (robustness)_compile_catalog_regex(the compile-with-literal-fallback shared by the regex path and the+scorer).[]instead of letting the empty regex match every tool (which would flood the model with arbitrary tools).+("+"," + ","+ ") returns[]instead of raisingIndexError. Empty routes through the tool's existing "No tools found" path, so the model gets a clean signal and re-queries — no exception, no error round-trip.Readability
DeferredToolSetupdocuments its empty-vs-populated invariant.build_deferred_tool_setupcomments its two distinct empty-return branches._assemble_deferredgets a return-type annotation, a clearer local name, and an explicit append.Testing
3890 passed, 18 skipped.+queries.Follow-up plan
This PR deliberately stops at maintainability + robustness and does not change search ranking, because ranking is a behavior change that should be data-driven:
tools-array token cost, how much is hidden pre-promotion, and how large the promoted set grows over a long thread.