fix(skills): surface offending line and quoting hint on SKILL.md YAML… by Huixin615 · Pull Request #3335 · bytedance/deer-flow

Huixin615 · 2026-06-01T08:46:55Z

Why

Trigger: Issue #3333 reports that a custom SKILL.md whose description contains an unquoted : (e.g. description: StarRun collector: progress, errors) is rejected by PyYAML with mapping values are not allowed here, and the skill is then silently dropped at load time. The author of the issue also asks whether the failure could be surfaced to users instead of being invisible.

Pain: The existing log only echoes PyYAML's raw message, so authors have to grep the file to find the offending line. This hits especially hard for LLM-authored SKILL.md, where unquoted values containing : are a recurring mistake — every occurrence makes a skill silently disappear with no actionable hint.

What changed

When a SKILL.md front-matter fails YAML parsing, the error log now contains:

the source line PyYAML pointed at (via problem_mark.line), so the author no longer has to open the file to find the offending row, and
a copy-pasteable quoting hint — only when the error is the well-known mapping values are not allowed scanner error on an unquoted value. Other YAML errors are left untouched to avoid misleading the author.

No semantics are guessed or rewritten — the skill is still rejected, the diagnostic is just useful now.

Before

Invalid YAML front-matter in /.../SKILL.md: mapping values are not allowed here
  in "<unicode string>", line 2, column 31

After

Invalid YAML front-matter in /.../SKILL.md: mapping values are not allowed here
  in "<unicode string>", line 2, column 31
  line 2: description: StarRun collector: progress, errors, tables out
  hint: values containing ":" must be quoted, e.g. description: "StarRun collector: progress, errors, tables out"

Surface area

Frontend UI — page / component / setting / interaction under frontend/
Backend API — endpoint / SSE event / request-response shape under backend/app
Agents / LangGraph — agent node, graph wiring, langgraph.json, or prompt change
Sandbox — docker/ or sandboxed execution
Skills — change under skills/ (parser-side diagnostic improvement)
Dependencies — new/upgraded entry in backend/pyproject.toml or frontend/package.json (say what it buys us)
Default behavior change — changes existing behavior without the user opting in (default model, default setting, data shape)
Docs / tests / CI only — no runtime behavior change

Screenshots / Recording

N/A — backend-only change, no UI surface.

Bug fix verification

Test that reproduces the bug: backend/tests/test_skills_parser.py::test_parse_unquoted_colon_value_logs_line_and_hint
Red on main / green on this branch? Yes — the test asserts the log contains both the offending source line and a description: "..." quoting hint; before this PR neither was present, so the test fails on main and passes here.
Companion tests pin the boundaries:
- test_parse_unrelated_yaml_error_omits_quoting_hint — the hint must NOT attach to unrelated YAML errors (no misleading suggestions).
- test_parse_valid_skill_emits_no_error_log — a valid SKILL.md must still produce zero error logs.

Validation

Formatting (CI gate):

cd backend && make format
# ruff: Found 1 error (1 fixed, 0 remaining). 2 files reformatted.

Targeted unit tests (in the Docker gateway container, equivalent to make test):

docker exec -w /app/backend deer-flow-gateway sh -c \
  'PYTHONPATH=. PYTHONIOENCODING=utf-8 PYTHONUTF8=1 uv run pytest tests/test_skills_parser.py -v'
# 19 passed (15 pre-existing + 3 new)

Full backend test suite:

docker exec -w /app/backend deer-flow-gateway sh -c \
  'PYTHONPATH=. PYTHONIOENCODING=utf-8 PYTHONUTF8=1 uv run pytest tests/ -q --tb=short'
# 3763 collected, 0 failed

… errors When a SKILL.md front-matter fails to parse, the existing log only echoes PyYAML's raw message, leaving authors to grep the file for the offending line. This is especially painful for the very common LLM-authored mistake of an unquoted scalar containing ': ' (e.g. 'description: foo: bar'), which fails with 'mapping values are not allowed here' and silently drops the skill. Enrich the error log with: - the source line PyYAML pointed at via problem_mark - a targeted, copy-pasteable quoting hint when (and only when) the error is the well-known 'mapping values are not allowed' scanner error on an unquoted value The skill is still rejected (no semantics are guessed or rewritten); only the diagnostic is improved. Fixes bytedance#3333

fancyboi999 · 2026-06-01T09:17:20Z

Read through the diff and tried the repro. To be clear on framing: this isn't really a bug fix, it's making the error message friendlier, which is useful.

One thing in the description isn't quite accurate. PyYAML already prints the offending line itself — it comes with the error. Here's what you get on main with no changes:

mapping values are not allowed here
  in "<unicode string>", line 2, column 31:
    description: StarRun collector: progress, errors, tables out
                                  ^

So the source line was already there before this PR. That also means the test's assert "StarRun collector: progress" in combined passes on main too, so it isn't really exercising the new behavior — only the quoting-hint assertion is. The actually-new part is the hint, plus you print the full line where PyYAML cuts long ones off with .... Might be worth wording the description and test around that.

A few small things:

The line N you print is the line within the front-matter, not the file. The regex strips the leading ---, so description shows as line 2 when it's line 3 in the file. Not a big deal since you print the line content too, but the number is off by one.
The hint drops indentation — a nested author: foo: bar becomes author: "foo: bar", so pasting it back under metadata: loses the indent. Top-level keys are fine.
getattr(exc, "problem", "") is a tighter match than "mapping values are not allowed" in str(exc). Minor.
This doesn't cover the frontend part of the issue (point 4), so 建议: Skill 文档 YAML front-matter 解析失败（mapping values not allowed） #3333 isn't fully resolved by it.

Overall it's an improvement over dumping the raw PyYAML message. Mostly just fix the framing.

Per review on bytedance#3335: - Log the file line number (mark.line + 2) instead of the front-matter-internal line number, so authors land on the right row in their editor. - Use exc.problem == "mapping values are not allowed here" for a tighter match than substring-scanning str(exc). - Preserve the offending key's leading whitespace in the quoting hint so nested mappings stay nested when authors paste the fix back. - Rewrite the regression test to actually exercise the new behaviour: PyYAML's own message already echoes the offending line (and truncates it with "..."), so the old assertion passed on main. New assertions pin (a) the file-line number, (b) the full untruncated line, and (c) the copy-pasteable hint. - Add a guard test for nested-key indentation so the partition()/strip() shape cannot regress silently. Refs bytedance#3333, bytedance#3335

Copilot

Pull request overview

Improves SKILL.md YAML front-matter parse diagnostics so misquoted scalar values (notably unquoted : in description) no longer fail silently; the parser now logs the offending source line (file-line numbering) and conditionally adds a copy/paste quoting hint for the specific, common PyYAML scanner error.

Changes:

Add _format_yaml_error() to enrich YAML parse error logs with the offending source line and a targeted quoting hint.
Update parse_skill_file() to use the formatted error output instead of logging the raw PyYAML exception only.
Add regression tests to validate the new log output (line number, full line not truncated, hint boundaries, indentation preservation).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File	Description
backend/packages/harness/deerflow/skills/parser.py	Formats YAML parse errors to include offending line + optional quoting hint; updates error logging callsite.
backend/tests/test_skills_parser.py	Adds tests covering the new friendly YAML error reporting behavior and guardrails.

+            if value and value[0] not in {'"', "'", "|", ">", "[", "{"}:
+                escaped = value.replace('"', '\\"')
+                lines.append(f'  hint: values containing ":" must be quoted, e.g. {key}: "{escaped}"')


The hint emitted by _format_yaml_error previously escaped only double quotes, so values containing backslashes (e.g. Windows paths like C:\Temp or regex escapes like \d) produced a suggested scalar that was either invalid YAML or silently re-interpreted by PyYAML's double-quoted escape rules when pasted back. Escape order matters: backslashes first, then double quotes. Adds two regression tests covering Windows-path and regex-style backslashes. Address Copilot CR feedback on PR bytedance#3335.

Huixin615 · 2026-06-02T02:46:52Z

Hi @WillemJiang Thanks for the review! I've addressed Copilot's review comments, Could you please take another look when you have time, thanks a lot.

fancyboi999 mentioned this pull request Jun 1, 2026

建议: Skill 文档 YAML front-matter 解析失败（mapping values not allowed） #3333

Open

cvczxc approved these changes Jun 1, 2026

View reviewed changes

WillemJiang requested a review from Copilot June 2, 2026 02:03

Copilot started reviewing on behalf of WillemJiang June 2, 2026 02:04 View session

Copilot AI reviewed Jun 2, 2026

View reviewed changes

Comment thread backend/packages/harness/deerflow/skills/parser.py

Comment on lines +36 to +38

if value and value[0] not in {'"', "'", "|", ">", "[", "{"}:

escaped = value.replace('"', '\\"')

lines.append(f' hint: values containing ":" must be quoted, e.g. {key}: "{escaped}"')

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(skills): surface offending line and quoting hint on SKILL.md YAML…#3335

fix(skills): surface offending line and quoting hint on SKILL.md YAML…#3335
Huixin615 wants to merge 3 commits into
bytedance:mainfrom
Huixin615:feature/skill-yaml-frontmatter-friendly-error

Huixin615 commented Jun 1, 2026

Uh oh!

fancyboi999 commented Jun 1, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Huixin615 commented Jun 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

Huixin615 commented Jun 1, 2026

Why

What changed

Surface area

Screenshots / Recording

Bug fix verification

Validation

Uh oh!

fancyboi999 commented Jun 1, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Huixin615 commented Jun 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants