feat(xiaohongshu): output userId and profileUrl in comments #1871

Open
huanghe wants to merge 1 commit into jackwener:main from huanghe:pr/xhs-comment-userid

huanghe (Contributor) commented Jun 6, 2026

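Motivation

The Xiaohongshu comments command previously output only the author's display name (author). Display names are neither unique nor stable, so they cannot be used to deduplicate or to link back to a specific user. This PR makes every comment and nested reply additionally output a normalized userId and a clickable profileUrl.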

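Changes

  • Extract the author's profile link from each comment node (using several selectors such as a[href*="/user/profile/"]) to obtain authorHref.
  • Add buildXhsProfileUrl(href): it reuses the existing normalizeXhsUserId (clis/xiaohongshu/user-helpers.js) to parse the userId and builds https://www.xiaohongshu.com/user/profile/<userId>; when nothing can be parsed it returns an empty string (it does not throw).
  • Both comments and nested replies carry userId / profileUrl, and both fields are added to the output columns.
  • cli-manifest.json was regenerated with npm run build-manifest (only these two columns were added).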
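
The extraction and URL-building steps listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the body of normalizeXhsUserId is a guessed stand-in for the real helper in clis/xiaohongshu/user-helpers.js, extractAuthorHref and AUTHOR_LINK_SELECTORS are hypothetical names, and only the first selector comes from the description (the second is a placeholder).

```javascript
// Hypothetical stand-in for normalizeXhsUserId from
// clis/xiaohongshu/user-helpers.js; the real helper may behave differently.
function normalizeXhsUserId(input) {
  const value = String(input ?? '').trim();
  if (!value) return '';
  // Pull the id out of a /user/profile/<id> path, ignoring query and hash.
  const match = value.match(/\/user\/profile\/([A-Za-z0-9]+)/);
  if (match) return match[1];
  // Otherwise accept a bare alphanumeric id as-is.
  return /^[A-Za-z0-9]+$/.test(value) ? value : '';
}

// Build the canonical profile URL, or return '' when no userId resolves.
function buildXhsProfileUrl(href) {
  const userId = normalizeXhsUserId(href);
  return userId ? `https://www.xiaohongshu.com/user/profile/${userId}` : '';
}

// Only the first selector appears in the PR description; the second is a
// made-up placeholder for the "several selectors" it mentions.
const AUTHOR_LINK_SELECTORS = ['a[href*="/user/profile/"]', '.author a[href]'];

// Try each selector in order and return the first non-empty href, else ''.
function extractAuthorHref(node) {
  for (const selector of AUTHOR_LINK_SELECTORS) {
    const href = node.querySelector(selector)?.getAttribute('href');
    if (href) return href;
  }
  return '';
}

// Minimal fake comment node so the sketch runs outside a browser.
const fakeNode = {
  querySelector: (selector) =>
    selector === AUTHOR_LINK_SELECTORS[0]
      ? { getAttribute: () => '/user/profile/5f0c1234abcd?xsec_token=abc' }
      : null,
};

console.log(buildXhsProfileUrl(extractAuthorHref(fakeNode)));
console.log(JSON.stringify(buildXhsProfileUrl('')));
```

Returning an empty string rather than throwing keeps a single malformed or missing link from failing the whole comments run, which matches the fallback behavior described above.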

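Compatibility

Purely additive: only new fields/columns are added, and the semantics of existing fields such as author/text/likes are unchanged. When no profile link can be parsed, both new fields are empty strings and existing behavior is unaffected.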

Tests

clis/xiaohongshu/comments.test.js: 19 test cases pass (covering href extraction, userId normalization, profileUrl construction, and the fallback when the href is missing).

huanghe force-pushed the pr/xhs-comment-userid branch from 32b16b2 to dedac26 on June 6, 2026 08:08
huanghe (Contributor, Author) commented Jun 6, 2026

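Fixed the CI check:silent-column-drop gate. The intermediate field carried through the extraction stage, originally named authorHref, was flagged by the static audit as a "silent drop not declared as a column". Following the repository's existing convention, it is now named authorHrefRaw (Raw suffix), and enrich references it in arrow-return form in the userId/profileUrl columns, so the audit recognizes it as transformed-intermediate and no longer reports a false positive. authorHrefRaw remains a transport-only intermediate field and does not enter the final rows (the tests assert that it does not leak).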

Local: check:silent-column-drop → new=0 OK; comments.test.js 19 passed; tsc --noEmit passes.
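
For readers unfamiliar with the pattern, the shape described in this comment might look roughly like the sketch below. It assumes nothing about how the audit actually detects transformed-intermediate fields; userIdFromHref and profileUrlFromHref are hypothetical stand-ins for normalizeXhsUserId and buildXhsProfileUrl, and the row fields are illustrative.

```javascript
// Hypothetical stand-ins for the repo's normalizeXhsUserId and
// buildXhsProfileUrl; the real helpers may behave differently.
const userIdFromHref = (href) =>
  (String(href ?? '').match(/\/user\/profile\/([A-Za-z0-9]+)/) ?? [])[1] ?? '';

const profileUrlFromHref = (href) => {
  const userId = userIdFromHref(href);
  return userId ? `https://www.xiaohongshu.com/user/profile/${userId}` : '';
};

// Arrow function that returns the final row directly. authorHrefRaw is read
// to derive the two new columns but is never copied into the row itself.
const enrich = (raw) => ({
  author: raw.author,
  text: raw.text,
  likes: raw.likes,
  userId: userIdFromHref(raw.authorHrefRaw),
  profileUrl: profileUrlFromHref(raw.authorHrefRaw),
});

const row = enrich({
  author: 'Alice',
  text: 'Nice post',
  likes: 3,
  authorHrefRaw: '/user/profile/abc123?tab=note',
});

console.log(row);
console.log('authorHrefRaw' in row); // false
```

Because the returned object lists its keys explicitly, the intermediate field cannot reach the output by accident, which is the property the leak assertion in the tests is described as checking.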

Comment rows only exposed the author display name, which is ambiguous (names
are not unique and change). Extract the author's profile href from each comment
/ reply node and emit a normalized `userId` plus a canonical `profileUrl`, so
callers can deduplicate and link back to commenters.

Adds buildXhsProfileUrl() (reuses normalizeXhsUserId from user-helpers) and two
columns to the comments adapter; manifest regenerated.
huanghe force-pushed the pr/xhs-comment-userid branch from dedac26 to fe9faa2 on June 6, 2026 09:36