风险分析层 by asJEI · Pull Request #4 · asJEI/PR-Review

asJEI · 2026-05-29T04:13:45Z

新增可插拔的 RiskDetector 框架，从语义化 diff 数据生成高置信度工程风险提示。
包含 6 类规则检测器（auth、数据库、缓存、async、错误处理、并发）及置信度过滤机制。
可以通过：
npm run test
进行快速测试
通过：
cd "d:\CS Tool\Project\Github\PR-Review"
（首先进入项目根目录）
node scripts/build.mjs
（使用 Node.js 执行 scripts/build.mjs 文件）
node packages/context-builder/scripts/smoke.mjs "zhayujie/CowAgent#2845"
（执行冒烟测试，可替换为其他 PR 的 URL）

gemini-code-assist

Code Review

This pull request introduces a rule-based engineering risk analysis system within packages/diff-parser/src/risk/ to detect risks related to authentication, database operations, caching, async changes, error handling, and concurrency. This analysis is integrated into the context builder pipeline to enhance PR summaries with risk hints. The review feedback highlights critical improvements for the detectors to prevent false positives and false negatives, specifically: using word boundaries in pathContainsAny and ErrorHandlingDetector to avoid substring mismatches (e.g., "db" matching "dashboard" or "try" matching "history"), fixing a logical flaw in the error handling re-addition check, and expanding the async detection regex to support async arrow functions.

gemini-code-assist · 2026-05-29T04:15:05Z

+export function pathContainsAny(path: string, keywords: readonly string[]): string | null {
+  const lower = path.toLowerCase();
+
+  for (const keyword of keywords) {
+    if (lower.includes(keyword)) {
+      return keyword;
+    }
+  }
+
+  return null;
+}


在 pathContainsAny 中，使用 lower.includes(keyword) 进行简单的子字符串匹配会导致严重的误报。
例如：

当 DATABASE_PATH_KEYWORDS 包含 "db" 时，任何包含 "dashboard"、"sandbox"、"feedback" 或 "development" 的路径（如 src/dashboard/index.tsx）都会被误判为数据库相关路径。

当 AUTH_PATH_KEYWORDS 包含 "auth" 时，任何包含 "author" 的路径（如 src/author/service.ts）都会被误判为鉴权相关路径。

建议使用正则表达式配合单词边界 (\b) 进行匹配，确保只匹配独立的路径段或单词，避免子字符串引起的误报。

Suggested change

export function pathContainsAny(path: string, keywords: readonly string[]): string | null {

const lower = path.toLowerCase();

for (const keyword of keywords) {

if (lower.includes(keyword)) {

return keyword;

}

}

return null;

}

export function pathContainsAny(path: string, keywords: readonly string[]): string | null {

const lower = path.toLowerCase();

for (const keyword of keywords) {

const regex = new RegExp('\\\\b' + keyword + '\\\\b', 'i');

if (regex.test(lower)) {

return keyword;

}

}

return null;

}

gemini-code-assist · 2026-05-29T04:15:05Z

+      for (const line of deleted) {
+        const keyword = textContainsAny(line.content, ERROR_HANDLING_KEYWORDS);
+
+        if (!keyword) {
+          continue;
+        }
+
+        const reAdded = added.some((addLine: DiffLine) =>
+          textContainsAny(addLine.content, [keyword]),
+        );
+
+        if (!reAdded) {
+          evidence.push(`removed ${keyword}: ${line.content.trim()}`);
+        }
+      }


在 ErrorHandlingDetector 中，使用 textContainsAny 匹配 ERROR_HANDLING_KEYWORDS（如 "try", "catch", "throw", "finally"）存在两个严重问题：

子字符串误报：includes 匹配会导致非关键字的单词被误判。例如，删除包含 "history"、"country"、"poetry"、"retry" 的行时，因为它们包含 "try"，会被误判为“删除了错误处理逻辑”。

同 Hunk 内误判：如果同一个 hunk 内删除了一个 try-catch，但在不相关的地方新增了另一个 try-catch，reAdded 会因为 added.some 匹配到新增的关键字而返回 true，从而导致原本被删除的错误处理逻辑被漏报。

建议使用正则表达式配合单词边界 (\b) 来精确匹配错误处理关键字，并优化重添加的判断逻辑。

Suggested change

for (const line of deleted) {

const keyword = textContainsAny(line.content, ERROR_HANDLING_KEYWORDS);

if (!keyword) {

continue;

}

const reAdded = added.some((addLine: DiffLine) =>

textContainsAny(addLine.content, [keyword]),

);

if (!reAdded) {

evidence.push(`removed ${keyword}: ${line.content.trim()}`);

}

}

for (const line of deleted) {

const keywordRegex = /\\b(try|catch|throw|finally)\\b/i;

const match = keywordRegex.exec(line.content);

if (!match) {

continue;

}

const keyword = match[1].toLowerCase();

const reAdded = added.some((addLine: DiffLine) => {

const addMatch = keywordRegex.exec(addLine.content);

return addMatch && addMatch[1].toLowerCase() === keyword;

});

if (!reAdded) {

evidence.push(`removed ${keyword}: ${line.content.trim()}`);

}

}

gemini-code-assist · 2026-05-29T04:15:05Z

+        (/\basync\s+function\b/.test(line.content) ||
+          /\basync\s+def\b/.test(line.content) ||
+          /\basync\s+\w+\s*\(/.test(line.content)),


在检测新增的异步行为时，当前的正则表达式无法匹配到非常常见的 异步箭头函数（例如 async () => {} 或 async x => {}），因为 \basync\s+\w+\s*\( 要求 async 后面必须紧跟一个单词字符和括号。

建议优化正则表达式，以完整支持 JavaScript/TypeScript 中的异步箭头函数、类异步方法以及 Python 中的 async def。

Suggested change

(/\basync\s+function\b/.test(line.content) ||

/\basync\s+def\b/.test(line.content) ||

/\basync\s+\w+\s*\(/.test(line.content)),

(/\basync\s+(?:function|def)\b/.test(line.content) ||

/\basync\s*(?:\w+\s*)?\(/.test(line.content) ||

/\basync\s+\w+\s*=>/.test(line.content)),

风险分析层

7a8e1a0

gemini-code-assist Bot reviewed May 29, 2026

View reviewed changes

asJEI merged commit 7a8e1a0 into main May 29, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

风险分析层#4

Hidden character warning

风险分析层#4
asJEI merged 1 commit into
mainfrom
添加基于规则的工程风险分析层

asJEI commented May 29, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot May 29, 2026

Uh oh!

gemini-code-assist Bot May 29, 2026

Uh oh!

gemini-code-assist Bot May 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

asJEI commented May 29, 2026

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 29, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 29, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 29, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant