feat(cnki): add detail extraction and advanced search #1855
Open
spring-peach wants to merge 1 commit into
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds richer CNKI support by introducing a dedicated detail command and expanding the search command with advanced query options plus optional per-result detail extraction.
Changes:
- Added cnki/detail command and shared helpers for URL normalization, search URL building, and detail-page extraction.
- Reworked cnki/search to support advanced search expressions, field/date/type filters, paging, and optional abstract extraction.
- Expanded Vitest coverage for command registration and key validation/normalization behaviors; updated CLI manifest accordingly.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| clis/cnki/shared.js | New shared URL helpers and in-browser detail-page extractor; adds extractCnkiDetail. |
| clis/cnki/search.js | Major rewrite of CNKI search command: advanced options, pagination handling, optional detail scraping. |
| clis/cnki/detail.js | New CNKI “detail” CLI command built on shared extraction logic. |
| clis/cnki/search.test.js | Adds tests for new shared helpers and new detail command; expands search validations. |
| cli-manifest.json | Registers new cnki/detail command and updates cnki/search metadata/args/columns. |
Comment on lines +81 to +86

    const stopLabels = [
      '\\u6458\\u8981', '\\u5173\\u952e\\u8bcd', '\\u4e13\\u8f91',
      '\\u4e13\\u9898', '\\u5206\\u7c7b\\u53f7',
      '\\u5728\\u7ebf\\u516c\\u5f00\\u65f6\\u95f4', '\\u57fa\\u91d1',
      '\\u4f5c\\u8005', '\\u6765\\u6e90', 'DOI', 'CLC', 'Fund'
    ];
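The doubled backslashes in these labels only make sense if the list sits inside a string that is evaluated a second time. The summary table describes an in-browser extractor, which suggests this, but the excerpt does not confirm it. The sketch below assumes that setup and uses `new Function` as a stand-in for the browser-side evaluation; the variable names are illustrative.

```javascript
// Assumption: the label list is embedded in a template string that is
// evaluated later as a script, so each `\\u` collapses to `\u` first.
const script = `
  const stopLabels = ['\\u6458\\u8981', '\\u5173\\u952e\\u8bcd', 'DOI'];
  return stopLabels;
`;

// After one level of evaluation the escapes become real characters.
const evaluated = new Function(script)();

// In an ordinary string literal the same text stays a literal backslash
// sequence and would never match label text on the page.
const plain = ['\\u6458\\u8981'];

console.log(evaluated);       // labels as they would appear in the page text
console.log(plain[0].length); // 12: backslash, "u", four hex digits, twice
```

Decoded, the escaped labels read 摘要 (abstract), 关键词 (keywords), 专辑 (collection), 专题 (topic), 分类号 (classification number), 在线公开时间 (online publication time), 基金 (fund), 作者 (author), and 来源 (source).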
Comment on lines +3 to +10

    export function normalizeCnkiUrl(url) {
      const raw = String(url || '').trim();
      if (!raw) return '';
      if (/^https?:\/\//i.test(raw)) return raw;
      if (raw.startsWith('//')) return `https:${raw}`;
      if (raw.startsWith('/')) return `https://kns.cnki.net${raw}`;
      return `https://kns.cnki.net/${raw}`;
    }
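As quoted, the function returns any absolute `http(s)` URL unchanged and resolves protocol-relative input against whatever host it names, so non-CNKI addresses pass straight through. The sketch below repeats the function without `export` so it runs standalone, then adds a host check. `isCnkiUrl` and the sample paths are hypothetical and not part of the PR.

```javascript
// Copy of the quoted function, minus `export`, so the snippet is standalone.
function normalizeCnkiUrl(url) {
  const raw = String(url || '').trim();
  if (!raw) return '';
  if (/^https?:\/\//i.test(raw)) return raw;
  if (raw.startsWith('//')) return `https:${raw}`;
  if (raw.startsWith('/')) return `https://kns.cnki.net${raw}`;
  return `https://kns.cnki.net/${raw}`;
}

// Hypothetical guard (an assumption, not in the PR): accept only cnki.net hosts.
function isCnkiUrl(url) {
  try {
    const { protocol, hostname } = new URL(normalizeCnkiUrl(url));
    const httpLike = protocol === 'http:' || protocol === 'https:';
    return httpLike && (hostname === 'cnki.net' || hostname.endsWith('.cnki.net'));
  } catch {
    return false; // empty or unparseable input
  }
}

console.log(normalizeCnkiUrl('/example/path'));         // resolved against kns.cnki.net
console.log(normalizeCnkiUrl('https://example.com/x')); // returned unchanged
console.log(isCnkiUrl('https://example.com/x'));        // false
```

Whether a detail command should reject foreign hosts or simply pass them along is a design decision for the PR author; the guard only shows where such a check could sit.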
Comment on lines +5 to +9

    function parseLimit(value, fallback = 10) {
      const parsed = Number.parseInt(String(value ?? ''), 10);
      if (Number.isNaN(parsed)) return fallback;
      return Math.max(0, parsed);
    }
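Because `Number.parseInt` stops at the first non-digit, the quoted helper accepts input such as `12abc` as 12, truncates `3.9` to 3, and maps negative values to 0 rather than to the fallback. The sketch below shows those cases next to a stricter variant. `parseLimitStrict` and its `max = 50` cap are assumptions made for illustration, not values from the PR.

```javascript
// Copy of the quoted helper so the snippet runs standalone.
function parseLimit(value, fallback = 10) {
  const parsed = Number.parseInt(String(value ?? ''), 10);
  if (Number.isNaN(parsed)) return fallback;
  return Math.max(0, parsed);
}

// Hypothetical stricter variant: whole-number strings only, clamped to [1, max].
function parseLimitStrict(value, fallback = 10, max = 50) {
  const text = String(value ?? '').trim();
  if (!/^\d+$/.test(text)) return fallback; // rejects "12abc", "3.9", "-3", ""
  return Math.min(max, Math.max(1, Number(text)));
}

console.log(parseLimit('12abc'), parseLimitStrict('12abc')); // 12 10
console.log(parseLimit('-3'), parseLimitStrict('-3'));       // 0 10
console.log(parseLimit('999'), parseLimitStrict('999'));     // 999 50
```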
Comment on lines +314 to +316

    if (results.length === before) break;
    break;
    }
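If these three lines share one loop body, as the excerpt suggests, the unconditional `break` makes the conditional one redundant and the loop exits after its first pass whether or not new results arrived. The sketch below shows one way a paging loop could stop only when a page adds nothing new. `collectResults`, `fetchPage`, `maxPages`, and the `url` key are hypothetical names, not taken from the PR.

```javascript
// Hypothetical paging loop; fetchPage stands in for whatever loads one result page.
async function collectResults(fetchPage, limit, maxPages = 10) {
  const results = [];
  const seen = new Set();
  for (let page = 1; page <= maxPages && results.length < limit; page += 1) {
    const before = results.length;
    for (const item of await fetchPage(page)) {
      if (results.length >= limit) break; // enough rows collected
      if (seen.has(item.url)) continue;   // skip rows repeated across pages
      seen.add(item.url);
      results.push(item);
    }
    // Stop only when this page contributed nothing new.
    if (results.length === before) break;
  }
  return results;
}

// Usage with an in-memory stub instead of a real browser page.
const pages = { 1: [{ url: 'a' }, { url: 'b' }], 2: [{ url: 'b' }, { url: 'c' }] };
const stubFetch = async (page) => pages[page] || [];
collectResults(stubFetch, 10).then((rows) => console.log(rows.length)); // 3
```

The `maxPages` bound keeps the loop finite even if a site keeps returning fresh rows.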
Comment on lines +26 to +71

    function parseDocTypes(value) {
      const map = {
        all: '',
        journal: 'YSTT4HG0',
        journals: 'YSTT4HG0',
        dissertation: 'LSTPFY1C',
        dissertations: 'LSTPFY1C',
        thesis: 'LSTPFY1C',
        degree: 'LSTPFY1C',
        conference: 'JUP3MUPD',
        conferences: 'JUP3MUPD',
        newspaper: 'MPMFIG1A',
        newspapers: 'MPMFIG1A',
        book: 'EMRPGLPA',
        books: 'EMRPGLPA',
        standard: 'WQ0UVIAA',
        standards: 'WQ0UVIAA',
        achievement: 'BLZOG7CK',
        achievements: 'BLZOG7CK',
        patent: 'VUDIXAIY',
        patents: 'VUDIXAIY',
        yearbook: 'HHCPM1F8',
        yearbooks: 'HHCPM1F8',
        ccjd: 'PWFIRAGL',
        special: 'NN3FJMUV',
        video: 'NLBO1Z6R',
        videos: 'NLBO1Z6R',
        library: 'T2VC03OH',
        ystt4hg0: 'YSTT4HG0',
        lstpfy1c: 'LSTPFY1C',
        jup3mupd: 'JUP3MUPD',
        mpmfig1a: 'MPMFIG1A',
        emrpglpa: 'EMRPGLPA',
        wq0uviaa: 'WQ0UVIAA',
        blzog7ck: 'BLZOG7CK',
        vudixaiy: 'VUDIXAIY',
        hhcpm1f8: 'HHCPM1F8',
        pwfiragl: 'PWFIRAGL',
        nn3fjmuv: 'NN3FJMUV',
        nlbo1z6r: 'NLBO1Z6R',
        t2vc03oh: 'T2VC03OH',
      };
      const values = normalizeList(value);
      if (values.length === 0 || values.includes('all')) return '';
      return Array.from(new Set(values.map(item => map[item] || item.toUpperCase()))).join(',');
    }
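In the quoted function, `map[item] || item.toUpperCase()` forwards any unrecognized value uppercased, and a lookup on a plain object literal also resolves inherited keys such as `constructor`. The sketch below shows a table-driven alternative that lists each code once and rejects unknown input. It uses only three of the codes from the map above; `DOC_TYPES`, `parseDocTypesStrict`, and the `normalizeList` stand-in (the PR's own body is not shown) are hypothetical.

```javascript
// Canonical codes with their aliases, a subset of the map quoted above.
const DOC_TYPES = {
  YSTT4HG0: ['journal', 'journals'],
  LSTPFY1C: ['dissertation', 'dissertations', 'thesis', 'degree'],
  JUP3MUPD: ['conference', 'conferences'],
};

// Build alias -> code once; each code is also reachable by its lowercase form.
const ALIAS_TO_CODE = new Map();
for (const [code, aliases] of Object.entries(DOC_TYPES)) {
  ALIAS_TO_CODE.set(code.toLowerCase(), code);
  for (const alias of aliases) ALIAS_TO_CODE.set(alias, code);
}

// Hypothetical stand-in for the PR's normalizeList, whose body is not shown.
function normalizeList(value) {
  return String(value ?? '')
    .split(',')
    .map((part) => part.trim().toLowerCase())
    .filter(Boolean);
}

function parseDocTypesStrict(value) {
  const values = normalizeList(value);
  if (values.length === 0 || values.includes('all')) return '';
  const codes = values.map((item) => {
    const code = ALIAS_TO_CODE.get(item); // a Map has no inherited keys
    if (!code) throw new Error(`Unknown CNKI document type: ${item}`);
    return code;
  });
  return Array.from(new Set(codes)).join(',');
}

console.log(parseDocTypesStrict('journal, thesis')); // YSTT4HG0,LSTPFY1C
```

Passing unknown values through, as the quoted code does, keeps the CLI usable when CNKI adds new type codes; failing fast trades that flexibility for earlier error reporting.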