fix(route/nhentai): fix detail route image src extraction and bypass Cloudflare with Puppeteer by FlanChanXwO · Pull Request #22140 · DIYgod/RSSHub

FlanChanXwO · 2026-05-31T14:23:33Z

Involved Issue

Close # None

Example for the Proposed Route(s)

/nhentai/search/language:chinese+blue+archive/detail
/nhentai/index/parody/blue archive/detail

New RSS Route Checklist

New Route
- Follows Script Standard
Anti-bot or rate limit
- If yes, does your code reflect this sign?
Date and time
- Parsed
- Correct time zone
New package added
Puppeteer

Note

This PR focuses on fixing the detail mode for existing /nhentai routes.

- Fixed image parsing in detail mode: Updated getDetail() to correctly extract gallery images by supporting both data-src and src attributes, and improved high-quality image URL transformations.
- Enabled Puppeteer: Set requirePuppeteer: true for nhentai routes to bypass Cloudflare anti-bot protection.
- Refactored login flow: Replaced got-based login with Puppeteer to handle Cloudflare challenges.
- Extended cookie cache: Increased login cookie cache duration from 3 days to 30 days.
- Added maintainer: Added FlanChanXwO to the maintainer list.

Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR updates the nhentai routes to rely on a browser automation flow for login/torrent fetching (likely to handle Cloudflare), and updates route metadata accordingly.

Changes:

- Replace cookie acquisition and torrent download logic with a Puppeteer-driven approach (plus cookie refresh on expiry).
- Make nhentai routes declare Puppeteer as required and add a new maintainer.
- Improve image URL extraction robustness and adjust date parsing defaulting.
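
The cookie refresh on expiry mentioned in the list above can be pictured as a small time-to-live cache. The following is a minimal sketch, not the code in this PR: `createCookieCache`, the injected `now` clock, and the synchronous `login` callback are hypothetical names chosen for illustration (a real login would be asynchronous), and the 30-day duration is the value stated in the PR description.

```typescript
// 30 days in milliseconds; the duration is taken from the PR description.
const COOKIE_TTL_MS = 30 * 24 * 60 * 60 * 1000;

// Hypothetical helper: returns a getter that reuses the cached cookie until it
// expires, then calls login() again and caches the fresh value.
function createCookieCache(login: () => string, now: () => number = Date.now, ttlMs: number = COOKIE_TTL_MS) {
    let entry: { value: string; expiresAt: number } | undefined;

    return function getCookie(): string {
        const current = now();
        if (entry && entry.expiresAt > current) {
            return entry.value; // still valid, no new login needed
        }
        const value = login(); // expired or never fetched: log in again
        entry = { value, expiresAt: current + ttlMs };
        return value;
    };
}
```

Injecting the clock keeps the expiry logic testable without waiting for real time to pass.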

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| lib/routes/nhentai/util.tsx | Switches cookie + torrent retrieval to browser automation, refines image extraction/date parsing |
| lib/routes/nhentai/search.ts | Declares Puppeteer requirement and updates maintainers |
| lib/routes/nhentai/index.ts | Declares Puppeteer requirement and updates maintainers |


+    const { page, destroy } = await getPuppeteerPage(loginUrl, {
+        onBeforeLoad: async (page) => {
+            const allowedTypes = new Set(['document', 'script', 'xhr', 'fetch', 'stylesheet']);
+            await page.setRequestInterception(true);
+            page.on('request', (request) => {
+                allowedTypes.has(request.resourceType()) ? request.continue() : request.abort();
+            });
        },
-        followRedirect: false,
+        gotoConfig: { waitUntil: 'domcontentloaded' },
    });
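
The request filter in the excerpt above can be exercised without a browser by separating the decision from the Puppeteer calls. This is a sketch under assumptions: `RequestLike` is a hypothetical stand-in for the three `HTTPRequest` methods used here, and `shouldAllow` and `handleRequest` are illustrative names that do not appear in the PR.

```typescript
// Resource types let through, mirroring the set in the excerpt above.
const allowedTypes = new Set<string>(['document', 'script', 'xhr', 'fetch', 'stylesheet']);

// Hypothetical stand-in for the parts of Puppeteer's HTTPRequest used here.
interface RequestLike {
    resourceType(): string;
    continue(): Promise<void>;
    abort(): Promise<void>;
}

// Pure decision: images, fonts, media and similar types are blocked.
function shouldAllow(resourceType: string): boolean {
    return allowedTypes.has(resourceType);
}

// Handler with the same effect as the ternary in the excerpt, written as an if/else.
function handleRequest(request: RequestLike): void {
    if (shouldAllow(request.resourceType())) {
        void request.continue();
    } else {
        void request.abort();
    }
}
```

With a real page, `handleRequest` would be passed to `page.on('request', ...)` after `page.setRequestInterception(true)`, as in the excerpt.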


+        .map((ele) => {
+            const img = $(ele);
+            const src = img.attr('data-src') || img.attr('src');
+            return src ? new URL(src, baseUrl).href : null;
+        })
+        .filter((src) => src !== null)
+        .map((src) => src.replace(/(.+)(\d+)t\.(.+)/, (_, p1, p2, p3) => `${p1}${p2}.${p3}`))
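
To make the two steps in the excerpt above concrete, the sketch below resolves an image source against a base URL and then strips the trailing `t` thumbnail marker with the same regular expression. The function names and the `example.com` URLs are hypothetical and chosen only for illustration; the real gallery URL layout is not confirmed here.

```typescript
// Prefer the lazy-load attribute, fall back to src, and resolve against the page URL.
function resolveImageSrc(dataSrc: string | undefined, src: string | undefined, baseUrl: string): string | null {
    const raw = dataSrc || src;
    return raw ? new URL(raw, baseUrl).href : null;
}

// Same pattern as the excerpt: ".../12t.jpg" becomes ".../12.jpg".
function toFullSize(url: string): string {
    return url.replace(/(.+)(\d+)t\.(.+)/, (_, p1: string, p2: string, p3: string) => `${p1}${p2}.${p3}`);
}

// Example values (hypothetical host and path).
const resolved = resolveImageSrc('//img.example.com/galleries/123/12t.jpg', undefined, 'https://example.com');
const fullSize = resolved === null ? null : toFullSize(resolved);
```

A URL that does not end in a digit followed by `t.` is returned unchanged, so applying the transformation to an already full-size image is harmless.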


        ...simple,
        title: $('div#info > h2').text() || $('div#info > h1').text(),
-        pubDate: parseDate($('time').attr('datetime')),
+        pubDate: parseDate($('time').attr('datetime') || ''),
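
Falling back to an empty string keeps the types consistent, but an empty string generally does not parse to a valid date, and the test output later in this thread shows `Invalid Date` in `pubDate`. One possible alternative is to omit the date when the attribute is missing or unparseable. The sketch below uses the built-in `Date` rather than RSSHub's `parseDate`, and `parseDateOrUndefined` is a hypothetical name.

```typescript
// Return a Date only when the attribute exists and parses; otherwise undefined.
function parseDateOrUndefined(datetime: string | undefined): Date | undefined {
    if (!datetime) {
        return undefined; // attribute missing or empty
    }
    const parsed = new Date(datetime);
    return Number.isNaN(parsed.getTime()) ? undefined : parsed;
}
```

Whether an absent `pubDate` is preferable to an invalid one depends on how the feed renderer treats missing fields, which is not verified here.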


+        await new Promise((resolve) => setTimeout(resolve, 5000));
+
+        let currentUrl = page.url();
+        let title = await page.title();
+
+        let attempts = 0;
+        // eslint-disable-next-line no-await-in-loop
+        while ((title.includes('Just a moment') || currentUrl.includes('challenges.cloudflare')) && attempts < 10) {
+            // eslint-disable-next-line no-await-in-loop
+            await new Promise((resolve) => setTimeout(resolve, 3000));
+            currentUrl = page.url();
+            // eslint-disable-next-line no-await-in-loop
+            title = await page.title();
+            attempts++;
+        }
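
The polling loop above can also be written as a reusable helper with a bounded number of attempts. This is a sketch, not the PR's code: `PageLike` is a hypothetical stand-in for the two Puppeteer `Page` methods used, `isChallenge` and `waitForChallenge` are illustrative names, and unlike the excerpt it checks immediately instead of sleeping 5 seconds first. The defaults of 10 attempts and 3000 ms come from the excerpt.

```typescript
// Hypothetical stand-in for the Page methods used by the loop.
interface PageLike {
    url(): string;
    title(): Promise<string>;
}

const sleep = (ms: number): Promise<void> => new Promise((resolve) => setTimeout(resolve, ms));

// Same condition as the excerpt: interstitial title or challenge URL.
function isChallenge(title: string, url: string): boolean {
    return title.includes('Just a moment') || url.includes('challenges.cloudflare');
}

// Resolves to true once the challenge page is gone, false if it is still
// showing after maxAttempts checks.
async function waitForChallenge(page: PageLike, maxAttempts = 10, intervalMs = 3000): Promise<boolean> {
    for (let attempt = 0; attempt < maxAttempts; attempt++) {
        if (!isChallenge(await page.title(), page.url())) {
            return true;
        }
        await sleep(intervalMs);
    }
    return !isChallenge(await page.title(), page.url());
}
```

Returning a boolean lets the caller decide whether to throw a descriptive error when the challenge never clears.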


+        requirePuppeteer: true,
        antiCrawler: true,
        supportBT: true,


+        requirePuppeteer: true,
        antiCrawler: true,
        supportBT: true,


github-actions · 2026-05-31T14:28:30Z

Successfully generated as follows:

http://localhost:1200/nhentai/search/language:chinese+blue+archive/detail - Failed ❌

HTTPError: Response code 503 (Service Unavailable)

Error Message: Error: this route is empty, please check the original site or create an issue (https://github.com/DIYgod/RSSHub/issues/new/choose)
Route: /nhentai/search/:keyword/:mode?
Full Route: /nhentai/search/language:chinese+blue+archive/detail
Node Version: v24.16.0
Git Hash: f16b1db4

http://localhost:1200/nhentai/index/parody/blue archive/detail - Success ✔️

<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
  <channel>
    <title>nhentai - parody - blue archive</title>
    <link>https://nhentai.net/parody/blue-archive/</link>
    <atom:link href="http://localhost:1200/nhentai/index/parody/blue%20archive/detail" rel="self" type="application/rss+xml"></atom:link>
    <description>hentai - Powered by RSSHub</description>
    <generator>RSSHub</generator>
    <webMaster>contact@rsshub.app (RSSHub)</webMaster>
    <language>en</language>
    <lastBuildDate>Sun, 31 May 2026 14:28:29 GMT</lastBuildDate>
    <ttl>5</ttl>
    <item>
      <title></title>
      <description>&lt;h1&gt;0 pages&lt;/h1&gt;&lt;br&gt;</description>
      <link>https://nhentai.net/g/653594/</link>
      <guid isPermaLink="false">https://nhentai.net/g/653594/</guid>
      <pubDate>Invalid Date</pubDate>
    </item>
    <item>
      <title></title>
      <description>&lt;h1&gt;0 pages&lt;/h1&gt;&lt;br&gt;</description>
      <link>https://nhentai.net/g/653510/</link>
      <guid isPermaLink="false">https://nhentai.net/g/653510/</guid>
      <pubDate>Invalid Date</pubDate>
    </item>
    <item>
      <title></title>
      <description>&lt;h1&gt;0 pages&lt;/h1&gt;&lt;br&gt;</description>
      <link>https://nhentai.net/g/653493/</link>
      <guid isPermaLink="false">https://nhentai.net/g/653493/</guid>
      <pubDate>Invalid Date</pubDate>
    </item>
    <item>
      <title></title>
      <description>&lt;h1&gt;0 pages&lt;/h1&gt;&lt;br&gt;</description>
      <link>https://nhentai.net/g/653482/</link>
      <guid isPermaLink="false">https://nhentai.net/g/653482/</guid>
      <pubDate>Invalid Date</pubDate>
    </item>
    <item>
      <title></title>
      <description>&lt;h1&gt;0 pages&lt;/h1&gt;&lt;br&gt;</description>
      <link>https://nhentai.net/g/653455/</link>
      <guid isPermaLink="false">https://nhentai.net/g/653455/</guid>
      <pubDate>Invalid Date</pubDate>
    </item>
  </channel>
</rss>

github-actions · 2026-05-31T14:28:42Z

Auto Review

No clear rule violations found in the current diff.

FlanChanXwO and others added 8 commits April 26, 2026 15:42

feat(route/nhentai): enable Puppeteer, and update maintainers

46eaa5b

Update lib/routes/nhentai/util.tsx

78a3965

Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>

feat(route/nhentai): enable Puppeteer, and update maintainers (#4)

ed01364

refactor(config): remove unused NHentai access tokens

3437ed1

fix(route/nhentai): improve error handling for Cloudflare access denial

98cda7c

Merge branch 'develop' into route/nhentai

b5e7328

refactor(config): remove unused NHentai access tokens

63e4b1e

Merge branch 'master' into route/nhentai

42b9a5f

Copilot AI review requested due to automatic review settings May 31, 2026 14:23

github-actions Bot added the route label May 31, 2026

FlanChanXwO changed the title ~~Route/nhentai~~ fix(route/nhentai): fix detail route image src extraction and bypass Cloudflare with Puppeteer May 31, 2026

Copilot AI reviewed May 31, 2026


github-actions Bot added the "auto: not ready to review" label (Users can't get the RSS feed output according to automated testing results) May 31, 2026

FlanChanXwO wants to merge 8 commits into DIYgod:master from FlanChanXwO:route/nhentai

2 participants
