
Optimize GitHub UserNode #5041

Merged
arkid15r merged 5 commits into OWASP:feature/graphql-dataloaders from ahmedxgouda:dataloaders/github-user
Jun 29, 2026
Conversation

@ahmedxgouda

@ahmedxgouda ahmedxgouda commented Jun 24, 2026

Collaborator

Proposed change

Resolves #4600

  • Added dataloaders for releases_count, issues_count, and badges
  • Optimized the field author on the IssueNode
GitHub.UserNode.mp4

Checklist

  • Required: I followed the contributing workflow
  • Required: I verified that my code works as intended and resolves the issue as described
  • Required: I ran make check-test locally: all warnings addressed, tests passed
  • I used AI for code, documentation, tests, or communication related to this PR

@ahmedxgouda ahmedxgouda added the gsoc2026:ahmedxgouda ahmedxgouda's GSoC 2026 related work label Jun 24, 2026
@coderabbitai

coderabbitai Bot commented Jun 24, 2026

Contributor

Review Change Stack

Walkthrough

Adds user dataloaders, registers them in the GitHub dataloader map, changes UserNode resolvers to use request-scoped loading, and adjusts IssueNode relation loading configuration. Corresponding unit tests are updated and expanded.

Changes

GitHub user dataloaders and fields

| Layer | File(s) | Summary |
| --- | --- | --- |
| User loader factory | backend/apps/github/api/internal/dataloaders/__init__.py, backend/apps/github/api/internal/dataloaders/user.py, backend/tests/unit/apps/github/api/internal/dataloaders/user_test.py | Adds user loader constants and batch functions, merges the user loader map into get_github_dataloaders(), and tests badge grouping, count defaults, and loader construction. |
| UserNode dataloader resolvers | backend/apps/github/api/internal/nodes/user.py, backend/tests/unit/apps/github/api/internal/nodes/user_test.py | Changes badges, issues_count, and releases_count to async resolver paths backed by github_dataloaders, with tests updated to mock the loader calls. |
| IssueNode field loading | backend/apps/github/api/internal/nodes/issue.py | Updates author and organization-related field loading to use the revised select_related configuration. |
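
As a rough illustration of the batch-function contract that loaders like these rely on, the sketch below uses in-memory lists instead of the Django ORM. The names `ISSUE_AUTHOR_IDS`, `BADGES`, `load_issue_counts`, and `load_badges` are hypothetical and are not taken from this PR. The only point is that a batch function returns one result per key, in key order, with a default (0 or an empty list) for keys that have no rows.

```python
import asyncio
from collections import defaultdict

# Hypothetical in-memory rows standing in for database tables (assumption).
ISSUE_AUTHOR_IDS = [1, 1, 2]  # one entry per issue; the value is the author id
BADGES = [(1, "Contributor"), (1, "Mentor"), (3, "Reviewer")]  # (user id, badge)


async def load_issue_counts(user_ids: list[int]) -> list[int]:
    """Return one count per requested user id, in key order, defaulting to 0."""
    wanted = set(user_ids)
    counts: dict[int, int] = defaultdict(int)
    for author_id in ISSUE_AUTHOR_IDS:
        if author_id in wanted:
            counts[author_id] += 1
    return [counts.get(user_id, 0) for user_id in user_ids]


async def load_badges(user_ids: list[int]) -> list[list[str]]:
    """Group badge names by user id, returning an empty list for users without badges."""
    wanted = set(user_ids)
    grouped: dict[int, list[str]] = defaultdict(list)
    for user_id, name in BADGES:
        if user_id in wanted:
            grouped[user_id].append(name)
    return [grouped.get(user_id, []) for user_id in user_ids]


# All keys collected for one request are resolved in a single batch call.
counts = asyncio.run(load_issue_counts([3, 1, 2]))
badges = asyncio.run(load_badges([2, 1]))
```

In a real resolver the batch function would presumably issue one grouped query for all collected keys, which is what replaces the per-row lookups.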

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • OWASP/Nest#4609: Also extends get_github_dataloaders() with another request-scoped GitHub loader mapping.
  • OWASP/Nest#5022: Modifies the same GitHub dataloader factory and loader registration path.

Suggested reviewers

  • kasya
  • arkid15r
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Out of Scope Changes check | ⚠️ Warning | IssueNode select_related changes are unrelated to the UserNode optimization goals and appear out of scope. | Remove the IssueNode field configuration changes unless they are required by #4600 and document that dependency explicitly. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly summarizes the main change: optimizing GitHub UserNode. |
| Linked Issues check | ✅ Passed | The PR adds the user-badge loader and count loaders used for issues_count and releases_count, matching #4600. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 99.03%, which is sufficient. The required threshold is 80.00%. |
| Description check | ✅ Passed | The description matches the changeset by mentioning user dataloaders and the IssueNode author optimization. |

@codecov

codecov Bot commented Jun 24, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.74%. Comparing base (e88e334) to head (07a64c3).
⚠️ Report is 1 commit behind head on feature/graphql-dataloaders.

Additional details and impacted files


@@                     Coverage Diff                      @@
##           feature/graphql-dataloaders    #5041   +/-   ##
============================================================
  Coverage                        98.74%   98.74%           
============================================================
  Files                              543      544    +1     
  Lines                            17138    17160   +22     
  Branches                          2425     2425           
============================================================
+ Hits                             16923    16945   +22     
  Misses                             123      123           
  Partials                            92       92           
Flag Coverage Δ
backend 99.45% <100.00%> (+<0.01%) ⬆️
frontend 96.71% <ø> (ø)


Files with missing lines Coverage Δ
...ckend/apps/github/api/internal/dataloaders/user.py 100.00% <100.00%> (ø)
backend/apps/github/api/internal/nodes/issue.py 100.00% <100.00%> (ø)
backend/apps/github/api/internal/nodes/user.py 100.00% <100.00%> (ø)


Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 71cbe7b...07a64c3. Read the comment docs.


@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backend/apps/github/api/internal/nodes/user.py (1)

85-102: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Add distinct=True to both count annotations.

Line 85 and Line 99 can return inflated values when a query asks for issues_count and releases_count together. Django will join both reverse relations on the same User queryset, so the counts multiply each other (for example, 2 issues × 3 releases becomes 6/6).

Suggested fix
-    @strawberry_django.field(annotate={"issues_count": Count("created_issues")})
+    @strawberry_django.field(
+        annotate={"issues_count": Count("created_issues", distinct=True)}
+    )
     def issues_count(self, root: User) -> int:
         """Resolve issues count."""
         return root.issues_count
@@
-    @strawberry_django.field(annotate={"releases_count": Count("created_releases")})
+    @strawberry_django.field(
+        annotate={"releases_count": Count("created_releases", distinct=True)}
+    )
     def releases_count(self, root: User) -> int:
         """Resolve releases count."""
         return root.releases_count
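
The row multiplication described above can be reproduced outside Django. The following is a minimal sketch using the standard-library `sqlite3` module; the table and column names (`users`, `issues`, `releases`, `author_id`) are made up for the demonstration and are not the project's schema. It joins two reverse relations onto one user and compares a plain `COUNT` with `COUNT(DISTINCT ...)`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE issues (id INTEGER PRIMARY KEY, author_id INTEGER);
    CREATE TABLE releases (id INTEGER PRIMARY KEY, author_id INTEGER);
    INSERT INTO users (id) VALUES (1);
    INSERT INTO issues (author_id) VALUES (1), (1);        -- 2 issues
    INSERT INTO releases (author_id) VALUES (1), (1), (1); -- 3 releases
    """
)

# Both reverse relations are joined onto the same user row, which yields
# 2 x 3 = 6 joined rows for user 1.
QUERY = """
SELECT {issues}, {releases}
FROM users u
LEFT JOIN issues i ON i.author_id = u.id
LEFT JOIN releases r ON r.author_id = u.id
GROUP BY u.id
"""

plain = conn.execute(
    QUERY.format(issues="COUNT(i.id)", releases="COUNT(r.id)")
).fetchone()
distinct = conn.execute(
    QUERY.format(issues="COUNT(DISTINCT i.id)", releases="COUNT(DISTINCT r.id)")
).fetchone()

print(plain)     # (6, 6): each count sees every joined row
print(distinct)  # (2, 3): duplicates from the other join are collapsed
```

Loading each count through its own batch query, as the dataloader approach in this PR does, also avoids the shared join.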
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/apps/github/api/internal/nodes/user.py` around lines 85 - 102, Add
distinct counting to the annotated fields in the User node so `issues_count` and
`releases_count` do not multiply when both are requested together. Update the
`strawberry_django.field` annotations on `issues_count` and `releases_count` in
`User` to use `Count(..., distinct=True)` for the `created_issues` and
`created_releases` relations, keeping the existing resolver methods unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/apps/github/api/internal/dataloaders/release.py`:
- Around line 12-20: The release URL mapping in the dataloader duplicates the
URL construction logic instead of reusing the existing Release.url property.
Update the async loop in the release dataloader to use release.url with the same
empty-string fallback, and keep the select_related("repository__owner") prefetch
so Repository.url can still resolve owner.login without extra queries.

---

Outside diff comments:
In `@backend/apps/github/api/internal/nodes/user.py`:
- Around line 85-102: Add distinct counting to the annotated fields in the User
node so `issues_count` and `releases_count` do not multiply when both are
requested together. Update the `strawberry_django.field` annotations on
`issues_count` and `releases_count` in `User` to use `Count(..., distinct=True)`
for the `created_issues` and `created_releases` relations, keeping the existing
resolver methods unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: e5c217fb-f0df-4160-85f9-113a4ba8a338

📥 Commits

Reviewing files that changed from the base of the PR and between 7e5ae19 and 5df7421.

📒 Files selected for processing (18)
  • backend/apps/common/api/internal/dataloaders/utils.py
  • backend/apps/github/api/internal/dataloaders/__init__.py
  • backend/apps/github/api/internal/dataloaders/release.py
  • backend/apps/github/api/internal/dataloaders/repository.py
  • backend/apps/github/api/internal/dataloaders/user.py
  • backend/apps/github/api/internal/nodes/release.py
  • backend/apps/github/api/internal/nodes/user.py
  • backend/apps/mentorship/api/internal/dataloaders/__init__.py
  • backend/apps/mentorship/api/internal/dataloaders/interested_users.py
  • backend/settings/graphql_context.py
  • backend/tests/unit/apps/common/api/internal/dataloaders/utils_test.py
  • backend/tests/unit/apps/github/api/internal/dataloaders/__init__.py
  • backend/tests/unit/apps/github/api/internal/dataloaders/release_test.py
  • backend/tests/unit/apps/github/api/internal/dataloaders/repository_test.py
  • backend/tests/unit/apps/github/api/internal/dataloaders/user_test.py
  • backend/tests/unit/apps/github/api/internal/nodes/release_test.py
  • backend/tests/unit/apps/github/api/internal/nodes/user_test.py
  • backend/tests/unit/apps/mentorship/api/internal/dataloaders/interested_users_test.py

Comment thread backend/apps/github/api/internal/dataloaders/release.py

@cubic-dev-ai cubic-dev-ai Bot left a comment

Contributor

3 issues found across 18 files

Confidence score: 4/5

  • In backend/apps/github/api/internal/dataloaders/release.py, using "" as a missing-value sentinel instead of None is inconsistent with sibling loaders and can cause subtle downstream truthiness/typing bugs in API consumers if merged as-is — align missing release/repository values to None (or document and enforce a single sentinel) before merging.
  • backend/tests/unit/apps/github/api/internal/dataloaders/repository_test.py does not cover the OWASP_ORGANIZATION_NAME prefix-removal mapping path, so a regression there could ship unnoticed and return incorrect project names to clients — add a focused unit test for that mapping behavior before merge.
  • In backend/apps/github/api/internal/dataloaders/release.py, rebuilding the release URL inline duplicates Release.url, which raises drift risk if URL rules change later and can lead to inconsistent links — switch to Release.url to keep URL generation single-sourced.
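
To make the sentinel point concrete, here is a minimal sketch of a mapping step that uses `None` for missing values. The `Release` dataclass, its `repository_url` field, and `map_release_urls` are simplified stand-ins invented for this example, not the project's models or loader.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Release:
    """Simplified stand-in for the real model (assumption)."""

    pk: int
    tag_name: str
    repository_url: str | None  # None models a release without a repository


def map_release_urls(
    release_ids: list[int], releases: list[Release]
) -> list[str | None]:
    """Return one URL per requested id, in key order, with None when unavailable."""
    mapping: dict[int, str | None] = {}
    for release in releases:
        mapping[release.pk] = (
            f"{release.repository_url}/releases/tag/{release.tag_name}"
            if release.repository_url
            else None
        )
    # dict.get() also yields None for ids with no matching release row,
    # so both "missing" cases share a single sentinel.
    return [mapping.get(release_id) for release_id in release_ids]


releases = [
    Release(pk=1, tag_name="v1.0", repository_url="https://github.com/example/repo"),
    Release(pk=2, tag_name="v2.0", repository_url=None),
]
urls = map_release_urls([2, 1, 3], releases)
```

With `None`, the return type `list[str | None]` states explicitly that a URL may be absent, whereas an empty string passes as a valid `str` and has to be caught by a truthiness check.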
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="backend/apps/github/api/internal/dataloaders/release.py">

<violation number="1" location="backend/apps/github/api/internal/dataloaders/release.py:17">
P3: The inline URL construction `f"{release.repository.url}/releases/tag/{release.tag_name}"` duplicates the logic already defined in `Release.url`. Since `select_related("repository__owner")` is already applied (which `Repository.url` needs to resolve `owner.login`), you can simplify this to `release.url` with the empty-string fallback. This keeps the URL format defined in one place.</violation>

<violation number="2" location="backend/apps/github/api/internal/dataloaders/release.py:19">
P2: DataLoader uses empty string sentinel for missing releases/repositories instead of None, inconsistent with sibling fields and codebase patterns</violation>
</file>

<file name="backend/tests/unit/apps/github/api/internal/dataloaders/repository_test.py">

<violation number="1" location="backend/tests/unit/apps/github/api/internal/dataloaders/repository_test.py:140">
P2: Tests do not cover the OWASP_ORGANIZATION_NAME prefix-removal behavior in project name mapping.</violation>
</file>


mapping[release.pk] = (
f"{release.repository.url}/releases/tag/{release.tag_name}"
if release.repository
else ""

Contributor

P2: DataLoader uses empty string sentinel for missing releases/repositories instead of None, inconsistent with sibling fields and codebase patterns

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend/apps/github/api/internal/dataloaders/release.py, line 19:

<comment>DataLoader uses empty string sentinel for missing releases/repositories instead of None, inconsistent with sibling fields and codebase patterns</comment>

<file context>
@@ -0,0 +1,31 @@
+        mapping[release.pk] = (
+            f"{release.repository.url}/releases/tag/{release.tag_name}"
+            if release.repository
+            else ""
+        )
+
</file context>

"""Returns project names in order of release_ids."""
release_ids = [1, 2, 3]
project_1 = MagicMock()
project_1.name = "Project Alpha"

Contributor

P2: Tests do not cover the OWASP_ORGANIZATION_NAME prefix-removal behavior in project name mapping.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend/tests/unit/apps/github/api/internal/dataloaders/repository_test.py, line 140:

<comment>Tests do not cover the OWASP_ORGANIZATION_NAME prefix-removal behavior in project name mapping.</comment>

<file context>
@@ -0,0 +1,249 @@
+        """Returns project names in order of release_ids."""
+        release_ids = [1, 2, 3]
+        project_1 = MagicMock()
+        project_1.name = "Project Alpha"
+        project_2 = MagicMock()
+        project_2.name = "Project Beta"
</file context>

mapping: dict[int, str] = {}
async for release in releases:
mapping[release.pk] = (
f"{release.repository.url}/releases/tag/{release.tag_name}"

Contributor

P3: The inline URL construction f"{release.repository.url}/releases/tag/{release.tag_name}" duplicates the logic already defined in Release.url. Since select_related("repository__owner") is already applied (which Repository.url needs to resolve owner.login), you can simplify this to release.url with the empty-string fallback. This keeps the URL format defined in one place.
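
A small sketch of what the suggested simplification might look like, assuming `Release.url` builds the same string as the inline f-string. The dataclasses below are simplified stand-ins for the real models, and `map_release_urls` is a hypothetical name for the mapping step.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Repository:
    """Simplified stand-in; the real Repository.url is derived from owner.login."""

    url: str


@dataclass
class Release:
    """Simplified stand-in for the real model (assumption)."""

    pk: int
    tag_name: str
    repository: Repository | None

    @property
    def url(self) -> str:
        # The URL format is defined once, here.
        return f"{self.repository.url}/releases/tag/{self.tag_name}"


def map_release_urls(releases: list[Release]) -> dict[int, str]:
    """Reuse Release.url and keep the empty-string fallback for a missing repository."""
    return {
        release.pk: (release.url if release.repository else "")
        for release in releases
    }


mapping = map_release_urls(
    [
        Release(1, "v1.0", Repository("https://github.com/example/repo")),
        Release(2, "v2.0", None),
    ]
)
```

If the URL rules change later, only the property needs to be updated.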

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend/apps/github/api/internal/dataloaders/release.py, line 17:

<comment>The inline URL construction `f"{release.repository.url}/releases/tag/{release.tag_name}"` duplicates the logic already defined in `Release.url`. Since `select_related("repository__owner")` is already applied (which `Repository.url` needs to resolve `owner.login`), you can simplify this to `release.url` with the empty-string fallback. This keeps the URL format defined in one place.</comment>

<file context>
@@ -0,0 +1,31 @@
+    mapping: dict[int, str] = {}
+    async for release in releases:
+        mapping[release.pk] = (
+            f"{release.repository.url}/releases/tag/{release.tag_name}"
+            if release.repository
+            else ""
</file context>

coderabbitai[bot]
coderabbitai Bot previously approved these changes Jun 24, 2026

@coderabbitai coderabbitai Bot left a comment

Contributor

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backend/apps/github/api/internal/nodes/user.py (1)

85-102: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Use distinct=True on both Count(...) annotations.

issues_count and releases_count are reverse FK counts, so selecting them together can multiply rows through the shared join and inflate both totals. distinct=True avoids the overcount.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/apps/github/api/internal/nodes/user.py` around lines 85 - 102, Add
distinct counting to both reverse relation annotations in the User resolver:
update the issues_count and releases_count fields in the User class so their
Count("created_issues") and Count("created_releases") annotations use
distinct=True. This fixes the row multiplication when both annotated counts are
selected together, while keeping the existing resolver methods unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@backend/apps/github/api/internal/nodes/user.py`:
- Around line 85-102: Add distinct counting to both reverse relation annotations
in the User resolver: update the issues_count and releases_count fields in the
User class so their Count("created_issues") and Count("created_releases")
annotations use distinct=True. This fixes the row multiplication when both
annotated counts are selected together, while keeping the existing resolver
methods unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 113f5d7a-71e1-4889-bb92-b39abb05662a

📥 Commits

Reviewing files that changed from the base of the PR and between 5df7421 and 5f44fca.

📒 Files selected for processing (5)
  • backend/apps/github/api/internal/dataloaders/__init__.py
  • backend/apps/github/api/internal/dataloaders/user.py
  • backend/apps/github/api/internal/nodes/user.py
  • backend/tests/unit/apps/github/api/internal/dataloaders/user_test.py
  • backend/tests/unit/apps/github/api/internal/nodes/user_test.py

coderabbitai[bot]
coderabbitai Bot previously approved these changes Jun 27, 2026

@cubic-dev-ai cubic-dev-ai Bot left a comment

Contributor

0 issues found across 5 files (changes from recent commits).

Requires human review: Auto-approval blocked by 3 unresolved issues from previous reviews.


@ahmedxgouda ahmedxgouda marked this pull request as ready for review June 27, 2026 07:12
@github-actions


Contribution validation failed:

@arkid15r arkid15r merged commit b2cbe07 into OWASP:feature/graphql-dataloaders Jun 29, 2026
31 of 32 checks passed
@sonarqubecloud


@ahmedxgouda ahmedxgouda deleted the dataloaders/github-user branch June 29, 2026 01:11
ahmedxgouda added a commit to ahmedxgouda/Nest that referenced this pull request Jun 29, 2026
* Add optimization hints

* Add dataloader

* Add and update tests

* Replace optimization hints with dataloaders and optimize author field in IssueNode

Labels

backend backend-tests gsoc2026:ahmedxgouda ahmedxgouda's GSoC 2026 related work


2 participants