fix: resolve race condition in file logging for /ai/log_correction endpoint by ArshVermaGit · Pull Request #2152 · ritesh-1918/HELPDESK.AI

ArshVermaGit · 2026-06-07T13:35:50Z

Summary
This PR fixes a critical data-corruption bug in the admin correction logging mechanism where concurrent requests to /ai/log_correction could mangle the corrections_log.json file.

Changes Made
Introduced a global asyncio.Lock() instance (corrections_log_lock) in backend/main.py.
Wrapped the read-modify-write logic for corrections_log.json inside an async with corrections_log_lock: block.
Ensures strict sequential file access during periods of high concurrency, completely eliminating the race condition.
Resolved Issue
Resolves #2151

Summary by CodeRabbit

New Features
- Ticket operations now require authentication via bearer token.
- Added pagination support (limit/offset) for ticket retrieval.
- Implemented rate limiting on ticket endpoints.
Bug Fixes
- Enhanced user identity validation to prevent ID spoofing.
Performance Improvements
- Migrated ticket persistence to cloud-based storage.
- Moved AI processing tasks to background execution for better responsiveness.

vercel · 2026-06-07T13:35:54Z

@ArshVermaGit is attempting to deploy a commit to the ritesh Team on Vercel.

A member of the Team first needs to authorize it.

coderabbitai · 2026-06-07T13:36:00Z

📝 Walkthrough

Walkthrough

This PR migrates ticket storage from in-memory to Supabase-backed persistence, hardens endpoint authentication against user-id spoofing, adds concurrency safety to the corrections log, and offloads long-running AI operations to background threads to prevent event loop blocking.

Changes

Ticket Storage & Concurrency Hardening

Layer / File(s)	Summary
Supabase Authentication Configuration `backend/.env.example`, `backend/main.py`	Switched Supabase client initialization from `SUPABASE_SERVICE_KEY` to `SUPABASE_ANON_KEY`, updated environment example file to document the anon-key credential.
Ticket Persistence Supabase Migration `backend/main.py`	Removed in-memory `TICKETS_DB`; reworked `GET /tickets` to accept `limit`/`offset` pagination via Supabase `.range(...)`; updated `POST /tickets/save` to require Bearer token authorization and validate user identity via `supabase.auth.get_user()` to block user-id spoofing; replaced `POST /tickets` and `PATCH /tickets/{ticket_id}` in-memory implementations with Supabase insert/update operations. All ticket endpoints now accept `request: Request` for rate limiting. Updated initial system message `sender_role` from `"admin"` to `"system"`. Restricted CORS `allow_methods` and `allow_headers` to explicit allowlists.
Corrections Log Thread-Safe Serialization `backend/main.py`	Introduced module-level `asyncio.Lock` (`corrections_log_lock`) and wrapped the JSON load/append/write sequence in `/ai/log_correction` with `async with corrections_log_lock` to prevent concurrent file corruption.

AI Pipeline Threading

Layer / File(s)	Summary
Threading in /ai/analyze and /ai/analyze-v2 `backend/main.py`	Moved Gemini image analysis, classifier V3/V1 predictions, NER extraction, duplicate detection, and RAG search in read-only analysis and v2 endpoints from synchronous calls to `await asyncio.to_thread(...)` to prevent event loop blocking.
Threading in /ai/analyze_stream (SSE) `backend/main.py`	Moved Gemini image analysis, NER extraction, classifier predictions, duplicate detection, and RAG search in SSE streaming analysis flow to background threads via `asyncio.to_thread` to prevent blocking the event stream.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

ritesh-1918/HELPDESK.AI#1333: Overlaps at ticket endpoint authentication and user-id spoofing prevention via supabase.auth.get_user.
ritesh-1918/HELPDESK.AI#1132: Overlaps at GET /tickets pagination support via limit/offset parameters.
ritesh-1918/HELPDESK.AI#524: Overlaps at /ai/log_correction concurrency serialization using asyncio.Lock.

Suggested labels

type:bug, level:intermediate, level:critical, quality:clean

Poem

🐰 From memory's fleeting cache to Supabase's steady store,
Anon keys unlock the door, thread pools hum and soar!
Locks protect the correction stream, no more corruption's plight,
Your async pipelines dance with grace—the AI shines so bright! ✨

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Title check	⚠️ Warning	The title focuses narrowly on the race condition fix for log_correction, but the changeset includes extensive Supabase integration, endpoint rewrites, CORS changes, and thread-pool conversions that are not mentioned.	Expand the title to reflect the scope of changes, such as: 'refactor: migrate to Supabase with auth, rate limiting, and async thread pool execution' or split into multiple focused PRs.
Out of Scope Changes check	⚠️ Warning	The changeset includes extensive out-of-scope modifications: Supabase migration, endpoint authorization/validation overhauls, CORS restrictions, and thread-pool async conversions unrelated to the race condition fix in `#2151`.	Isolate the corrections_log_lock fix into a focused PR; move Supabase integration, auth changes, and async thread conversions into separate PRs with their own requirements.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check	✅ Passed	The PR addresses the primary objective from `#2151` by introducing asyncio.Lock() to serialize file access for corrections_log.json, eliminating the race condition through synchronized read-modify-write operations.
Docstring Coverage	✅ Passed	Docstring coverage is 80.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

ArshVermaGit

Hi @ritesh-1918 ! The issue has been resolved. Please review the PR and merge it under GSSoC. Thanks!

coderabbitai

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

backend/main.py (1)
647-682: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Ticket creation and the initial system message are not atomic.

If the tickets insert succeeds and ticket_messages fails, this handler returns 500 even though the ticket row already exists. Because ticket inserts also trigger the notifier webhook, a client retry can create duplicate tickets and duplicate notifications. This needs a single DB transaction/RPC or an idempotent compensation strategy.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/main.py` around lines 647 - 682, The ticket creation and the initial
system message must be performed atomically to avoid partial commits; replace
the separate supabase.table("tickets").insert(...) and
supabase.table("ticket_messages").insert(...) calls with a single database
transaction or server-side RPC that inserts the ticket (using final_data),
inserts the initial system message (message/msg, sender_id, sender_name,
sender_role) and returns the new ticket id in one operation, and only after the
transaction succeeds call duplicate_service.add_ticket(str(ticket_id),
duplicate_text); ensure the notifier webhook is invoked only after the
transaction completes (or move webhook firing into the same RPC) and preserve
the existing duplicate_index_warning/duplicate_indexed handling in the
non-transactional duplicate indexing path.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/.env.example`:
- Around line 1-3: The .env.example is missing the Supabase service-role key
used by backend/services/auto_close_service.py; add a new entry named
SUPABASE_SERVICE_ROLE_KEY alongside SUPABASE_URL and SUPABASE_ANON_KEY so
environments include the service role secret required by the
auto-close/background job path and documents its purpose (e.g., indicate it's
the service role secret for server-side operations used by
auto_close_service.py).

In `@backend/main.py`:
- Around line 706-728: The create_ticket and update_ticket endpoints accept raw
dicts and allow unguarded writes; either remove these routes or harden them by
reusing the same auth helper used by /tickets/save to validate the bearer token,
user-id and tenant, then validate and sanitize the request body against a
whitelisted Pydantic model (only allowed fields like title, body, status, etc.),
explicitly strip or reject any server-controlled columns (id, created_at,
tenant_id, user_id, viewed_at), and only pass the sanitized payload to
supabase.table("tickets").insert(...) or .update(...). Ensure update_ticket also
verifies the ticket belongs to the authenticated user/tenant before applying
updates.
- Around line 38-47: The current shared supabase client (created via
create_client and stored in supabase) only validates tokens with
supabase.auth.get_user(token) but does not bind the caller JWT to subsequent
table(...).select/insert/update calls, so RLS won’t apply; fix by creating a
per-request Supabase Client or session using the incoming JWT (either build a
per-request Client with ClientOptions including headers Authorization: Bearer
<jwt> or call client.auth.set_session(...) with the token) and use that
per-request client instance for all profiles/tickets DB operations instead of
the global supabase object (referencing create_client, Client, supabase,
supabase.auth.get_user, table(...).select/insert/update, and auth.set_session).

---

Outside diff comments:
In `@backend/main.py`:
- Around line 647-682: The ticket creation and the initial system message must
be performed atomically to avoid partial commits; replace the separate
supabase.table("tickets").insert(...) and
supabase.table("ticket_messages").insert(...) calls with a single database
transaction or server-side RPC that inserts the ticket (using final_data),
inserts the initial system message (message/msg, sender_id, sender_name,
sender_role) and returns the new ticket id in one operation, and only after the
transaction succeeds call duplicate_service.add_ticket(str(ticket_id),
duplicate_text); ensure the notifier webhook is invoked only after the
transaction completes (or move webhook firing into the same RPC) and preserve
the existing duplicate_index_warning/duplicate_indexed handling in the
non-transactional duplicate indexing path.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 628be068-a2ed-4d8c-9b14-d2e5fbeb741d

📥 Commits

Reviewing files that changed from the base of the PR and between da8faf2 and 1f54460.

📒 Files selected for processing (2)

backend/.env.example
backend/main.py

coderabbitai · 2026-06-07T13:43:33Z

 # Supabase Configuration
 SUPABASE_URL=https://your-project.supabase.co
-SUPABASE_SERVICE_KEY=your-service-key
+SUPABASE_ANON_KEY=your-anon-key


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Keep the backend service-role secret documented here.

This template now only shows the anon key, but backend/services/auto_close_service.py still creates its own Supabase client from SUPABASE_SERVICE_ROLE_KEY. Fresh environments copied from this file will start the API and silently break the auto-close/background job path.

Suggested env template update

# Supabase Configuration SUPABASE_URL=https://your-project.supabase.co SUPABASE_ANON_KEY=your-anon-key +SUPABASE_SERVICE_ROLE_KEY=your-service-role-key

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

# Supabase Configuration

SUPABASE_URL=https://your-project.supabase.co

SUPABASE_SERVICE_KEY=your-service-key

SUPABASE_ANON_KEY=your-anon-key

# Supabase Configuration

SUPABASE_URL=https://your-project.supabase.co

SUPABASE_ANON_KEY=your-anon-key

SUPABASE_SERVICE_ROLE_KEY=your-service-role-key

🧰 Tools

🪛 dotenv-linter (4.0.0)

[warning] 3-3: [UnorderedKey] The SUPABASE_ANON_KEY key should go before the SUPABASE_URL key

(UnorderedKey)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/.env.example` around lines 1 - 3, The .env.example is missing the Supabase service-role key used by backend/services/auto_close_service.py; add a new entry named SUPABASE_SERVICE_ROLE_KEY alongside SUPABASE_URL and SUPABASE_ANON_KEY so environments include the service role secret required by the auto-close/background job path and documents its purpose (e.g., indicate it's the service role secret for server-side operations used by auto_close_service.py).

coderabbitai · 2026-06-07T13:43:33Z

+# Initialize Supabase Client (Anon Key to respect RLS)
 try:
    from supabase import create_client, Client
    url = os.environ.get("SUPABASE_URL")
-    key = os.environ.get("SUPABASE_SERVICE_KEY")
+    key = os.environ.get("SUPABASE_ANON_KEY")
    if not url or not key:
-        print("[ERROR] SUPABASE_URL or SUPABASE_SERVICE_KEY not set in backend/.env")
+        print("[ERROR] SUPABASE_URL or SUPABASE_ANON_KEY not set in backend/.env")
        supabase = None
    else:
        supabase = create_client(url, key)


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

For supabase-py 2.22.4, if a client is created once with create_client(SUPABASE_URL, SUPABASE_ANON_KEY)and a request later callssupabase.auth.get_user(token), do subsequent table(...).select/insert/updatecalls automatically run with that user's JWT for RLS, or must the request create/set a client with the incomingAuthorization: Bearer header per request?

💡 Result:

When you call supabase.auth.get_user(token), the Supabase Python client validates the provided JWT but does not automatically associate that token with subsequent database operations (e.g., table.select). Row Level Security (RLS) relies on the Authorization header of the specific database request, so simply validating a user via get_user is insufficient to propagate that user's identity to your database queries [1][2][3]. To perform database operations on behalf of a user, you must explicitly set their session context so that the client includes the appropriate JWT in the Authorization header for those requests [1][4][3]. There are two primary ways to achieve this: 1. Use set_session: If you have a client instance, you can use client.auth.set_session(access_token=jwt, refresh_token=...) [3]. This updates the client's internal state to use that JWT for subsequent requests [4][5]. 2. Initialize with ClientOptions: For server-side or stateless requests (such as in FastAPI or other backend frameworks), it is often recommended to create a new client instance per request by passing the JWT in the headers via ClientOptions [6][1][7]: from supabase import create_client, ClientOptions options = ClientOptions(headers={"Authorization": f"Bearer {your_jwt}"}) client = create_client(SUPABASE_URL, SUPABASE_ANON_KEY, options=options) This ensures the client correctly sends the user's JWT in the Authorization header for all API calls, allowing Supabase to enforce RLS policies based on that user's identity [6][8]. Note: Earlier versions of the library had bugs where the client would overwrite custom Authorization headers with the anon key, but these have been addressed in later versions of the 2.x release line [2][8][9]. If you are working in a backend context where you have elevated privileges, you might alternatively use the service role key, though this bypasses RLS and should be used with caution [4][3]. Top results: [6][1][8][7][3]

Citations:

1: Supabase Storage API Session is None in FastAPI with AuthBearer supabase/supabase-py#869

2: Create a client with Auth context of a user supabase/supabase-py#420

3: https://stackoverflow.com/questions/78050169/supabase-with-jwt-auth-from-python-client-doesnt-work-with-rls

4: cannot authenticate user supabase/supabase-py#645

5: fix: refresh token updates auth headers and sets realtime auth supabase/supabase-py#1171

6: Improving the flow of handling sessions for better clarity due to RLS issues supabase/supabase-py#915

7: bug: client access token for auth admin API is not properly refreshed supabase/supabase-py#1143

8: fix(user auth context): do not overwrite provided client options Authorization header supabase/supabase-py#766

9: Create a client with Auth context of a user supabase/supabase-py#420

Bind the caller’s JWT to Supabase requests so RLS actually applies

supabase.auth.get_user(token) only validates the JWT; with supabase-py, it does not automatically associate that JWT with subsequent table(...).select/insert/update calls made on a shared create_client(SUPABASE_URL, SUPABASE_ANON_KEY) instance. As a result, the later profiles/tickets DB operations won’t run in the caller’s RLS context. Create a per-request client/session (e.g., per-request ClientOptions with Authorization: Bearer <jwt> or client.auth.set_session(...)) and use that client for every ticket/profile query.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/main.py` around lines 38 - 47, The current shared supabase client (created via create_client and stored in supabase) only validates tokens with supabase.auth.get_user(token) but does not bind the caller JWT to subsequent table(...).select/insert/update calls, so RLS won’t apply; fix by creating a per-request Supabase Client or session using the incoming JWT (either build a per-request Client with ClientOptions including headers Authorization: Bearer <jwt> or call client.auth.set_session(...) with the token) and use that per-request client instance for all profiles/tickets DB operations instead of the global supabase object (referencing create_client, Client, supabase, supabase.auth.get_user, table(...).select/insert/update, and auth.set_session).

coderabbitai · 2026-06-07T13:43:33Z

+@app.post("/tickets")
+@limiter.limit("30/minute")
+async def create_ticket(ticket: dict, request: Request):
+    """Save a new ticket into the system (persisted to Supabase)."""
+    if not supabase:
+        raise HTTPException(status_code=500, detail="Database connection not initialized")
+
+    res = supabase.table("tickets").insert(ticket).execute()
+    if not res.data:
+        raise HTTPException(status_code=400, detail="Failed to create ticket")
+
+    print(f"[DB] Ticket created: {res.data[0].get('id')}")
+    return res.data[0]
+
+
+@app.patch("/tickets/{ticket_id}")
+@limiter.limit("60/minute")
+async def update_ticket(ticket_id: str, updates: dict, request: Request):
+    """Partially update a ticket's fields (e.g., status, viewed_at) via Supabase."""
+    if not supabase:
+        raise HTTPException(status_code=500, detail="Database connection not initialized")

-    raise HTTPException(status_code=404, detail="Ticket not found")
+    res = supabase.table("tickets").update(updates).eq("id", ticket_id).execute()


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

POST /tickets and PATCH /tickets/{ticket_id} reopen an unguarded write path.

These endpoints accept raw dict payloads and write them directly to tickets without the bearer-token, user-id, or tenant checks added to /tickets/save. That leaves an alternate route that can bypass the new spoofing protection and mutate server-controlled columns if table policies allow it. Please either remove these routes or apply the same auth helper plus a whitelisted request model for allowed fields.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/main.py` around lines 706 - 728, The create_ticket and update_ticket endpoints accept raw dicts and allow unguarded writes; either remove these routes or harden them by reusing the same auth helper used by /tickets/save to validate the bearer token, user-id and tenant, then validate and sanitize the request body against a whitelisted Pydantic model (only allowed fields like title, body, status, etc.), explicitly strip or reject any server-controlled columns (id, created_at, tenant_id, user_id, viewed_at), and only pass the sanitized payload to supabase.table("tickets").insert(...) or .update(...). Ensure update_ticket also verifies the ticket belongs to the authenticated user/tenant before applying updates.

ritesh-1918 · 2026-06-07T15:52:13Z

Hi @ArshVermaGit! Thanks for the contribution. I have triaged your PR and set it to merge into the gssoc branch.

⚠️ MANDATORY GSSOC ONBOARDING STEPS:
Before your PR points are finalized on the leaderboard, you MUST complete these required steps:

⭐ Star this repository: https://github.com/ritesh-1918/HELPDESK.AI (Mandatory)
👤 Follow the Project Admin: https://github.com/ritesh-1918 (Mandatory)
💼 Connect on LinkedIn: https://www.linkedin.com/in/ritesh1908/ (Mandatory)

Welcome to the HELPDESK.AI developer family! 🚀💻

ArshVermaGit added 9 commits June 7, 2026 04:14

Fix Hardcoded Supabase Service Key Bypass

fb4320c

Fix Insecure Tenant Auth (ID Spoofing)

fdb999f

Fix Vulnerable/Wildcard CORS Policy

1a0f672

Fix Missing Rate Limiting on Data Endpoints

e1d93d0

Fix Hardcoded Admin Role Injection

7eeb572

Fix In-Memory Data Loss Vulnerability

04746b7

Fix Thread-Blocking Synchronous AI Inference

a26509b

fix: add pagination to get_tickets endpoint

d1dc6bf

fix: resolve race condition in file logging for log_correction endpoint

1f54460

ArshVermaGit commented Jun 7, 2026

View reviewed changes

coderabbitai Bot reviewed Jun 7, 2026

View reviewed changes

ritesh-1918 changed the base branch from main to gssoc June 7, 2026 15:52

ritesh-1918 added gssoc GirlScript Summer of Code gssoc:approved GSSoC Approved PR level:intermediate Intermediate level difficulty quality:exceptional Exceptional code quality type:bug Bug fix labels Jun 7, 2026

ritesh-1918 merged commit 1f54460 into ritesh-1918:gssoc Jun 7, 2026
9 of 10 checks passed

ArshVermaGit deleted the main_9 branch June 7, 2026 16:16

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: resolve race condition in file logging for /ai/log_correction endpoint#2152

fix: resolve race condition in file logging for /ai/log_correction endpoint#2152
ritesh-1918 merged 9 commits into
ritesh-1918:gssocfrom
ArshVermaGit:main_9

ArshVermaGit commented Jun 7, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

vercel Bot commented Jun 7, 2026

Uh oh!

coderabbitai Bot commented Jun 7, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested labels

Poem

❌ Failed checks (2 warnings)

Uh oh!

ArshVermaGit left a comment

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Jun 7, 2026

Uh oh!

coderabbitai Bot Jun 7, 2026

Uh oh!

coderabbitai Bot Jun 7, 2026

Uh oh!

ritesh-1918 commented Jun 7, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

ArshVermaGit commented Jun 7, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

vercel Bot commented Jun 7, 2026

Uh oh!

coderabbitai Bot commented Jun 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested labels

Poem

❌ Failed checks (2 warnings)

Uh oh!

ArshVermaGit left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 7, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 7, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 7, 2026

Choose a reason for hiding this comment

Uh oh!

ritesh-1918 commented Jun 7, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ArshVermaGit commented Jun 7, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 7, 2026 •

edited

Loading