Replace nest-asyncio with a dedicated background event loop#506
Open
dotsdl wants to merge 1 commit into
Open
Conversation
`BaseClient._run_async` relied on `nest_asyncio` to allow re-entrant `run_until_complete` so the synchronous client could drive its async, concurrency-bearing methods (batched `asyncio.gather` fan-out, concurrent ProtocolDAGResult retrieval) even from environments with an already-running loop (e.g. Jupyter). `nest_asyncio` does this by monkeypatching asyncio's private internals, which is fragile across CPython releases and is widely considered bad practice. Instead, run a single persistent event loop in a dedicated daemon thread and submit coroutines to it with `asyncio.run_coroutine_threadsafe(...).result()`. The calling thread blocks on the result while the loop thread does the work, so: - all existing `gather`/`as_completed` concurrency is preserved (it runs inside the loop thread); - re-entrancy "just works" without monkeypatching --- a caller already inside a running loop simply waits on another thread; - `alru_cache` stays bound to one stable, process-wide loop, which is the invariant the previous shared-loop dance was straining to maintain. The loop is created lazily under a lock so concurrent callers cannot start competing loops. `nest-asyncio` is dropped from the client environment, as it is no longer used anywhere. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #506 +/- ##
==========================================
+ Coverage 78.91% 79.56% +0.65%
==========================================
Files 29 29
Lines 4808 4830 +22
==========================================
+ Hits 3794 3843 +49
+ Misses 1014 987 -27 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Background
The synchronous client drives async methods through
BaseClient._run_async. Those methods do real concurrent I/O — batchedasyncio.gatherfan-out in_batched_attribute_getter/_setter, and concurrentProtocolDAGResultretrieval — so the async stack earns its keep and isn't going anywhere.The problem was only ever how sync called into async:
_run_asyncusednest_asyncioto permit re-entrantrun_until_completewhen a loop is already running (e.g. Jupyter).nest_asyncioachieves this by monkeypatching asyncio's private internals — fragile across CPython releases and widely considered bad practice. It was also an undeclared-in-test-env dependency, which recently turned CI red acrossmainand every open PR (ModuleNotFoundError: No module named 'nest_asyncio').Change
Run a single persistent event loop in a dedicated daemon thread and submit coroutines with
asyncio.run_coroutine_threadsafe(coro, loop).result(). The calling thread blocks on the result while the loop thread does the work.gather/as_completedfan-out runs inside the loop thread, unchanged.alru_cachebinds to the loop it first runs on; a single process-wide loop keeps that invariant solid (this is what the old per-class_shared_event_loopdance was straining toward).threading.Lockso concurrent callers can't start competing loops.nest-asynciodropped fromalchemiscale-client.yml— no longer used anywhere.Verification
Behavioral smoke test of the new machinery passes, including the key cases: loop reuse across calls, concurrent
gatherinside a submission, and invocation from within an already-running event loop (the scenarionest_asyncioexisted for). Existing client integration tests exercise the real paths.Relationship to #505
This supersedes #505 (which added
nest-asyncioto the test env as an immediate CI unblock). Once this merges,nest-asynciois no longer needed at all and #505 can be closed. Merging #505 first is harmless — this PR then removes the now-unused dependency.🤖 Generated with Claude Code