fix(sandbox): close AioSandbox HTTP client during provider teardown (#2872)#3245
Conversation
There was a problem hiding this comment.
Pull request overview
This PR addresses a sandbox resource-leak in long-running services by ensuring the host-side HTTP client created by each AioSandbox is explicitly closed during provider teardown, preventing unreclaimed sockets/transports from accumulating over repeated sandbox lifecycle churn.
Changes:
- Added an idempotent, best-effort
AioSandbox.close()to close the underlying (wrapped)httpxclient. - Updated
AioSandboxProvider.release()/destroy()to close cachedAioSandboxinstances after removing them from provider maps (and before/alongside backend teardown). - Added regression tests covering the new close semantics and provider lifecycle behavior (including error swallowing).
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| backend/packages/harness/deerflow/community/aio_sandbox/aio_sandbox.py | Introduces AioSandbox.close() to release host-side HTTP client resources. |
| backend/packages/harness/deerflow/community/aio_sandbox/aio_sandbox_provider.py | Ensures provider lifecycle paths close cached sandbox clients on release/destroy. |
| backend/tests/test_aio_sandbox.py | Adds unit tests for AioSandbox.close() behavior (happy path, idempotency, fallbacks, exceptions). |
| backend/tests/test_aio_sandbox_provider.py | Adds tests asserting release/destroy/shutdown close cached clients and swallow close errors. |
| if self._closed: | ||
| return | ||
| self._closed = True | ||
|
|
||
| client = getattr(self, "_client", None) | ||
| if client is None: | ||
| return | ||
|
|
||
| try: | ||
| wrapper = getattr(client, "_client_wrapper", None) | ||
| httpx_client = getattr(wrapper, "httpx_client", None) if wrapper is not None else None | ||
| if httpx_client is not None and hasattr(httpx_client, "close"): | ||
| httpx_client.close() | ||
| elif hasattr(client, "close"): | ||
| client.close() | ||
| except Exception as e: | ||
| logger.warning(f"Error closing AioSandbox client for {self.id}: {e}") |
f1f029c to
5841583
Compare
…ytedance#2872) AioSandbox allocates a host-side agent_sandbox client (wrapping an httpx.Client) in __init__, but AioSandboxProvider.release/destroy/shutdown only popped provider state and tore down the backend container — the client/transport owned by each cached AioSandbox was never explicitly closed, accumulating unreclaimed sockets in long-running services. - Add AioSandbox.close(): best-effort, idempotent close of the wrapped httpx_client (falls back to top-level client.close()); errors are logged but never raised so backend cleanup is never blocked. - AioSandboxProvider.release()/destroy() now close the cached AioSandbox before dropping it; shutdown() inherits this via destroy().
5841583 to
1369678
Compare
|
@18062706139fcz , here are some comments the PR.
|
…nce#2872) The previous close() only walked one level (wrapper.httpx_client), which resolves to the Fern-generated HttpClient wrapper that has no close(). The real socket-owning httpx.Client lives one level deeper at _client_wrapper.httpx_client.httpx_client, so the close path never fired and host-side sockets still leaked. Resolve the real httpx.Client with graceful degradation; clear self._client under the lock for use-after-close and concurrent double-close safety; mark provider release()/destroy() try/except as defense-in-depth; rewrite TestClose against the real nested structure to lock down the original no-op bug.
Thanks for the review, @WillemJiang ! All four points are addressed in the latest commit ( Could you please check again and see if the PR can be merged now? Thanks! |
Summary
Closes #2872.
AioSandbox.__init__allocates a host-sideagent_sandboxclient (which wraps anhttpx.ClientviaSyncClientWrapper). However,AioSandboxProvider.release/destroy/shutdownonly popped provider-side state and tore down the backend container — the client/transport owned by each cachedAioSandboxwas never explicitly closed. In long-running services that repeatedly cycle sandboxes, this accumulates unreclaimed sockets/host-side resources.Changes
AioSandbox.close()inaio_sandbox.pyhttpx_client(falls back to top-levelclient.close()if not exposed).aio_sandbox_provider.pyrelease()anddestroy()now look up the cachedAioSandbox, drop it from_sandboxesunder the lock, and callsandbox.close()outside the lock before parking in the warm pool / destroying the container.shutdown()inherits this behaviour because it already routes active sandboxes throughdestroy().Validation
Added regression tests:
TestCloseintests/test_aio_sandbox.py— covers happy path (closes wrappedhttpx_client), idempotency, swallowed exceptions, fallback toclient.close(), and missingcloseattribute.tests/test_aio_sandbox_provider.py—release()/destroy()/shutdown()close the cached client, and errors during close don't break provider lifecycle.