fix(sandbox): close AioSandbox HTTP client during provider teardown (#2872) #3245

Merged
WillemJiang merged 2 commits into bytedance:main from 18062706139fcz:fix/aio-sandbox-client-teardown-2872
Jun 2, 2026

Conversation

@18062706139fcz

Summary

Closes #2872.

AioSandbox.__init__ allocates a host-side agent_sandbox client (which wraps an httpx.Client via SyncClientWrapper). However, AioSandboxProvider.release/destroy/shutdown only popped provider-side state and tore down the backend container — the client/transport owned by each cached AioSandbox was never explicitly closed. In long-running services that repeatedly cycle sandboxes, this accumulates unreclaimed sockets/host-side resources.

Changes

  • Add AioSandbox.close() in aio_sandbox.py
    • Best-effort, idempotent close of the wrapped httpx_client (falls back to top-level client.close() if not exposed).
    • Errors are logged at WARNING and swallowed so backend/container cleanup is never blocked.
  • Update aio_sandbox_provider.py
    • release() and destroy() now look up the cached AioSandbox, drop it from _sandboxes under the lock, and call sandbox.close() outside the lock before parking in the warm pool / destroying the container.
    • shutdown() inherits this behaviour because it already routes active sandboxes through destroy().
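
The provider-side ordering described above (drop the cached sandbox under the lock, then close it outside the lock) can be sketched roughly as follows. This is a simplified illustration, not the actual AioSandboxProvider code: `FakeSandbox`, `SketchProvider`, and `register()` are hypothetical names, and backend teardown and warm-pool handling are omitted.

```python
import logging
import threading

logger = logging.getLogger(__name__)


class FakeSandbox:
    """Hypothetical stand-in for AioSandbox; only close() matters here."""

    def __init__(self, sandbox_id):
        self.id = sandbox_id
        self.closed = False

    def close(self):
        self.closed = True


class SketchProvider:
    """Simplified provider sketch (assumption: not the real implementation)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._sandboxes = {}  # sandbox_id -> cached sandbox

    def register(self, sandbox):
        # Hypothetical helper so the sketch has something to release.
        with self._lock:
            self._sandboxes[sandbox.id] = sandbox

    def release(self, sandbox_id):
        # Remove the cached sandbox while holding the lock ...
        with self._lock:
            sandbox = self._sandboxes.pop(sandbox_id, None)
        # ... but close it outside the lock, so client I/O never
        # blocks other threads that need the provider state.
        if sandbox is not None:
            try:
                sandbox.close()
            except Exception as e:
                # Defense-in-depth: cleanup continues even if close() raises.
                logger.warning("Error closing sandbox %s: %s", sandbox_id, e)
        # Warm-pool parking or container teardown would follow here.
```

Popping under the lock means a second concurrent `release()` for the same id finds nothing and becomes a no-op, while the close itself stays off the critical section.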

Validation

Added regression tests:

  • TestClose in tests/test_aio_sandbox.py — covers happy path (closes wrapped httpx_client), idempotency, swallowed exceptions, fallback to client.close(), and missing close attribute.
  • 5 new tests in tests/test_aio_sandbox_provider.py: release()/destroy()/shutdown() close the cached client, and errors during close don't break provider lifecycle.
$ pytest backend/tests/test_aio_sandbox.py backend/tests/test_aio_sandbox_provider.py \
         backend/tests/test_aio_sandbox_local_backend.py backend/tests/test_aio_sandbox_readiness.py
======================== 62 passed, 1 warning in 0.77s =========================
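
For illustration, tests of the kind listed above might look like the sketch below. It does not reproduce the PR's actual test code: `SketchSandbox` is a hypothetical stand-in whose `close()` follows the semantics in the Changes section, and the clients are `unittest.mock.MagicMock` objects rather than real agent_sandbox clients.

```python
import logging
from unittest.mock import MagicMock

logger = logging.getLogger(__name__)


class SketchSandbox:
    """Hypothetical stand-in with the close() semantics described in the PR."""

    def __init__(self, client):
        self.id = "sb-test"
        self._client = client
        self._closed = False

    def close(self):
        if self._closed:
            return
        self._closed = True
        try:
            wrapper = getattr(self._client, "_client_wrapper", None)
            httpx_client = getattr(wrapper, "httpx_client", None)
            if httpx_client is not None and hasattr(httpx_client, "close"):
                httpx_client.close()
            elif hasattr(self._client, "close"):
                self._client.close()
        except Exception as e:
            # Best-effort: log and swallow so teardown is never blocked.
            logger.warning("Error closing client for %s: %s", self.id, e)


def test_close_is_idempotent():
    client = MagicMock()
    sandbox = SketchSandbox(client)
    sandbox.close()
    sandbox.close()
    client._client_wrapper.httpx_client.close.assert_called_once()


def test_close_swallows_exceptions():
    client = MagicMock()
    client._client_wrapper.httpx_client.close.side_effect = RuntimeError("boom")
    SketchSandbox(client).close()  # must not raise


def test_close_falls_back_to_client_close():
    client = MagicMock(spec=["close"])  # no _client_wrapper attribute
    SketchSandbox(client).close()
    client.close.assert_called_once()
```

One caveat with this style: a bare `MagicMock` fabricates any attribute on demand, so it cannot reveal a wrapper that lacks `close()`. Stand-ins that mirror the real nested structure are stricter.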

@CLAassistant

CLAassistant commented May 26, 2026

CLA assistant check
All committers have signed the CLA.

Contributor

Copilot AI left a comment

Pull request overview

This PR addresses a sandbox resource-leak in long-running services by ensuring the host-side HTTP client created by each AioSandbox is explicitly closed during provider teardown, preventing unreclaimed sockets/transports from accumulating over repeated sandbox lifecycle churn.

Changes:

  • Added an idempotent, best-effort AioSandbox.close() to close the underlying (wrapped) httpx client.
  • Updated AioSandboxProvider.release() / destroy() to close cached AioSandbox instances after removing them from provider maps (and before/alongside backend teardown).
  • Added regression tests covering the new close semantics and provider lifecycle behavior (including error swallowing).

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| backend/packages/harness/deerflow/community/aio_sandbox/aio_sandbox.py | Introduces AioSandbox.close() to release host-side HTTP client resources. |
| backend/packages/harness/deerflow/community/aio_sandbox/aio_sandbox_provider.py | Ensures provider lifecycle paths close cached sandbox clients on release/destroy. |
| backend/tests/test_aio_sandbox.py | Adds unit tests for AioSandbox.close() behavior (happy path, idempotency, fallbacks, exceptions). |
| backend/tests/test_aio_sandbox_provider.py | Adds tests asserting release/destroy/shutdown close cached clients and swallow close errors. |

Comment on lines +57 to +73
    if self._closed:
        return
    self._closed = True

    client = getattr(self, "_client", None)
    if client is None:
        return

    try:
        wrapper = getattr(client, "_client_wrapper", None)
        httpx_client = getattr(wrapper, "httpx_client", None) if wrapper is not None else None
        if httpx_client is not None and hasattr(httpx_client, "close"):
            httpx_client.close()
        elif hasattr(client, "close"):
            client.close()
    except Exception as e:
        logger.warning(f"Error closing AioSandbox client for {self.id}: {e}")
@18062706139fcz force-pushed the fix/aio-sandbox-client-teardown-2872 branch from f1f029c to 5841583 May 29, 2026 03:16
…ytedance#2872)

AioSandbox allocates a host-side agent_sandbox client (wrapping an
httpx.Client) in __init__, but AioSandboxProvider.release/destroy/shutdown
only popped provider state and tore down the backend container — the
client/transport owned by each cached AioSandbox was never explicitly
closed, accumulating unreclaimed sockets in long-running services.

- Add AioSandbox.close(): best-effort, idempotent close of the wrapped
  httpx_client (falls back to top-level client.close()); errors are
  logged but never raised so backend cleanup is never blocked.
- AioSandboxProvider.release()/destroy() now close the cached AioSandbox
  before dropping it; shutdown() inherits this via destroy().
@18062706139fcz force-pushed the fix/aio-sandbox-client-teardown-2872 branch from 5841583 to 1369678 May 29, 2026 03:16
@WillemJiang
Collaborator

@18062706139fcz, here are some comments on the PR.

  1. Add a comment explaining why the deep attribute chain is needed (Fern-generated client lacks close())
    The deep attribute chain is fragile. The close path traverses:
    self._client._client_wrapper → SyncClientWrapper → .httpx_client → HttpClient → .httpx_client → httpx.Client
    The ideal solution: file an upstream issue/PR to add a proper close() method to agent_sandbox.Sandbox, then call that close() method from DeerFlow.

  2. Set self._client = None after closing for use-after-close safety

  3. Simplify getattr(self, "_client", None) to self._client

  4. Consider removing the redundant try/except in the provider or documenting it as defense-in-depth

@WillemJiang added the reviewing and question labels Jun 1, 2026
…nce#2872)

The previous close() only walked one level (wrapper.httpx_client), which resolves to the Fern-generated HttpClient wrapper that has no close(). The real socket-owning httpx.Client lives one level deeper at _client_wrapper.httpx_client.httpx_client, so the close path never fired and host-side sockets still leaked.

Resolve the real httpx.Client with graceful degradation; clear self._client under the lock for use-after-close and concurrent double-close safety; mark provider release()/destroy() try/except as defense-in-depth; rewrite TestClose against the real nested structure to lock down the original no-op bug.
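
A rough sketch of what resolving the nested client "with graceful degradation" could look like is given below. It is an assumption-laden illustration rather than the merged code: `resolve_closable`, `SketchSandbox`, `FakeHttpxClient`, and `build_fake_client` are hypothetical names, and the fallback order (real httpx.Client, then the intermediate wrapper, then the top-level client) is one plausible choice that the PR text does not spell out.

```python
import logging
import threading
from types import SimpleNamespace

logger = logging.getLogger(__name__)


def resolve_closable(client):
    """Walk client._client_wrapper.httpx_client.httpx_client, tolerating gaps.

    Returns the deepest object exposing a callable close(), or None.
    """
    wrapper = getattr(client, "_client_wrapper", None)
    http_client = getattr(wrapper, "httpx_client", None)      # generated wrapper, no close()
    real_client = getattr(http_client, "httpx_client", None)  # socket-owning client
    for candidate in (real_client, http_client, client):
        if candidate is not None and callable(getattr(candidate, "close", None)):
            return candidate
    return None


class SketchSandbox:
    """Hypothetical stand-in for AioSandbox."""

    def __init__(self, client):
        self.id = "sb-sketch"
        self._client = client
        self._lock = threading.Lock()

    def close(self):
        # Detach the client under the lock: later or concurrent callers
        # see None, so the underlying client is closed at most once.
        with self._lock:
            client, self._client = self._client, None
        if client is None:
            return
        try:
            target = resolve_closable(client)
            if target is not None:
                target.close()
        except Exception as e:
            logger.warning("Error closing client for %s: %s", self.id, e)


class FakeHttpxClient:
    """Test double that counts close() calls."""

    def __init__(self):
        self.close_calls = 0

    def close(self):
        self.close_calls += 1


def build_fake_client(real):
    # Mirrors the nesting named in the review; intermediate levels lack close().
    return SimpleNamespace(
        _client_wrapper=SimpleNamespace(httpx_client=SimpleNamespace(httpx_client=real))
    )
```

Because the intermediate stand-ins are `SimpleNamespace` objects without `close()`, a close path that stops one level too early does nothing here, which is the no-op behaviour the commit message describes.
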
@18062706139fcz
Author

Thanks for the review, @WillemJiang! All four points are addressed in the latest commit (e1ba6041).

Could you please check again and see if the PR can be merged now? Thanks!

@WillemJiang WillemJiang merged commit 5dc2d6c into bytedance:main Jun 2, 2026
6 checks passed
Labels

question (Further information is requested), reviewing (The PR is in reviewing status)

Development

Successfully merging this pull request may close these issues.

[Bug] AioSandboxProvider teardown does not close AioSandbox client resources

4 participants