feat: Artifacts — attach binary blobs to spans#1931

Draft
adriangb wants to merge 1 commit into main from feat/artifacts

@adriangb
Member

What

Adds logfire.Artifact — attach a binary blob (image, audio, PDF, large JSON) to a span. The blob is uploaded to object storage out of band; the span carries only a small content-addressed (sha256) reference, so it is not subject to attribute size limits and does not bloat telemetry.

This is the SDK side. The backend side ships separately in the platform repo (PR linked below once open).

Usage

import logfire

logfire.configure()

logfire.info('chart generated', chart=logfire.Artifact.from_file('chart.png'))
logfire.info('thumbnail', image=logfire.Artifact(png_bytes, content_type='image/png'))
  • Construct from bytes, a file path (Artifact.from_file), or a binary handle (Artifact.from_file_handle — handles temp/spooled files).
  • upload is chosen per artifact:
    • background (default) — never blocks the caller. If the upload queue is over its byte budget, the artifact is dropped with a warning rather than applying backpressure.
    • sync — uploaded inline; the call returns once the blob is stored, so the source can be freed immediately.
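The background mode's drop-on-full behaviour can be sketched roughly as follows. This is a minimal illustration, not the code in `artifact_uploader.py`: the class name `ByteBudgetQueue` and its `submit` / `take` methods are hypothetical, and the real uploader may track its budget differently.

```python
import threading
import warnings
from collections import deque
from typing import Deque, Optional


class ByteBudgetQueue:
    """Hypothetical queue that tracks queued bytes and drops instead of blocking."""

    def __init__(self, max_queue_bytes: int) -> None:
        self._max_queue_bytes = max_queue_bytes
        self._queued_bytes = 0
        self._items: Deque[bytes] = deque()
        self._lock = threading.Lock()

    def submit(self, blob: bytes) -> bool:
        """Enqueue `blob`, or warn and return False if it would exceed the budget."""
        with self._lock:
            if self._queued_bytes + len(blob) > self._max_queue_bytes:
                warnings.warn('artifact dropped: upload queue is over its byte budget')
                return False
            self._items.append(blob)
            self._queued_bytes += len(blob)
            return True

    def take(self) -> Optional[bytes]:
        """Called by the uploader thread; frees budget as each blob is dequeued."""
        with self._lock:
            if not self._items:
                return None
            blob = self._items.popleft()
            self._queued_bytes -= len(blob)
            return blob
```

The caller only ever takes a short lock, so `submit` returns immediately whether the blob was queued or dropped.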

How it works

An artifact serializes to a reference object ({"type": "logfire.artifact", "sha256": …}) via the existing json_schema / json_encoder hooks. A background uploader runs the register → PUT → finalize handshake against the backend; PUTs to signed object-store URLs omit the bearer token.
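As a rough sketch of the content-addressing idea: the reference is derived from the blob's bytes alone, so identical blobs produce identical references and only need to be uploaded once. The hex encoding of the digest, the helper `make_reference`, and the `DedupTracker` class are assumptions made for illustration; only the `type` and `sha256` keys come from the description above.

```python
import hashlib


def make_reference(data: bytes) -> dict[str, str]:
    """Build the small reference a span would carry in place of the blob.

    Assumption: the digest is hex-encoded; the PR text does not say.
    """
    return {'type': 'logfire.artifact', 'sha256': hashlib.sha256(data).hexdigest()}


class DedupTracker:
    """Hypothetical helper: remembers digests already handed to the uploader."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    def needs_upload(self, reference: dict[str, str]) -> bool:
        """Return True the first time a digest is seen, False afterwards."""
        digest = reference['sha256']
        if digest in self._seen:
            return False
        self._seen.add(digest)
        return True
```

Because the span attribute is a fixed-size digest regardless of blob size, it stays well under attribute size limits.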

Changes

  • logfire/_internal/artifacts.py: Artifact + the ArtifactSource abstraction (bytes / path; designed so streaming sources can be added later).
  • logfire/_internal/exporters/artifact_uploader.py — background uploader (bounded queue, drop-on-full).
  • json_schema / json_encoder / main / config hooks; Artifact + UploadMode exports.
  • logfire-api stubs regenerated; docs page at reference/advanced/artifacts.md.

Verification

tests/test_artifacts.py — 14 tests pass (construction from each source, schema/encoder hooks, span integration, uploader sync/background/dedup/drop-on-full/error-swallowing). ruff + pyright clean.

Note: test_logfire_api.py::test_runtime[with_logfire] currently fails on an unrelated dspy DeprecationWarning raised by instrument_dspy — pre-existing, not introduced here.

🤖 Generated with Claude Code

`logfire.Artifact` attaches an image, audio clip, PDF, or large JSON
payload to a span. The blob is uploaded to object storage out of band
(sync or in the background, chosen per artifact); the span carries only
a small content-addressed reference, so it is not subject to attribute
size limits.

- `Artifact` from bytes, a file path, or a binary handle.
- `upload='background'` (default — never blocks the caller; drops with a
  warning if the upload queue is full) or `upload='sync'` (inline,
  guaranteed delivery, frees the source immediately).
- `json_schema` / `json_encoder` hooks: an artifact serializes to a
  reference object; the blob upload is brokered out of band.
- `logfire-api` stubs and a docs page.

The backend side ships separately in the platform repo.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
adriangb self-assigned this May 18, 2026
@adriangb
Member Author

Backend side: pydantic/platform#21850

@cloudflare-workers-and-pages

Deploying logfire-docs with Cloudflare Pages

Latest commit: ead1587
Status: ✅  Deploy successful!
Preview URL: https://2206d6f6.logfire-docs.pages.dev
Branch Preview URL: https://feat-artifacts.logfire-docs.pages.dev

Collaborator

Please don't generate pyi files in PRs, they're just clutter. The CLAUDE.md should probably make this more explicit.

logging call never blocks. If uploads cannot keep up, queued artifacts are dropped
with a warning rather than stalling your program.
- **`sync`** — the upload runs inline; the logging call returns only once the blob is
stored. Use this when you need delivery guaranteed, or when you want to free the
Collaborator

Delivery is far from guaranteed with sync, it still just silently swallows request exceptions without retrying.

Member Author

Do you think we should make the guarantee stronger (error) or make the docstrings match current impl?

Collaborator

i think we eventually want a stronger guarantee, but it doesn't have to be in the first pass. until then, the docs should be accurate.

from ..artifacts import Artifact
from ..utils import log_internal_error

# Default ceiling on bytes queued for background upload. When exceeded, `submit` blocks.
Collaborator

it doesn't block, it drops

Member Author

should we make it drop, block, spill to disk...?

Collaborator

for background uploading, i think not blocking is part of the contract. i think it should spill to disk up to a higher limit, then drop.
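The "spill to disk up to a higher limit, then drop" policy suggested here could look something like the sketch below. Everything in it is hypothetical (the `SpillingBuffer` name, the two limits, the use of temp files); it is single-threaded for brevity and only illustrates the three-way decision, not a proposed implementation.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional, Union


class SpillingBuffer:
    """Hypothetical policy: keep blobs in memory, then spill to disk, then drop."""

    def __init__(self, max_memory_bytes: int, max_disk_bytes: int, spill_dir: Optional[str] = None) -> None:
        self.max_memory_bytes = max_memory_bytes
        self.max_disk_bytes = max_disk_bytes
        self.spill_dir = spill_dir  # None means the platform's default temp directory
        self.memory_used = 0
        self.disk_used = 0

    def submit(self, blob: bytes) -> Union[bytes, Path, None]:
        """Return the in-memory blob, the path it was spilled to, or None if dropped.

        This never waits for the uploader to drain; the only work done on the
        calling thread is a local file write in the spill case.
        """
        if self.memory_used + len(blob) <= self.max_memory_bytes:
            self.memory_used += len(blob)
            return blob
        if self.disk_used + len(blob) <= self.max_disk_bytes:
            fd, name = tempfile.mkstemp(dir=self.spill_dir, suffix='.artifact')
            with os.fdopen(fd, 'wb') as f:
                f.write(blob)
            self.disk_used += len(blob)
            return Path(name)
        return None  # both budgets exhausted: drop
```

A real version would also need to release budget and delete spill files once each upload finishes.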

on the calling thread.
"""

def __init__(self, *, base_url: str, token: str, max_queue_bytes: int = DEFAULT_MAX_QUEUE_BYTES) -> None:
Collaborator

max_queue_bytes isn't actually user configurable and probably should be

# there can break the signature, so only the backend `/blob` endpoint gets auth.
put_headers = self._auth if target['requires_auth'] else {}
put = requests.request(
target['method'], target['url'], data=artifact.read(), headers=put_headers, timeout=_REQUEST_TIMEOUT
Collaborator

when the artifact is read from a file, it doesn't live in memory in the queue. but max_queue_bytes seems to be intended to save memory, so should it apply to file artifacts? in fact, what if we stored all artifacts to files to allow a bigger queue?

Member Author

yeah I think it would make sense to force all artifacts to buffer on disk, they are almost by definition large

Comment thread logfire/__init__.py
'attach_context',
'url_from_eval',
'Artifact',
'UploadMode',
Collaborator

Do users need to use logfire.UploadMode? this seems like clutter in the root package.

# Network/HTTP failures are operational, not bugs: the artifact reference is
# still recorded on the span, the blob just isn't stored. Best-effort upload —
# never crash the caller's logging call over it.
pass
Collaborator

will need retries
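One way retries might be layered over the upload call is a small wrapper with exponential backoff, sketched below. The helper `with_retries` and its parameters are hypothetical, and which failures count as retryable is an assumption: the sketch retries on `OSError`, which is the base class of `requests.RequestException`, and re-raises after the last attempt rather than swallowing the error.

```python
import time
from typing import Callable, TypeVar

T = TypeVar('T')


def with_retries(
    attempt: Callable[[], T],
    *,
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `attempt` until it succeeds, backing off 0.5s, 1s, 2s, ... between tries.

    Assumption: only OSError (and subclasses) is treated as transient.
    """
    if max_attempts < 1:
        raise ValueError('max_attempts must be at least 1')
    for n in range(max_attempts):
        try:
            return attempt()
        except OSError:
            if n == max_attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            sleep(base_delay * 2**n)
    raise AssertionError('unreachable')
```

For `upload='sync'`, letting the final exception propagate is one option for the "stronger guarantee" discussed earlier in the review; the background path could log it instead.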


def __init__(
self,
data: bytes,
Collaborator

IMO data should accept:

  • bytes
  • str, which is encoded to bytes
  • Path, which is like from_file
  • a file handle, and then from_file_handle isn't needed at all
  • any other object, which goes through the logfire JSON encoding

Then the constructor can handle basically anything, and from_file is only a convenience to treat str as a path instead of actual data, and to maybe save some memory depending on implementation details.
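The input handling proposed in this comment might reduce to a single normalising function along these lines. This is a sketch only: `to_bytes` is a hypothetical name, UTF-8 is an assumed encoding for `str`, and the standard library's `json` stands in for the Logfire JSON encoder that the comment actually refers to.

```python
import json
from pathlib import Path
from typing import Any


def to_bytes(data: Any) -> bytes:
    """Hypothetical normaliser for the `data` argument of the constructor."""
    if isinstance(data, (bytes, bytearray, memoryview)):
        return bytes(data)
    if isinstance(data, str):
        # Treated as content, not as a path; from_file would cover the path case.
        return data.encode('utf-8')
    if isinstance(data, Path):
        return data.read_bytes()
    if hasattr(data, 'read'):
        # File handle: accept both binary and text mode.
        content = data.read()
        return content if isinstance(content, bytes) else content.encode('utf-8')
    # Any other object: stand-in for the Logfire JSON encoding.
    return json.dumps(data, default=str).encode('utf-8')
```

Reading the whole handle or file eagerly is the simplest choice; a streaming source would avoid holding large blobs in memory.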

return self._compute()[0]

@property
def size_bytes(self) -> int:
Collaborator

does the backend check that this is reported honestly?

UUID: _to_str,
Exception: _to_str,
# An artifact serialises to its reference object; the blob is uploaded separately.
Artifact: lambda o, _: o.reference(),
Collaborator

we need a way to mark the contents of these as always exempt from scrubbing. this shape doesn't allow us to without adding some new machinery. if it was {'logfire.artifact': {...}} then adding logfire.artifact to the scrubber safe keys would work.

that wouldn't help if someone writes secret=Artifact(...), which will get scrubbed by default either way. not sure if we want to do something about that.
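To make the shape argument concrete, here is a toy key-based scrubber. It is not Logfire's scrubber: the `scrub` function, the `SENSITIVE` pattern, and the `name` field inside the reference are all invented for illustration. It only shows why a wrapper key such as `logfire.artifact` can be safelisted while a flat `{'type': ..., 'sha256': ...}` object cannot.

```python
import re
from typing import Any

# Toy pattern; the real scrubber's patterns are not reproduced here.
SENSITIVE = re.compile(r'password|secret|token|session', re.IGNORECASE)


def scrub(value: Any, safe_keys: frozenset = frozenset()) -> Any:
    """Redact sensitive-looking keys and strings, skipping subtrees under safe keys."""
    if isinstance(value, dict):
        out = {}
        for key, item in value.items():
            if key in safe_keys:
                out[key] = item  # whole subtree is exempt
            elif SENSITIVE.search(key):
                out[key] = '[Scrubbed]'
            else:
                out[key] = scrub(item, safe_keys)
        return out
    if isinstance(value, str) and SENSITIVE.search(value):
        return '[Scrubbed]'
    return value


# Flat shape: the only enclosing key is the user's own attribute name ('chart'),
# so there is nothing generic to safelist and the inner string gets redacted.
flat = {'chart': {'type': 'logfire.artifact', 'sha256': 'ab12', 'name': 'session-report.pdf'}}

# Nested shape: the fixed wrapper key can be added to the safe keys.
nested = {'chart': {'logfire.artifact': {'sha256': 'ab12', 'name': 'session-report.pdf'}}}

scrubbed_flat = scrub(flat)
scrubbed_nested = scrub(nested, frozenset({'logfire.artifact'}))
scrubbed_secret = scrub({'secret': nested['chart']}, frozenset({'logfire.artifact'}))
```

In this toy, `scrubbed_nested` comes back unchanged, `scrubbed_flat` loses its `name`, and the `secret=` case is redacted in either shape because the match happens on the outer key, as the comment notes.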

Member Author

maybe we just say scrubbing doesn't apply to artifacts for now and figure it out in a followup?

Collaborator

i'm saying we need to figure out how to make scrubbing not apply to artifacts, right now it can. not the contents, the reference.

Member Author

that seems like something we can sort out. i'm just asking if we can ship whatever falls out naturally now (even if it's kinda broken) and come up with the right apis, etc. later or if you think it needs to be bundled into this change?

Collaborator

i thought the goal right now was to settle on a good API? what kind of review are you looking for?


2 participants