From the team behind Pydantic Validation, Pydantic Logfire is an observability platform built on the same belief as our open source library — that the most powerful tools can be easy to use.
What sets Logfire apart:
- Simple and Powerful: Logfire's dashboard is simple relative to the power it provides, ensuring your entire engineering team will actually use it.
- SQL: Query your data using standard SQL — all the control and (for many) nothing new to learn. Using SQL also means you can query your data with existing BI tools and database querying libraries.
- OpenTelemetry: Logfire is an opinionated wrapper around OpenTelemetry, allowing you to leverage existing tooling, infrastructure, and instrumentation for many common packages, and enabling support for virtually any language. We offer full support for all OpenTelemetry signals (traces, metrics, and logs).
Feel free to report issues and ask any questions about Logfire in this repository!
This repository contains the JavaScript SDK for logfire and its documentation; the server application for recording and displaying data is closed source.
Depending on your environment, you can integrate Logfire in several ways. Follow the specific instructions below:
Using Logfire from your Node.js script is as simple as getting a write token, installing the package, calling configure, and using the provided API. Let's create an empty project:
mkdir test-logfire-js
cd test-logfire-js
npm init -y es6 # creates package.json with `type: module`
npm install @pydantic/logfire-nodeThen, create the following hello.js script in the directory:
import * as logfire from '@pydantic/logfire-node'
logfire.configure({
token: 'test-e2e-write-token',
advanced: {
baseUrl: 'http://localhost:3000',
},
serviceName: 'example-node-script',
serviceVersion: '1.0.0',
})
logfire.info(
'Hello from Node.js',
{
'attribute-key': 'attribute-value',
},
{
tags: ['example', 'example2'],
}
)
await logfire.shutdown()Run the script with node hello.js, and you should see the span being logged in
the live view of your Logfire project.
First, install the @pydantic/logfire-cf-workers logfire NPM
packages:
npm install @pydantic/logfire-cf-workers logfireNext, add compatibility_flags = [ "nodejs_compat" ] to your wrangler.toml or
"compatibility_flags": ["nodejs_compat"] if you're using wrangler.jsonc.
Add your
Logfire write token
to your .dev.vars file:
LOGFIRE_TOKEN=your-write-token
LOGFIRE_ENVIRONMENT=developmentThe LOGFIRE_ENVIRONMENT variable is optional and can be used to specify the environment for the service.
For production deployment, check the Cloudflare documentation for details on managing and deploying secrets.
One way to do this is through the npx wrangler command:
npx wrangler secret put LOGFIRE_TOKENNext, add the necessary instrumentation around your handler. The tracerConfig
function will extract your write token from the env object and provide the
necessary configuration for the instrumentation:
import * as logfire from 'logfire'
import { instrument } from '@pydantic/logfire-cf-workers'
const handler = {
async fetch(): Promise<Response> {
logfire.info('info span from inside the worker body')
return new Response('hello world!')
},
} satisfies ExportedHandler
export default instrument(handler, {
service: {
name: 'my-cloudflare-worker',
namespace: '',
version: '1.0.0',
},
})A working example can be found in the examples/cloudflare-worker directory.
Note: if you're testing your worker with Vitest, you need to add the following additional configuration to your vitest.config.mts:
export default defineWorkersConfig({
test: {
deps: {
optimizer: {
ssr: {
enabled: true,
include: ['@pydantic/logfire-cf-workers'],
},
},
},
poolOptions: {
workers: {
// ...
},
},
},
})Vercel provides a comprehensive OpenTelemetry integration through the
@vercel/otel package. After following
their integration instructions, add the
following environment variables to your project:
OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=https://logfire-api.pydantic.dev/v1/traces
OTEL_EXPORTER_OTLP_METRICS_ENDPOINT=https://logfire-api.pydantic.dev/v1/metrics
OTEL_EXPORTER_OTLP_HEADERS='Authorization=your-write-token'This will point the instrumentation to Logfire.
Note
Vercel production deployments have a caching mechanism that might prevent changes from taking effect immediately or spans from being reported. If you are not seeing spans in Logfire, you can clear the data cache for your project.
Optionally, you can use the Logfire API package for creating manual spans.
Install the logfire NPM package and call the respective methods
from your server-side code:
import * as logfire from 'logfire'
export default async function Home() {
return logfire.span(
'A warning span',
{},
{
level: logfire.Level.Warning,
},
async (span) => {
logfire.info('Nested info span')
return <div>Hello</div>
}
)
}A working example can be found in the examples/nextjs directory.
The @vercel/otel package does not support client-side instrumentation, so few additional steps are necessary to send spans and/or instrument the client-side.
For a working example, refer to the examples/nextjs-client-side-instrumentation directory, which instruments the client-side fetch calls.
For this example, we will instrument a simple Express app:
/*app.ts*/
import express, type { Express } from 'express';
const PORT: number = parseInt(process.env.PORT || '8080');
const app: Express = express();
function getRandomNumber(min: number, max: number) {
return Math.floor(Math.random() * (max - min + 1) + min);
}
app.get('/rolldice', (req, res) => {
res.send(getRandomNumber(1, 6).toString());
});
app.listen(PORT, () => {
console.log(`Listening for requests on http://localhost:${PORT}`);
});Next, install the @pydantic/logfire-node and dotenv NPM packages to keep your Logfire write
token in a .env file:
npm install @pydantic/logfire-node dotenvAdd your token to the .env file:
LOGFIRE_TOKEN=your-write-tokenThen, create an instrumentation.ts file to set up the instrumentation. The
@pydantic/logfire-node package includes a configure function that simplifies the setup:
// instrumentation.ts
import * as logfire from '@pydantic/logfire-node'
import 'dotenv/config'
logfire.configure()The logfire.configure call should happen before the actual express module
imports, so your NPM start script should look like this (package.json):
"scripts": {
"start": "npx ts-node --require ./instrumentation.ts app.ts"
},Deno has
built-in support for OpenTelemetry.
The examples directory includes a Hello world example that configures Deno
OTel export to Logfire through environment variables.
Optionally, you can use the Logfire API package for creating manual spans.
Install the logfire NPM package and call the respective methods
from your code.
The logfire/evals subpath provides code-first evaluation APIs for JavaScript
and TypeScript. It mirrors the data model and serialized dataset format from
Python pydantic-evals: define
cases in code or YAML/JSON, run a task as an offline experiment, attach
evaluators, and view the emitted OpenTelemetry data in Logfire.
Use offline Dataset.evaluate() for curated pre-deployment checks, and
withOnlineEvaluation() for sampled production or staging monitoring. Add
logfire as a direct dependency in projects that import this subpath. To send
evaluation traces and log events to Logfire from Node.js, configure
@pydantic/logfire-node before running evals.
import * as logfire from '@pydantic/logfire-node'
logfire.configure()Caseis one scenario: inputs, optional expected output, metadata, and optional case-specific evaluators.Datasetgroups cases and shared evaluators for one task.- A task is the function being evaluated.
Evaluatorinstances inspect each task result and return assertions, scores, labels, or multiple named results.EvaluationReportcontains successful case results, failures, averages, and report-level analyses.
An offline experiment runs every case through the task, applies case-specific evaluators first, then dataset-level evaluators, and returns a report.
import {
Case,
Contains,
Dataset,
EqualsExpected,
Evaluator,
IsInstance,
MaxDuration,
renderReport,
type EvaluatorContext,
} from 'logfire/evals'
interface ClassifyInputs {
text: string
}
interface CaseMetadata {
category: 'happy' | 'failure' | 'neutral'
}
async function classify({ text }: ClassifyInputs): Promise<string> {
const lower = text.toLowerCase()
if (lower.includes('error') || lower.includes('fail')) return 'NEGATIVE'
if (lower.includes('great') || lower.includes('love')) return 'POSITIVE'
return 'NEUTRAL'
}
class StartsWithExpected extends Evaluator<ClassifyInputs, string, CaseMetadata> {
static evaluatorName = 'StartsWithExpected'
evaluate(ctx: EvaluatorContext<ClassifyInputs, string, CaseMetadata>): number {
if (ctx.expectedOutput === undefined) return 0
return ctx.output.startsWith(ctx.expectedOutput) ? 1 : 0
}
}
const dataset = new Dataset<ClassifyInputs, string, CaseMetadata>({
cases: [
new Case<ClassifyInputs, string, CaseMetadata>({
expectedOutput: 'POSITIVE',
inputs: { text: 'I love this!' },
metadata: { category: 'happy' },
name: 'positive-1',
}),
new Case<ClassifyInputs, string, CaseMetadata>({
expectedOutput: 'NEGATIVE',
inputs: { text: 'This failed' },
metadata: { category: 'failure' },
name: 'negative-1',
}),
new Case<ClassifyInputs, string, CaseMetadata>({
evaluators: [new Contains({ value: 'POSITIVE' })],
expectedOutput: 'POSITIVE',
inputs: { text: 'it is great' },
metadata: { category: 'happy' },
name: 'case-specific-check',
}),
],
evaluators: [new IsInstance({ typeName: 'string' }), new EqualsExpected(), new MaxDuration({ seconds: 2 }), new StartsWithExpected()],
name: 'sentiment-classifier',
})
const report = await dataset.evaluate(classify, {
maxConcurrency: 4,
progress: true,
retryTask: { retries: 2 },
taskName: 'classify',
})
console.log(renderReport(report, { includeInput: true, includeOutput: true }))Dataset.evaluate() also accepts metadata, name, repeat, signal,
retryEvaluators, lifecycle, and a custom progress callback.
Cases and evaluators can also be assembled incrementally, which is useful when you generate cases from fixtures or load only part of a suite for a smoke test:
import { Dataset, EqualsExpected, MaxDuration } from 'logfire/evals'
const smokeDataset = new Dataset<ClassifyInputs, string>({
cases: [],
evaluators: [new EqualsExpected()],
name: 'sentiment-smoke',
})
smokeDataset.addCase({
expectedOutput: 'POSITIVE',
inputs: { text: 'great support experience' },
name: 'support-positive',
})
smokeDataset.addCase({
expectedOutput: 'NEGATIVE',
inputs: { text: 'checkout failed' },
name: 'checkout-negative',
})
smokeDataset.addEvaluator(new MaxDuration({ seconds: 1 }))
const smokeReport = await smokeDataset.evaluate(classify, { maxConcurrency: 2 })Custom evaluators can be synchronous or asynchronous. Their return type controls how results are grouped in the report and Logfire UI:
booleanbecomes a pass/fail assertion.numberbecomes a numeric score.stringbecomes a categorical label.{ value, reason }adds an explanation to a scalar result.{ key: value, ... }emits multiple named results from one evaluator.
If an evaluator throws, the failure is recorded on the case without stopping the whole experiment.
The built-in case evaluators cover common checks:
| Evaluator | Use |
|---|---|
EqualsExpected |
Compare output with Case.expectedOutput. |
Equals |
Compare output with a fixed value. |
Contains |
Check substring, array membership, or object key/value containment. |
IsInstance |
Check the runtime constructor name or primitive type. |
MaxDuration |
Assert the task finished within a duration. |
HasMatchingSpan |
Assert the task emitted a span matching a SpanQuery. |
LLMJudge |
Run a user-provided judge callback against a rubric. |
LLMJudge does not bundle a model client. Pass a judge callback per instance
or call setDefaultJudge() once at startup.
Use LLMJudge when the desired behavior is subjective or rubric-based. The
JavaScript SDK deliberately does not choose a model provider; wire the judge
callback to your model client and return { pass, score, reason }.
import { Case, Dataset, LLMJudge, setDefaultJudge } from 'logfire/evals'
setDefaultJudge(async ({ output, rubric }) => {
const text = String(output)
const pass = text.includes('because')
return {
pass,
reason: pass ? 'The answer includes an explanation.' : `Missing explanation for rubric: ${rubric}`,
score: pass ? 1 : 0,
}
})
const explanationDataset = new Dataset<{ question: string }, string>({
cases: [
new Case({
expectedOutput: 'Photosynthesis uses sunlight to make sugar.',
inputs: { question: 'Why do plants need sunlight?' },
name: 'photosynthesis-explanation',
}),
],
evaluators: [
new LLMJudge({
assertion: { evaluationName: 'judge_pass' },
includeExpectedOutput: true,
includeInput: true,
rubric: 'The response answers the question and explains the reasoning.',
score: { evaluationName: 'judge_score' },
}),
],
name: 'explanation-quality',
})Code under evaluation can record custom per-case attributes and numeric metrics
with setEvalAttribute() and incrementEvalMetric(). Evaluators can also
inspect spans emitted by the task with HasMatchingSpan, which is useful when
correctness depends on an internal behavior such as a tool call, cache hit, or
retrieval step.
import * as logfire from '@pydantic/logfire-node'
import { Case, Dataset, HasMatchingSpan, incrementEvalMetric, setEvalAttribute } from 'logfire/evals'
const loaderDataset = new Dataset<{ userId: string }, string>({
cases: [new Case({ inputs: { userId: 'user-123' }, name: 'cache-hit' })],
evaluators: [
new HasMatchingSpan({
query: {
hasAttributes: { cache_hit: true },
nameEquals: 'load user',
},
}),
],
name: 'user-loader',
})
await loaderDataset.evaluate(async ({ userId }) => {
return logfire.span('load user', { cache_hit: true, user_id: userId }, {}, async () => {
setEvalAttribute('cache_policy', 'read-through')
incrementEvalMetric('cache_hits', 1)
return 'Alice'
})
})Report evaluators run once after all cases complete and add experiment-wide
analyses to report.analyses. When Logfire is configured, these analyses are
attached to the experiment span for visualization.
import { Case, ConfusionMatrixEvaluator, Dataset, EqualsExpected } from 'logfire/evals'
const animalDataset = new Dataset<string, string>({
cases: [
new Case({ expectedOutput: 'cat', inputs: 'The cat goes meow', name: 'cat' }),
new Case({ expectedOutput: 'dog', inputs: 'The dog barks', name: 'dog' }),
],
evaluators: [new EqualsExpected()],
name: 'animal-classifier',
reportEvaluators: [
new ConfusionMatrixEvaluator({
expectedFrom: 'expected_output',
predictedFrom: 'output',
title: 'Animal classification',
}),
],
})
const animalReport = await animalDataset.evaluate((text) => {
const lower = text.toLowerCase()
if (lower.includes('cat') || lower.includes('meow')) return 'cat'
if (lower.includes('dog') || lower.includes('bark')) return 'dog'
return 'unknown'
})
console.log(animalReport.analyses)Built-in report evaluators include ConfusionMatrixEvaluator,
PrecisionRecallEvaluator, ROCAUCEvaluator, and
KolmogorovSmirnovEvaluator.
Datasets can be created in code or loaded from YAML/JSON files. File helpers are available in Node, Bun, and Deno:
await dataset.toFile('sentiment.yaml', {
schemaPath: 'sentiment.schema.json',
})
const restored = await Dataset.fromFile<ClassifyInputs, string, CaseMetadata>('sentiment.yaml', {
customEvaluators: [StartsWithExpected],
})The same dataset can be maintained directly as YAML:
# yaml-language-server: $schema=sentiment.schema.json
name: sentiment-classifier
cases:
- name: positive-1
inputs:
text: I love this!
expected_output: POSITIVE
- name: negative-1
inputs:
text: This failed
expected_output: NEGATIVE
evaluators:
- EqualsExpected
- IsInstance: string
report_evaluators:
- ConfusionMatrixEvaluator:
predicted_from: output
expected_from: expected_outputtoText(), fromText(), toObject(), fromObject(), and jsonSchema() are
also available. Custom evaluators that need to round-trip through YAML/JSON
should set a stable static evaluatorName and implement toJSON() when their
constructor needs arguments.
Dataset YAML/JSON uses Python-compatible field names for portable files, for
example expected_output, report_evaluators, predicted_from,
expected_from, and snake_case SpanQuery keys. Built-in evaluator
constructors accept both idiomatic camelCase and serialized snake_case options.
Online evaluation wraps an async function and runs evaluators in the background
after each sampled call. Results are emitted as gen_ai.evaluation.result
OpenTelemetry log events, and optional sinks can receive the same results in
process.
import {
configureOnlineEvals,
Contains,
Evaluator,
OnlineEvaluator,
waitForEvaluations,
withOnlineEvaluation,
type EvaluatorContext,
} from 'logfire/evals'
configureOnlineEvals({
metadata: { deployment: 'staging' },
samplingMode: 'correlated',
})
class NonEmpty extends Evaluator {
static evaluatorName = 'NonEmpty'
evaluate(ctx: EvaluatorContext): boolean {
return String(ctx.output ?? '').length > 0
}
}
async function summarize(text: string): Promise<string> {
return `summary: ${text.slice(0, 80)}`
}
const monitoredSummarize = withOnlineEvaluation(summarize, {
evaluators: [
new NonEmpty(),
new OnlineEvaluator({
evaluator: new Contains({ asStrings: true, caseSensitive: false, value: 'summary' }),
maxConcurrency: 5,
sampleRate: 0.1,
}),
],
extractArgs: ['text'],
sink: ({ failures, results, target }) => {
if (failures.length > 0) console.warn(`${target}: ${failures.length} evaluator failures`)
for (const result of results) console.log(`${result.name}: ${String(result.value)}`)
},
target: 'summarizer',
})
await monitoredSummarize('Logfire collects OpenTelemetry data.')
await waitForEvaluations()Pass bare Evaluator instances to use the default sample rate, or wrap them in
OnlineEvaluator for per-evaluator sampleRate, maxConcurrency, sink, or
error handling. samplingMode: 'independent' samples each evaluator
separately; samplingMode: 'correlated' uses one random draw per call so
lower-rate evaluators are a subset of higher-rate evaluators.
Online evaluator context.inputs is built from JavaScript function parameter
names when they can be inspected. Pass extractArgs: ['name', ...] when
bundled or minified code needs stable input names, or extractArgs: false to
keep positional input values.
- Browser and Cloudflare Worker usage is limited to in-memory datasets and online evaluation; filesystem-backed dataset helpers are not available.
- Browser offline runs should keep
maxConcurrency: 1because there is noAsyncLocalStorageequivalent for isolating case attributes and metrics. withOnlineEvaluation()supports async-returning functions.logfire.configure()auto-installs the evals span-tree processor. If you use your ownTracerProvider, addgetEvalsSpanProcessor()fromlogfire/evals.- Runnable examples live in
examples/node/evals.ts,examples/node/demo_evals.ts, andexamples/node/demo_online_evals.ts.
These Python Pydantic Evals pages are the conceptual reference for the JavaScript API shape:
Local runnable JavaScript examples:
examples/node/evals.tsexamples/node/demo_evals.tsexamples/node/demo_online_evals.tsscripts/runtime-smoke/README.md
The logfire.configure function accepts a set of configuration options that
control the behavior of the instrumentation. Alternatively, you can
use environment variables
to configure the instrumentation.
The logfire package exports several convenience wrappers around the
OpenTelemetry span creation API. The @pydantic/logfire-node package re-exports these.
The following methods create spans with their respective log levels (ordered by severity):
logfire.tracelogfire.debuglogfire.infologfire.noticelogfire.warnlogfire.errorlogfire.fatal
Each method accepts a message, attributes, and optionally, options that let you specify the span tags. The attribute values must be serializable to JSON.
function info(message: string, attributes?: Record<string, unknown>, options?: LogOptions): voidlogfire.span is a convenience wrapper around the OpenTelemetry span creation API that allows you to create a span and execute a callback function within that span's context.
This is useful for creating nested spans or for executing code within the context of a span. Unlike the opentelemetry implementation, the parent span is automatically ended when the callback function completes.
logfire.span('parent sync span overload', {
callback: (_span) => {
logfire.info('nested span')
},
})You can also pass parent spans manually through the parentSpan option:
const mySpan = logfire.startSpan('a manual parent span')
logfire.info('manual child span', {}, { parentSpan: mySpan })
// ensure to end the parent span when done
mySpan.end()In addition to trace, debug, the Logfire API exports a reportError function that accepts a message and a JavaScript Error object. It will extract the necessary details from the error and create a span with the error level.
try {
1 / 0
} catch (error) {
logfire.reportError('An error occurred', error)
}See CONTRIBUTING.md for development instructions.
MIT