pydantic · strawgate · Jun 1, 2026
diff --git a/docs/audit-logs-api.md b/docs/audit-logs-api.md
@@ -5,7 +5,7 @@ description: "Retrieve organization activity logs for security monitoring, compl
 
 # Logfire audit logs API
 
-The Audit Logs API lets you retrieve activity logs for your organization. This feature is available on the [Enterprise plan](./enterprise.md) only.
+The Audit Logs API lets you retrieve activity logs for your organization. This feature is available for Enterprise Cloud organizations and self-hosted deployments.
 
 Each log entry records user actions: logins, project updates, token changes, and more. Use it for security monitoring, compliance reporting, and usage auditing.
 
@@ -20,7 +20,7 @@ Each log entry records user actions: logins, project updates, token changes, and
 
 ## Authentication
 
-Requests are authenticated with a Bearer token scoped to `organizations:auditlog`. See [API Keys docs](./reference/advanced//use-api-keys.md) for instructions on how to generate one.
+Requests are authenticated with a Bearer token scoped to `organization:auditlog`. See [API Keys docs](./reference/advanced/use-api-keys.md) for instructions on how to generate one.
 
 **Type:** Bearer token
 

diff --git a/docs/comparisons/arize-phoenix.md b/docs/comparisons/arize-phoenix.md
@@ -9,7 +9,7 @@ Arize Phoenix is an ML observability platform focused on model monitoring, drift
 | **Primary Focus**    | AI observability for agents and apps                           | ML model monitoring                                             |
 | **Strength**         | AI + application tracing                                       | Drift detection, model performance                              |
 | **Non-AI Tracing**   | Full support                                                   | Limited                                                         |
-| **Language Support** | Python, JS/TS, Rust (SDKs) + any OTel                          | Python-focused                                                  |
+| **Language Support** | Python, JS/TS, Rust (SDKs) + any OTel                          | Python and JS SDKs + OTLP                                       |
 | **Evals**            | Integrated web-UI - Code-based via `pydantic-evals` | Integrated web-UI - Code-based via external library |
 | **Pricing**          | Per-span ($2/million)*                                         | Usage-based                                                     |
 | **Setup**            | 3 lines of code                                                | OTel-based (several lines of code)                              |
@@ -73,7 +73,7 @@ logfire.instrument_openai()
 
 Three lines, and you're observing AI calls with full application context.
 
-**Arize Phoenix** requires more configuration, especially for non-AI instrumentation.
+**Arize Phoenix** is OpenTelemetry-based, but Logfire provides more opinionated shortcuts for application and AI instrumentation in one SDK.
 
 ### Query Interface — Essential for Agentic Coding
 
@@ -86,7 +86,7 @@ Three lines, and you're observing AI calls with full application context.
 
 When you're iterating on AI applications with coding agents, the agent needs to understand production behavior. With SQL, it can ask any question. With proprietary interfaces, it's constrained to anticipated queries.
 
-**Arize Phoenix** has its own query interface optimized for ML metrics but less flexible for ad-hoc analysis.
+**Arize Phoenix** has query tools optimized for ML and trace workflows, but Logfire exposes a PostgreSQL-compatible SQL interface for ad-hoc analysis.
 
 ## Complementary Use
 

diff --git a/docs/comparisons/datadog.md b/docs/comparisons/datadog.md
@@ -6,11 +6,11 @@ Datadog is a comprehensive enterprise monitoring platform. Logfire is an AI-nati
 
 | Feature                 | Logfire                        | Datadog |
 |-------------------------|--------------------------------|---------|
-| **Architecture**        | OpenTelemetry-native           | Proprietary agents |
+| **Architecture**        | OpenTelemetry-native           | Datadog agents plus OTel ingestion |
 | **Pricing Model**       | Per-span ($2/million)*         | Per-host + ingestion + custom metrics |
 | **Host Fees**           | None                           | $15-40/host/month |
-| **AI/LLM Support**      | First-class, one function call | Add-on, separate product |
-| **Query Language**      | SQL (Postgres-compatible)      | Proprietary |
+| **AI/LLM Support**      | First-class, one function call | LLM Observability product with SDK/OTel options |
+| **Query Language**      | SQL (Postgres-compatible)      | Product-specific query tools |
 | **Setup Complexity**    | 3 lines of code                | Agent deployment per host |
 | **Autoscaling Impact**  | Linear cost increase           | High-water-mark billing spikes |
 | **Dashboards & alerts** | Included, SQL-based            | Included, proprietary query language |
@@ -55,7 +55,7 @@ Datadog is a comprehensive enterprise monitoring platform. Logfire is an AI-nati
 
 ### AI/LLM Support
 
-**Datadog** added LLM observability as a separate product. It works, but AI isn't central to the platform's design.
+**Datadog** has an LLM Observability product, including SDK and OpenTelemetry GenAI ingestion options. LLM observability is one part of a broad enterprise monitoring platform rather than the platform's central design point.
 
 **Logfire** was built for the AI era. [One function call](https://pydantic.dev/docs/logfire/integrations/?utm_source=datadog_comparison_docs) (`logfire.instrument_openai()`) gives you:
 
@@ -67,13 +67,13 @@ Datadog is a comprehensive enterprise monitoring platform. Logfire is an AI-nati
 
 ### OpenTelemetry
 
-**Datadog** uses proprietary agents. While they support OTel export, it's not the native path.
+**Datadog** supports OpenTelemetry ingestion, including GenAI semantic-convention traces for LLM Observability, while much of the broader Datadog experience still uses Datadog agents and product-specific setup.
 
 **Logfire** is OpenTelemetry-native with [first-class integrations for most technologies](https://pydantic.dev/docs/logfire/integrations/?utm_source=datadog_comparison_docs). Any OTel instrumentation works automatically. Your instrumentation is portable: if you ever want to switch, your code doesn't change.
 
 ### Query Language — Essential for Agentic Coding
 
-**Datadog** uses a proprietary query language for dashboards and analysis. This creates limitations:
+**Datadog** uses product-specific query tools for dashboards and analysis. Compared with a PostgreSQL-compatible SQL interface, this creates tradeoffs:
 
 - Learning curve for humans and AI alike
 - Coding agents are constrained to anticipated queries
@@ -86,7 +86,7 @@ Datadog is a comprehensive enterprise monitoring platform. Logfire is an AI-nati
 - **Agentic workflows** — When coding agents debug your AI application, they can write arbitrary queries
 - **Familiar syntax** — No new query language to learn
 
-When you're iterating on AI applications with coding agents, the agent needs to understand production behavior. With SQL, it can ask any question. With proprietary DSLs, it's constrained to what someone anticipated.
+When you're iterating on AI applications with coding agents, the agent needs to understand production behavior. With SQL, it can ask any question. With product-specific query interfaces, it may be limited to the queries each product area's interface supports.
 
 ## Migration Path
 

diff --git a/docs/comparisons/langfuse.md b/docs/comparisons/langfuse.md
@@ -14,7 +14,7 @@ Both Logfire and Langfuse help you observe AI/LLM applications, but they take fu
 | **Python Support** | First-class (Pydantic team) | Good |
 | **Non-AI Tracing** | Full support | Limited |
 | **LLM Features** | Token tracking, costs, panels | Token tracking, costs, evals, prompt mgmt |
-| **OpenTelemetry** | Native | Export support |
+| **OpenTelemetry** | Native | Native SDK / OTLP ingestion |
 
 *Logfire Cloud pricing (Team or Growth plans). Enterprise pricing available [on request](https://calendar.app.google/k9pkeuNMmzJAJ4Mx5).
 
@@ -45,7 +45,7 @@ This matters because AI applications don't exist in isolation. They call APIs, q
 
 ### Query Language — Essential for Agentic Coding
 
-**Langfuse** uses a custom UI and API for querying data.
+**Langfuse** uses a custom UI and API for querying data, with OpenTelemetry-based ingestion support for traces.
 
 **Logfire** uses SQL with PostgreSQL-compatible syntax. This is a significant advantage for AI-assisted development:
 

diff --git a/docs/comparisons/sentry.md b/docs/comparisons/sentry.md
@@ -8,7 +8,7 @@ Sentry is a mature error monitoring platform. Logfire is an AI-native observabil
 |---------------------|----------------------------------------|------------------------------------------|
 | **Primary Focus**   | Full observability (logs, traces, AI)  | Error monitoring                         |
 | **App Tracing**     | Core capability                        | Available, not a core focus              |
-| **AI/LLM Support**  | First-class, automatic instrumentation | Generic function tracing only            |
+| **AI/LLM Support**  | First-class, automatic instrumentation | AI Performance/LLM monitoring features   |
 | **Logging**         | Structured logs with full context      | Error-focused                            |
 | **Live View**       | Real-time "pending spans"                | ❌                                        |
 | **Query Interface** | SQL (Postgres-compatible)              | Custom UI                                |
@@ -48,14 +48,9 @@ Sentry is a mature error monitoring platform. Logfire is an AI-native observabil
 
 ### AI/LLM Support
 
-**Sentry** treats AI calls like any other function. You'll see that an error occurred, but you won't see:
+**Sentry** now has AI Performance and LLM monitoring features for supported SDKs and integrations. Those features are useful when your workflow is already centered on Sentry's error and performance tooling.
 
-- What prompt was sent
-- What the model responded
-- Token usage and costs
-- Tool calls and their results
-
-**Logfire** was built for AI applications. One function call gives you complete LLM visibility:
+**Logfire** was built for AI applications and full-stack telemetry from the start. One function call gives you LLM visibility alongside the rest of your OpenTelemetry data:
 
 ```python skip="true" skip-reason="incomplete"
 import logfire
@@ -71,7 +66,7 @@ logfire.instrument_openai()  # That's it
 
 ### SQL-Powered Analytics — Essential for Agentic Coding
 
-**Sentry** uses a custom UI for querying and filtering.
+**Sentry** uses custom exploration UIs for querying and filtering.
 
 **Logfire** uses SQL with PostgreSQL-compatible syntax. This is a significant advantage for AI-assisted development:
 

diff --git a/docs/enterprise.md b/docs/enterprise.md
@@ -14,9 +14,9 @@ In addition to the [Team and Growth plans](https://pydantic.dev/pricing), Pydant
 
 ## Enterprise Single Sign-On (SSO)
 
-Logfire Enterprise supports SSO through [Dex](https://github.com/dexidp/dex), an open-source OIDC gateway. The same Dex configuration model works across Enterprise Cloud, Enterprise Dedicated, and Enterprise Self-Hosted deployments.
+Logfire Enterprise SSO is built on [Dex](https://github.com/dexidp/dex), an open-source OIDC gateway.
 
-Dex works with common identity providers including Okta, Azure AD, Auth0, Google Workspace, LDAP/AD, and generic OIDC or SAML providers.
+Enterprise Cloud and Enterprise Dedicated support managed OIDC identity providers such as Okta, Microsoft Entra ID, and Keycloak. Self-hosted deployments configure Dex directly through Helm values, so they can use any connector supported by Dex, including OIDC, SAML, LDAP, GitHub, and Google.
 
 ## Enterprise Cloud
 

diff --git a/docs/evaluate/datasets/evaluations.md b/docs/evaluate/datasets/evaluations.md
@@ -123,14 +123,15 @@ print(f'Fetched {len(dataset.cases)} cases')
 print(f'First case input type: {type(dataset.cases[0].inputs).__name__}')
 ```
 
-If you have custom evaluator types stored with your cases, pass them via `custom_evaluator_types` so they can be deserialized:
+If you have custom evaluator types stored with your cases or dataset, pass them via `custom_evaluator_types` so they can be deserialized:
 
 ```python skip="true" skip-reason="external-connection"
 dataset = client.get_dataset(
     'qa-golden-set',
     input_type=QuestionInput,
     output_type=AnswerOutput,
     custom_evaluator_types=[MyCustomEvaluator],
+    custom_report_evaluator_types=[MyCustomReportEvaluator],
 )
 ```
 

diff --git a/docs/evaluate/datasets/ui.md b/docs/evaluate/datasets/ui.md
@@ -68,7 +68,7 @@ Once created, you can edit the dataset to add a description and define schemas.
 From the dataset detail page, click **Edit** to modify the dataset's configuration. The edit form has two sections:
 
 - **General**: Name and description.
-- **Schemas**: Define JSON schemas for inputs, expected outputs, and metadata. Use the **Generate schema** toggle to have Pydantic AI create schemas from a natural language description of your data shape.
+- **Schemas**: Define JSON schemas for inputs, expected outputs, and metadata. Use the **Generate schema** action to have Pydantic AI create schemas from a natural language description of your data shape.
 
 ## Managing Cases
 
@@ -134,11 +134,13 @@ This preserves a link back to the source trace, so you always know where a test
 From the dataset detail page, click **Export** to download the dataset in one of two formats:
 
 - **JSON**: Raw JSON representation of all cases.
-- **pydantic-evals**: A YAML format compatible with `pydantic_evals.Dataset.from_file()`.
+- **Python (pydantic-evals)**: JSON in the pydantic-evals-compatible `{name, cases, evaluators, report_evaluators}` shape, suitable for loading with `pydantic_evals.Dataset.from_dict()`.
 
 ## What's Next?
 
 Once you have cases in a dataset, you can:
 
+- Add dataset-level or report-level evaluators from the dataset detail page's **Evaluators** tab.
+
 - Run evaluations against it — see [Running Evaluations](evaluations.md).
 - View and compare experiment results — see [Evals: Datasets & Experiments](../../guides/web-ui/evals.md#viewing-experiments).
diff --git a/docs/gateway-migration.md b/docs/gateway-migration.md
@@ -3,19 +3,19 @@ title: "Migrating from Pydantic AI Gateway"
 description: "How to migrate from the legacy gateway.pydantic.dev to the AI Gateway on Pydantic Logfire."
 ---
 
-# Pydantic AI Gateway is Moving to Pydantic Logfire
+# Pydantic AI Gateway Has Moved to Pydantic Logfire
 
-We're consolidating the AI Gateway into Logfire. This means [gateway.pydantic.dev](https://gateway.pydantic.dev/) is being deprecated, and the gateway is now managed through your Logfire account.
+The AI Gateway has moved into Logfire. The legacy [gateway.pydantic.dev](https://gateway.pydantic.dev/) platform has reached end of life, and the gateway is now managed through your Logfire account.
 
 ## Shutdown Timeline
 
 | Date | Event |
 |------|-------|
-| **15 March 2026** | Self-service refunds available in the legacy gateway platform |
+| **15 March 2026** | Self-service refunds became available in the legacy gateway platform |
 | **13 April 2026 at 3pm UTC** | Legacy gateway fully shut down (end of life) |
-| **By end of April 2026** | Automatic refunds processed for any remaining balances |
+| **By end of April 2026** | Automatic refunds were processed for any remaining balances |
 
-**Please migrate before 13 April 2026.** If you need help, email us at [engineering@pydantic.dev](mailto:engineering@pydantic.dev).
+If you still need help after the shutdown, email us at [engineering@pydantic.dev](mailto:engineering@pydantic.dev).
 
 ## Why We Made This Change
 
@@ -32,9 +32,7 @@ Moving the gateway into Logfire unlocks a number of improvements:
 
 ### What happens to my current balance?
 
-From **15 March 2026**, you can request a refund of your remaining balance via the button in the [legacy gateway platform](https://gateway.pydantic.dev). The refund will be issued to the original payment method you used.
-
-If you do not request a refund manually, any outstanding credits will be refunded automatically before the end of April 2026.
+Self-service refunds became available on **15 March 2026** in the [legacy gateway platform](https://gateway.pydantic.dev). Any outstanding credits not refunded through a manual request were scheduled for automatic refund before the end of April 2026.
 
 ### Do I need to create a new account?
 

diff --git a/docs/guides/web-ui/alerts.md b/docs/guides/web-ui/alerts.md
@@ -10,8 +10,8 @@ With **Logfire**, use Alerts to notify you when certain conditions are met.
 
 Let's see in practice how to create an alert.
 
-1. Go to the **Alerts** tab in the left sidebar.
-2. Click the **Create alert** button.
+1. Go to **Notify** → **Alerts** in the left sidebar.
+2. Click the **New Alert** button.
 
 Then you'll see the following form:
 
@@ -34,10 +34,13 @@ WHERE
 1. The `SELECT ... FROM records` statement is the base query that will be executed. The **records** table contains the spans and logs data. `trace_id` links to the trace in the live view when viewing the alert run results in the web UI.
 2. The `attributes` field is a JSON field that contains additional information about the record. In this case, we're using the `http.route` attribute to filter the records by route.
 
-The **Time window** field allows you to specify the time range over which the query will be executed.
+Use the **Notifications** section to choose:
 
-The **Webhook URL** field is where you can specify a URL to which the alert will send a POST request when triggered.
-For now, **Logfire** alerts only send the requests in [Slack format].
+- **Include rows from**: the time window of data the query covers each time it runs.
+- **Run the query**: how often Logfire executes the query.
+- **Notify me when**: which result condition sends a notification.
+
+Select one or more notification channels for delivery. If you have not created a channel yet, go to **Notify** → **Delivery** → **Channels** and click **New channel**. For Slack, create a Slack incoming webhook and choose the Slack channel type.
 
 ??? tip "Get a Slack webhook URL"
     To get a Slack webhook URL, follow the instructions in the [Slack documentation](https://api.slack.com/messaging/webhooks).
@@ -84,14 +87,12 @@ Otherwise, you'll see the number of matches highlighted in orange.
 
 ![Alerts list with error](../../images/guide/browser-alerts-error.png)
 
-In this case, you'll also receive a notification in the Webhook URL you've set up.
+In this case, you'll also receive notifications in the channels you've selected.
 
 ## Edit an alert
 
-You can configure an alert by clicking on the **Configuration** button on the right side of the alert.
+You can configure an alert by opening it from the alerts list and clicking **Edit Alert**.
 
 ![Edit alert](../../images/guide/browser-alerts-edit.png)
 
 You can update the alert, or delete it by clicking the **Delete** button. If instead of deleting the alert, you want to disable it, you can click on the **Active** switch.
-
-[Slack format]: https://api.slack.com/reference/surfaces/formatting
diff --git a/docs/guides/web-ui/evals.md b/docs/guides/web-ui/evals.md
@@ -34,9 +34,10 @@ Click a dataset name to open its detail page. The page has tabs for:
 
 - **Experiments** --- all evaluation runs against this dataset
 - **Cases** --- test cases (editable for hosted datasets)
-- **Schema** --- input, output, and metadata schemas
+- **Evaluators** --- dataset-level and report-level evaluators for hosted datasets
+- **Schemas** --- input, output, and metadata schemas
 
-The header shows the dataset name, experiment count, case count, and aggregate pass rate. Use the **Export** button to download cases, the **Edit** button to modify the dataset, or the **`<> SDK`** button to view code snippets for working with this dataset programmatically.
+The header shows the dataset name, experiment count, case count, and aggregate pass rate. Use the **Export** button to download cases as JSON, the **Edit** button to modify the dataset, or the **Push dataset from code** snippet to view code for working with this dataset programmatically.
 
 If the dataset has no experiments yet, the empty state walks you through the setup: define your schema, add test cases, then run your first experiment from code.
 
@@ -51,6 +52,8 @@ Click any experiment row to see detailed results including:
 - **Performance metrics** --- duration, token usage, and custom scores
 - **Evaluation scores** --- detailed scoring from all evaluators
 
+Hosted datasets also include an **Evaluators** tab where you can manage dataset-level and report-level evaluators. To add evaluators to a specific case, edit that case from the **Cases** tab.
+
 ## Comparing Experiments
 
 To compare multiple runs side by side: