docs/website: Improve partial evaluation / data filtering documentation #8625
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -17,7 +17,7 @@ Not every construct is supported for every target. | |
| For a step-by-step walkthrough of evaluating a Rego policy _partially_, see [Evaluating a data filter policy](./partial-evaluation). | ||
| ::: | ||
|
|
||
| ## What is Partial Evaluation? | ||
| ## What is Partial Evaluation? {#what-is-partial-evaluation} | ||
|
|
||
| The translation of data policies into queries (like SQL WHERE clauses) is driven by _partial evaluation (PE)_ of a Rego query. | ||
|
|
||
|
|
@@ -31,7 +31,7 @@ When only _known_ values are used, **you can use all of Rego.** | |
|
|
||
| ## Example Preamble | ||
|
|
||
| In our running example, we'll assume a table `fruits` with columns `name`, `colour`, and `price`. These **unknown values** are represented with `input.<TABLE>.<COLUMN>` e.g. `input.fruits.name` | ||
| In our running example, we'll assume a table `fruits` with columns `name`, `colour`, and `price`. | ||
|
|
||
| ```mermaid | ||
| erDiagram | ||
|
|
@@ -42,7 +42,30 @@ erDiagram | |
| } | ||
| ``` | ||
|
|
||
| Our data filters also depend on user information. These **known values** are represented with `input.user` | ||
| ## Context data for Partial Evaluation | ||
|
|
||
| ### Unknowns: database rows | ||
|
|
||
| Database rows are **unknown** at policy evaluation time — OPA does not have access to the database. They are represented in Rego using the convention `input.<TABLE>.<COLUMN>`, e.g. `input.fruits.name` refers to the `name` column of the `fruits` table. | ||
|
|
||
| The **METADATA annotation** on the policy package declares which `input` paths are unknown. OPA uses this to know which parts of the policy to leave as conditions rather than evaluate: | ||
|
|
||
| ```rego title="policy.rego" | ||
| package filters | ||
|
|
||
| # METADATA | ||
|
Contributor
This metadata comment is in a bit of a strange place? Did you mean to use the package scope?
||
| # scope: document | ||
| # compile: | ||
| # unknowns: [input.fruits] | ||
|
|
||
| include if input.fruits.name == "banana" | ||
| ``` | ||
|
|
||
| With `input.fruits` declared as unknown, OPA will not try to resolve `input.fruits.name` during partial evaluation — instead it becomes a column reference in the output SQL. | ||
|
|
||
| ### Known values: request context | ||
|
|
||
| Our data filters also depend on user information. These **known values** are sent to OPA as input at query time and will be substituted during partial evaluation: | ||
|
|
||
| ```json | ||
| { | ||
|
|
@@ -53,6 +76,8 @@ Our data filters also depend on user information. These **known values** are rep | |
| } | ||
| ``` | ||
|
|
||
| They are referenced in the policy as `input.user`, e.g. `input.user.budget`. Because they are not listed in `unknowns`, OPA resolves them to their concrete values during partial evaluation. | ||
|
|
||
| ## Simple comparisons | ||
|
|
||
| The fragment supports simple comparisons, such as `==`, `!=`, `<`, `>`, `<=`, `>=`, between _unknown_ and _known_ values. | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -34,3 +34,70 @@ sequenceDiagram | |
| Database-->>Application: Filtered employees | ||
| Application-->>User: Filtered employees | ||
| ``` | ||
|
|
||
| ## A quick example | ||
|
|
||
| Consider an `employees` database table with salary information. The question is: **whose salaries can a Director see?** | ||
|
|
||
| The rule is: Directors may see the salaries of employees in their own department. When Alice (Engineering Director) lists employees, the highlighted rows are what she should see: | ||
|
|
||
| <table> | ||
|
Contributor
We can use markdown tables too, might be easier to read for future editors.
||
| <thead> | ||
| <tr><th>name</th><th>department</th><th>role</th><th>salary</th></tr> | ||
| </thead> | ||
| <tbody> | ||
| <tr style={{backgroundColor: 'var(--ifm-color-warning-contrast-background)'}}><td>Alice</td><td>engineering</td><td>director</td><td>130000</td></tr> | ||
| <tr style={{backgroundColor: 'var(--ifm-color-warning-contrast-background)'}}><td>Bob</td><td>engineering</td><td>engineer</td><td>90000</td></tr> | ||
| <tr style={{backgroundColor: 'var(--ifm-color-warning-contrast-background)'}}><td>Carol</td><td>engineering</td><td>engineer</td><td>85000</td></tr> | ||
| <tr style={{backgroundColor: 'transparent'}}><td>Dave</td><td>marketing</td><td>director</td><td>120000</td></tr> | ||
| <tr style={{backgroundColor: 'transparent'}}><td>Eve</td><td>marketing</td><td>manager</td><td>95000</td></tr> | ||
| </tbody> | ||
| </table> | ||
|
|
||
| OPA can be used to derive the needed SQL filter at run time, leveraging OPA's [partial evaluation](./filtering/partial-evaluation) feature. | ||
|
|
||
| **1. Input passed to OPA** | ||
|
|
||
| Alice is a _Director_ of the _Engineering_ department. The application sends her user context to OPA: | ||
|
|
||
| ```json title="input.json" | ||
| { | ||
| "user": { | ||
| "name": "Alice", | ||
| "role": "director", | ||
| "department": "engineering" | ||
| } | ||
| } | ||
| ``` | ||
|
|
||
| **2. OPA evaluates the policy** | ||
|
|
||
| ```rego title="policy.rego" | ||
| package authz | ||
|
|
||
| # METADATA | ||
| # scope: document | ||
| # compile: | ||
| # unknowns: [input.employees] | ||
|
|
||
| include if { | ||
| input.user.role == "director" # known: true for Alice, consumed | ||
| input.employees.department == input.user.department # ¹unknown == ²known → SQL condition | ||
| } | ||
| ``` | ||
|
|
||
| ¹ The value of `input.employees.department` is _unknown_ during partial policy evaluation — it refers to a table column in the database. | ||
|
Contributor
We don't tend to use this format for notes. I think a bullet list is fine, but we can also use [^1], [^2] etc.
||
|
|
||
| ² The value of `input.user.department` is known during partial policy evaluation — it resolves to the value `"engineering"` from the `input` document. | ||
|
|
||
| **3. OPA returns a SQL filter** | ||
|
|
||
| ```sql title="SQL filter for Alice" | ||
| WHERE employees.department = 'engineering' | ||
| ``` | ||
|
|
||
| **4. Application runs query** | ||
|
|
||
| The application can then query the database using this filter and process or display the returned data. | ||
|
|
||
| For a hands-on walkthrough, see the [SQL Data Filtering Tutorial](./filtering/tutorial-sql-filtering). | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,195 @@ | ||
| --- | ||
| title: "Tutorial: SQL Data Filtering" | ||
| sidebar_position: 6 | ||
| --- | ||
|
|
||
| This tutorial demonstrates end-to-end data filtering with OPA around a concrete question: **whose salaries can a Director see?** | ||
|
|
||
| You will write an authorization policy, use OPA's partial evaluation to derive a SQL `WHERE` clause, and apply that filter to a real database query. | ||
|
|
||
| ## Prerequisites | ||
|
|
||
| - [OPA installed](../#1-download-opa) | ||
| - [sqlite3](https://sqlite.org/index.html) (pre-installed on macOS and most Linux distributions) | ||
| - `curl` and `jq` | ||
|
|
||
| ## Steps | ||
|
|
||
| ### 1. Create and populate the database | ||
|
|
||
| We'll work with the following dataset: | ||
|
|
||
| | name | department | role | salary | | ||
| | ----- | ----------- | -------- | ------ | | ||
| | Alice | engineering | director | 130000 | | ||
| | Bob | engineering | engineer | 90000 | | ||
| | Carol | engineering | engineer | 85000 | | ||
| | Dave | marketing | director | 120000 | | ||
| | Eve | marketing | manager | 95000 | | ||
|
|
||
| Save the following SQL to a file named `employees.sql`: | ||
|
|
||
| ```sql title="employees.sql" | ||
| CREATE TABLE employees (name TEXT, department TEXT, role TEXT, salary INTEGER); | ||
| INSERT INTO employees VALUES ('Alice', 'engineering', 'director', 130000); | ||
| INSERT INTO employees VALUES ('Bob', 'engineering', 'engineer', 90000); | ||
| INSERT INTO employees VALUES ('Carol', 'engineering', 'engineer', 85000); | ||
| INSERT INTO employees VALUES ('Dave', 'marketing', 'director', 120000); | ||
| INSERT INTO employees VALUES ('Eve', 'marketing', 'manager', 95000); | ||
| ``` | ||
|
|
||
| Then create the database by loading that file: | ||
|
|
||
| ```shell | ||
| sqlite3 company.db < employees.sql | ||
| ``` | ||
|
|
||
| ### 2. Write the policy | ||
|
|
||
| The rule is: Directors may see the salaries of employees in their own department. | ||
|
|
||
| `input.employees` is declared as _unknown_ — it represents database rows that OPA has not seen yet. `input.user` is _known_ at query time and its values will be substituted during partial evaluation. | ||
|
|
||
| Save the following Rego code to a file named `policy.rego`: | ||
|
|
||
| ```rego title="policy.rego" | ||
| package authz | ||
|
|
||
| # METADATA | ||
| # scope: document | ||
| # compile: | ||
| # unknowns: [input.employees] | ||
|
|
||
| include if { | ||
| input.user.role == "director" | ||
| input.employees.department == input.user.department | ||
| } | ||
| ``` | ||
|
|
||
| ### 3. Start OPA | ||
|
|
||
| ```shell | ||
| opa run --server policy.rego | ||
| ``` | ||
|
|
||
| OPA is now listening on `http://localhost:8181`. | ||
|
|
||
| ### 4. Ask OPA for a SQL filter | ||
|
|
||
| In another terminal, call the compile endpoint with the logged-in user as input. Alice is a Director in Engineering: | ||
|
|
||
| ```shell | ||
| curl -s -X POST http://localhost:8181/v1/compile/authz/include \ | ||
| -H "Content-Type: application/json" \ | ||
| -H "Accept: application/vnd.opa.sql.sqlite+json" \ | ||
| -d '{"input": {"user": {"name": "alice", "role": "director", "department": "engineering"}}}' | ||
| ``` | ||
|
|
||
| OPA partially evaluates the policy: | ||
|
|
||
| - `input.user.role == "director"`: both sides are known and the condition is true, so it is consumed. | ||
| - `input.employees.department == input.user.department`: the left-hand side is unknown, so the known right-hand side (`"engineering"`) is substituted, yielding the SQL condition. | ||
|
|
||
| The response: | ||
|
|
||
| ```json | ||
| { | ||
| "result": { | ||
| "query": "WHERE employees.department = 'engineering'" | ||
| } | ||
| } | ||
| ``` | ||
|
|
||
| ### 5. Query the database | ||
|
|
||
| Extract the filter and use it in a SQL query: | ||
|
|
||
| ```shell | ||
| FILTER=$(curl -s -X POST http://localhost:8181/v1/compile/authz/include \ | ||
| -H "Content-Type: application/json" \ | ||
| -H "Accept: application/vnd.opa.sql.sqlite+json" \ | ||
| -d '{"input": {"user": {"name": "alice", "role": "director", "department": "engineering"}}}' \ | ||
| | jq -r '.result.query') | ||
|
|
||
| sqlite3 company.db "SELECT name, salary FROM employees $FILTER;" | ||
| ``` | ||
|
|
||
| Output — Alice sees all Engineering salaries: | ||
|
|
||
| | name | salary | | ||
| | ----- | ------ | | ||
| | Alice | 130000 | | ||
| | Bob | 90000 | | ||
| | Carol | 85000 | | ||
|
|
||
| Dave is a Director in Marketing, so he gets a different filter from the same policy: | ||
|
|
||
| ```shell | ||
| FILTER=$(curl -s -X POST http://localhost:8181/v1/compile/authz/include \ | ||
| -H "Content-Type: application/json" \ | ||
| -H "Accept: application/vnd.opa.sql.sqlite+json" \ | ||
| -d '{"input": {"user": {"name": "dave", "role": "director", "department": "marketing"}}}' \ | ||
| | jq -r '.result.query') | ||
|
|
||
| sqlite3 company.db "SELECT name, salary FROM employees $FILTER;" | ||
| ``` | ||
|
|
||
| Output — Dave sees all Marketing salaries: | ||
|
|
||
| | name | salary | | ||
| | ---- | ------ | | ||
| | Dave | 120000 | | ||
| | Eve | 95000 | | ||
|
|
||
| ### 6. Non-Directors are denied | ||
|
|
||
| Bob is an Engineer, not a Director. The `input.user.role == "director"` condition is known and false, so no rule body can ever be satisfied — the policy unconditionally denies: | ||
|
|
||
| ```shell | ||
| curl -s -X POST http://localhost:8181/v1/compile/authz/include \ | ||
| -H "Content-Type: application/json" \ | ||
| -H "Accept: application/vnd.opa.sql.sqlite+json" \ | ||
| -d '{"input": {"user": {"name": "bob", "role": "engineer", "department": "engineering"}}}' | ||
| ``` | ||
|
|
||
| Response — the `query` key is absent: | ||
|
|
||
| ```json | ||
| {} | ||
| ``` | ||
|
|
||
| An absent `query` means unconditional deny. The application should return zero rows without issuing a database query. | ||
|
|
||
| :::warning Ensure safe defaults | ||
| OPA returns the filter; it does not enforce it. The application is responsible for applying it as intended. | ||
|
|
||
| In this example, if the user is not a Director, no rule body can be satisfied and OPA returns an unconditional deny, represented as a missing `query` key in the result. The application should then return zero rows. | ||
| ::: | ||
|
|
||
| ## What partial evaluation did | ||
|
|
||
| OPA evaluated the policy with `input.user` fully known. The expressions that involved only known values (`input.user.role == "director"`) were fully evaluated and consumed — they do not appear in the output. Only expressions involving the unknown `input.employees` survived as residual conditions, which OPA then translated into SQL. | ||
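To make the idea of "consumed" and "residual" conditions concrete, here is a toy sketch in Python. It is not OPA's algorithm or API: the `Ref` class, the `partial_eval` function, and the restriction to equality conditions are all assumptions made for illustration. It only mimics the behaviour described above for this one policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """Hypothetical stand-in for a Rego reference such as input.user.role."""
    path: str


UNKNOWN_PREFIX = "input.employees."  # mirrors `unknowns: [input.employees]`


def resolve(operand, known_input):
    """Return (is_known, value); unknown refs come back as SQL column names."""
    if not isinstance(operand, Ref):
        return True, operand  # a literal is always known
    if operand.path.startswith(UNKNOWN_PREFIX):
        return False, operand.path[len("input."):]  # e.g. employees.department
    value = known_input
    for key in operand.path.split(".")[1:]:  # skip the leading "input"
        value = value[key]
    return True, value


def render(is_known, value):
    """Render a column name as-is and a known string as a quoted SQL literal."""
    if not is_known:
        return value
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"
    return str(value)


def partial_eval(conditions, known_input):
    """Return a WHERE clause, "" for unconditional allow, or None for deny."""
    residual = []
    for left, right in conditions:
        l_known, l_val = resolve(left, known_input)
        r_known, r_val = resolve(right, known_input)
        if l_known and r_known:
            if l_val != r_val:
                return None  # a known condition is false: nothing can match
            continue  # known and true: consumed, leaves no trace in the output
        residual.append(f"{render(l_known, l_val)} = {render(r_known, r_val)}")
    return "WHERE " + " AND ".join(residual) if residual else ""


# The two conditions of the tutorial's `include` rule.
POLICY = [
    (Ref("input.user.role"), "director"),
    (Ref("input.employees.department"), Ref("input.user.department")),
]

alice = {"user": {"name": "Alice", "role": "director", "department": "engineering"}}
bob = {"user": {"name": "Bob", "role": "engineer", "department": "engineering"}}

print(partial_eval(POLICY, alice))  # WHERE employees.department = 'engineering'
print(partial_eval(POLICY, bob))    # None
```

The real implementation handles far more of Rego than equality checks; the point here is only that known conditions disappear and unknown ones remain.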
|
|
||
| The application never needs to know _how_ the policy decides which salaries are visible. It sends user context and receives a SQL filter (or a deny) to act on. | ||
|
|
||
| ## Handling unconditional results | ||
|
|
||
| | OPA response | Meaning | Application action | | ||
| | -------------------------- | ------------------- | ---------------------------- | | ||
| | `{ "query": "WHERE ..." }` | Conditional allow | Append filter to SQL query | | ||
| | `{ "query": "" }` | Unconditional allow | Run query with no `WHERE` | | ||
| | `{}` | Unconditional deny | Return zero rows, skip query | | ||
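The three cases in the table could be handled in application code roughly as follows. This is a minimal sketch using Python's standard `sqlite3` module, not an official SDK: `fetch_salaries` is a hypothetical helper, and it assumes the caller has already extracted the object that holds the `query` key from the HTTP response.

```python
import sqlite3


def fetch_salaries(conn, compile_result):
    """Apply a compile result (the object holding `query`) to the employees query."""
    # Missing key: unconditional deny. Return zero rows without touching the database.
    if "query" not in compile_result:
        return []
    # Empty string: unconditional allow. Otherwise the value is a WHERE clause.
    sql = "SELECT name, salary FROM employees " + compile_result["query"] + " ORDER BY name"
    return conn.execute(sql).fetchall()


# In-memory copy of the tutorial's dataset, so the sketch runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, role TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [
        ("Alice", "engineering", "director", 130000),
        ("Bob", "engineering", "engineer", 90000),
        ("Carol", "engineering", "engineer", 85000),
        ("Dave", "marketing", "director", 120000),
        ("Eve", "marketing", "manager", 95000),
    ],
)

conditional = {"query": "WHERE employees.department = 'engineering'"}
print(fetch_salaries(conn, conditional))       # [('Alice', 130000), ('Bob', 90000), ('Carol', 85000)]
print(len(fetch_salaries(conn, {"query": ""})))  # 5
print(fetch_salaries(conn, {}))                # []
```

Checking for the missing key first keeps the deny path from ever reaching the database, which matches the "skip query" action in the table.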
|
|
||
| ## Clean up | ||
|
|
||
| Stop the OPA server with `Ctrl+C` in the terminal where it is running, then remove the files created during this tutorial: | ||
|
|
||
| ```shell | ||
| rm employees.sql policy.rego company.db | ||
| ``` | ||
|
|
||
| ## Next steps | ||
|
|
||
| - [Evaluating a Data Filter Policy](./partial-evaluation) — a step-by-step walkthrough of partial evaluation | ||
| - [Writing valid Data Filtering Policies](./fragment) — which Rego constructs are supported as filter conditions | ||
| - [Language SDKs](/ecosystem#languages) — in a production setup, using a language SDK is recommended over raw `curl` calls. The ecosystem page lists SDKs for Go, Java, Python, JavaScript, and more, all of which provide typed clients for the compile API used in this tutorial. |
Unsure if this was needed? The generated fragment looks to be the same?