feat(allocation-policy): Per-org overrides for bytes-scanned policy #7975
Open
phacops wants to merge 4 commits into
…ngPolicy
Add two new Configuration entries on BytesScannedRejectingPolicy:
- organization_referrer_scan_limit_override, keyed by
(organization_id, referrer)
- organization_scan_limit_override, keyed by organization_id
Previously the only way to override the scan limit for the
organization branch was the per-referrer override that applied to
every organization, which made it impossible to tune the limit for a
single noisy org without affecting everyone else.
Overrides on the organization branch are now resolved in order of
specificity, with the first one set winning:
(organization_id, referrer)
> organization_id
> (all orgs, referrer)
> default
The project branch and cross-org behavior are unchanged.
Also fix pre-existing lint/mypy issues in the same files that the
pre-commit hook surfaces once the files are touched (E712 truthiness
asserts, untyped tenant_ids dicts, ResourceIdentifier arg type, and a
misplaced type: ignore exposed by ruff reformatting).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Agent transcript: https://claudescope.sentry.dev/share/xXAC6_vLoFmdoaI4mlEdxiC7CUhp2mohJV2MuJoaNPU
onewland
approved these changes
May 28, 2026
Add two new Configuration entries on BytesScannedRejectingPolicy that
forward a hard max_bytes_to_read value to ClickHouse and bypass the
sliding-window scan limit for the configured organization:
- organization_referrer_max_bytes_to_read, keyed by
(organization_id, referrer)
- organization_max_bytes_to_read, keyed by organization_id
When either is set the query runs at full threads with the configured
cap and the sliding window is not consulted; ClickHouse aborts the
query if it would scan more than the cap. (org_id, referrer) is more
specific and wins over org_id.
This complements the existing global limit_bytes_instead_of_rejecting
flow, which only caps queries after a tenant exceeds its scan limit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Agent transcript: https://claudescope.sentry.dev/share/K7ElxY0inzZzY1icw8PnMWaO4jjRKEoV3AthvEUGg7E
…ueries
Sentry queries usually carry both organization_id and project_id, and
the policy resolves those to the project_id branch. The new org cap was
checked before that resolution, so any project query with an
organization_id in tenant_ids would silently pick up the org cap and
bypass the project-level sliding-window limit.
Move the cap check after _get_customer_tenant_key_and_value() and gate
it on customer_tenant_key == "organization_id". Adds a regression test
covering the (org_id + project_id) shape.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Agent transcript: https://claudescope.sentry.dev/share/ZggQ4p_AlWKzjkr6xJZ5kCjchDzv_CTIpf-S8WXChVs
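The ordering problem described in this commit can be illustrated with a minimal sketch. It assumes that a query carrying `project_id` always resolves to the project branch; `resolve_customer_tenant` and `org_cap_for_query` are hypothetical names standing in for `_get_customer_tenant_key_and_value()` and the cap lookup, and the real method may handle more cases (for example cross-org queries).

```python
from typing import Dict, Optional, Tuple, Union

TenantIds = Dict[str, Union[str, int]]


def resolve_customer_tenant(tenant_ids: TenantIds) -> Tuple[Optional[str], Optional[int]]:
    """Hypothetical stand-in: project_id takes priority over organization_id."""
    if "project_id" in tenant_ids:
        return "project_id", int(tenant_ids["project_id"])
    if "organization_id" in tenant_ids:
        return "organization_id", int(tenant_ids["organization_id"])
    return None, None


def org_cap_for_query(tenant_ids: TenantIds, org_caps: Dict[int, int]) -> Optional[int]:
    """Return the org cap only when the query resolved to the organization branch."""
    tenant_key, tenant_value = resolve_customer_tenant(tenant_ids)
    # Gate on the resolved branch, not on the mere presence of organization_id.
    if tenant_key != "organization_id" or tenant_value is None:
        return None
    return org_caps.get(tenant_value)


caps = {1: 10_000_000}  # illustrative cap for organization 1
org_only = org_cap_for_query({"organization_id": 1, "referrer": "api.x"}, caps)
org_and_project = org_cap_for_query({"organization_id": 1, "project_id": 7}, caps)
```

With the gate in place, only the org-only query picks up the cap; the query that also carries `project_id` stays on the project-level sliding-window limit.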
`_get_quota_allowance` bypasses the sliding-window scan limit for
org-keyed queries that run under a per-org `max_bytes_to_read` cap, but
`_update_quota_balance` still recorded those queries' bytes_scanned
into the same window. If the cap is later removed, the window has
phantom usage from the capped period and queries get rejected against
quota they never consumed.
Mirror the bypass in `_update_quota_balance` and add a regression test.
Co-Authored-By: Claude <noreply@anthropic.com>
Agent transcript: https://claudescope.sentry.dev/share/tvrt5qSDhk3FhvibKVFT_XWjBOhbIdYwDfgHykOwGVw
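A minimal sketch of the mirrored bypass, under simplifying assumptions: the sliding window is modelled as a plain per-organization byte counter, and `record_bytes_scanned` is a hypothetical stand-in for the relevant part of `_update_quota_balance`, not its real signature.

```python
from collections import defaultdict
from typing import DefaultDict, Dict


def record_bytes_scanned(
    window_usage: DefaultDict[int, int],
    org_caps: Dict[int, int],
    customer_tenant_key: str,
    organization_id: int,
    bytes_scanned: int,
) -> None:
    """Record usage unless the query ran under a per-org max_bytes_to_read cap."""
    capped = customer_tenant_key == "organization_id" and organization_id in org_caps
    if capped:
        # The allowance check skipped the window for this query, so the
        # balance update skips it too; otherwise usage would accumulate
        # against quota the query never drew from.
        return
    window_usage[organization_id] += bytes_scanned


usage: DefaultDict[int, int] = defaultdict(int)
caps = {1: 10_000_000}  # organization 1 runs under a cap (illustrative value)

record_bytes_scanned(usage, caps, "organization_id", 1, 5_000)  # capped: not recorded
record_bytes_scanned(usage, caps, "organization_id", 2, 5_000)  # uncapped: recorded

caps.clear()  # the cap is removed later
record_bytes_scanned(usage, caps, "organization_id", 1, 300)  # now recorded
```

After the cap is removed, organization 1 starts from the usage it actually accrued against the window, not from bytes scanned while capped.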
Add per-organization controls to `BytesScannedRejectingPolicy` so individual noisy orgs can be tuned without affecting every other organization.
Per-org scan-limit overrides
Two new configs override the sliding-window scan limit on the organization branch:
- `organization_referrer_scan_limit_override`, keyed by `(organization_id, referrer)`
- `organization_scan_limit_override`, keyed by `organization_id`
Overrides are resolved in order of specificity, with the first one set winning:
(organization_id, referrer)
> organization_id
> (all orgs, referrer)
> default
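The resolution order on the organization branch (most specific first: `(organization_id, referrer)`, then `organization_id`, then the all-orgs per-referrer override, then the default) can be sketched roughly as follows. This is an illustration, not the policy's code: each config is modelled as a plain dict, and `all_orgs_referrer_overrides` is a placeholder name for the existing per-referrer override that applies to every organization.

```python
from typing import Dict, Tuple


def resolve_org_scan_limit(
    organization_id: int,
    referrer: str,
    org_referrer_overrides: Dict[Tuple[int, str], int],
    org_overrides: Dict[int, int],
    all_orgs_referrer_overrides: Dict[str, int],
    default_limit: int,
) -> int:
    """Return the first override that is set, checking the most specific first."""
    candidates = (
        org_referrer_overrides.get((organization_id, referrer)),
        org_overrides.get(organization_id),
        all_orgs_referrer_overrides.get(referrer),
    )
    for limit in candidates:
        if limit is not None:
            return limit
    return default_limit


# Illustrative values only.
org_referrer = {(1, "api.discover"): 900}
per_org = {1: 500}
per_referrer = {"api.discover": 200}

most_specific = resolve_org_scan_limit(1, "api.discover", org_referrer, per_org, per_referrer, 100)
org_level = resolve_org_scan_limit(1, "api.other", org_referrer, per_org, per_referrer, 100)
referrer_level = resolve_org_scan_limit(2, "api.discover", org_referrer, per_org, per_referrer, 100)
fallback = resolve_org_scan_limit(2, "api.other", org_referrer, per_org, per_referrer, 100)
```

Checking `is not None` instead of truthiness keeps an explicit override of `0` distinct from "not set"; whether the real policy treats `0` that way is an assumption of this sketch.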
Per-org max_bytes_to_read cap
Two more configs forward a hard `max_bytes_to_read` value to ClickHouse and bypass the sliding-window check entirely:
- `organization_referrer_max_bytes_to_read`, keyed by `(organization_id, referrer)`
- `organization_max_bytes_to_read`, keyed by `organization_id`
When set, the query is allowed to run at full threads with the cap applied; ClickHouse aborts it if it would scan more than the cap. `(org_id, referrer)` wins over `org_id`. This complements the existing global `limit_bytes_instead_of_rejecting` flow, which only caps queries after a tenant exceeds its scan limit.
The project branch and cross-org behavior are unchanged.
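As a rough sketch of the cap behavior described above: a cap, when set, short-circuits the sliding-window check and is forwarded as `max_bytes_to_read`. The `Allowance` dataclass, the dict-based config lookups, and the `check_sliding_window` callback are all assumptions made for illustration and do not mirror the policy's real types.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass(frozen=True)
class Allowance:
    can_run: bool
    max_threads: int
    max_bytes_to_read: Optional[int]  # None means no hard cap is forwarded


def resolve_cap(
    organization_id: int,
    referrer: str,
    org_referrer_caps: Dict[Tuple[int, str], int],
    org_caps: Dict[int, int],
) -> Optional[int]:
    """(organization_id, referrer) is more specific and wins over organization_id."""
    cap = org_referrer_caps.get((organization_id, referrer))
    if cap is not None:
        return cap
    return org_caps.get(organization_id)


def allowance_for(
    cap: Optional[int],
    max_threads: int,
    check_sliding_window: Callable[[], Allowance],
) -> Allowance:
    if cap is not None:
        # Capped queries run at full threads; the sliding window is not consulted.
        return Allowance(can_run=True, max_threads=max_threads, max_bytes_to_read=cap)
    return check_sliding_window()


window_calls = []


def fake_window_check() -> Allowance:
    # Stand-in for the sliding-window path; here it always rejects.
    window_calls.append(1)
    return Allowance(can_run=False, max_threads=0, max_bytes_to_read=None)


cap = resolve_cap(1, "api.x", {(1, "api.x"): 2_000}, {1: 5_000})
capped = allowance_for(cap, max_threads=10, check_sliding_window=fake_window_check)
uncapped = allowance_for(
    resolve_cap(2, "api.x", {}, {1: 5_000}),
    max_threads=10,
    check_sliding_window=fake_window_check,
)
```

In this sketch the window check runs exactly once, for the uncapped organization, which is the property the cap is meant to guarantee.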
Scope: which queries these levers affect
All four configs above only fire on the policy's organization branch, i.e. when `tenant_ids` carries `organization_id` and no `project_id`. That's typically cross-project work (org-wide Discover, "All Projects" views, subscription-style aggregations across an org).
Single-project queries carry both `organization_id` and `project_id` and resolve to the project branch, where none of these overrides fire. The only existing knobs on the project branch remain `project_referrer_scan_limit` (global default) and `referrer_all_projects_scan_limit_override` (global per referrer). A per-org override that fires on the project branch is not added here; if a noisy org's per-project traffic is the problem, that lever still needs to be designed.
Concrete usage, raising the cross-project quota for one big org past the current global:
- Set `organization_scan_limit_override` with params `{"organization_id": <id>}` to the higher byte value.
- `(org_id, referrer)` can still narrow that later via `organization_referrer_scan_limit_override` if a specific referrer needs a different number.
- `organization_max_bytes_to_read` is a cap, not a limit raise: it pins a hard ClickHouse `max_bytes_to_read` on every organization-branch query from that org and bypasses the sliding window. Use it to contain blast radius, not to grant more headroom.
Bug fix on top of the original approval
The Seer review on the first round flagged that the org-cap check ran before the policy resolved whether a query was project-keyed or org-keyed. Because Sentry usually sends both `organization_id` and `project_id`, a project-keyed query would have silently picked up the org cap and bypassed the project-level sliding-window rate limit. Fixed by moving the cap check below `_get_customer_tenant_key_and_value()` and gating it on `customer_tenant_key == "organization_id"`. Added a regression test (`test_org_caps_do_not_apply_to_project_queries`).
Drive-by fixes
Touching the file caused the pre-commit hook to flag pre-existing lint/mypy issues in the same files (E712 truthiness comparisons, untyped `tenant_ids` dicts, the `ResourceIdentifier` arg type, and a `# type: ignore` that drifted off the offending line after ruff reformatted it). These are fixed in the same PR because the hook would otherwise block the commit.