[Bug] ESQL Remote Validation Data Stream and Patch Version Validation#6251
Bug - Guidelines
These guidelines serve as a reminder set of considerations when addressing a bug in the code.
Documentation and Context
Code Standards and Practices
Testing
Additional Checks
# latest patch a package gates on for the minor, i.e. the stack patch needed to receive the most
# up-to-date integration package on that minor. Scan each package once and track the newest
# matching package manifest.
manifests = load_integrations_manifests()
`load_integrations_manifests()` is called on every invocation, and it's called in a `for version in get_stack_versions()` loop in `rule_validators.py`.
Can we cache this somewhere? If the function isn't memoized, this is operationally heavy: it repeats the same load for every version. Caching or otherwise reducing the manifest loads would be a good idea.
This matters because we have seen the manifest grow large as integration versions accumulate, especially for AWS and Azure.
Unfortunately, no, we cannot cache them directly, because each rule's requirements (fields, versions, etc.) are potentially different. Given this, we need to evaluate them at a per-rule level.
Granted, we could build a hash map as an optimization, so that a rule passing the exact same integration info as an earlier one does not need to be computed again. But the goal of this PR was to take a less complex approach first, de-duplicate with #6208, and then polish as needed.
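For reference, the hash-map idea could be sketched roughly as below. This is only an illustration under assumptions: `resolve_integration_info` is a hypothetical stand-in for the expensive per-rule resolution, not a function in this repository, and the cache key is assumed to be the rule's exact set of (package, data stream) pairs plus the stack version.

```python
from functools import lru_cache

# Counts real computations so the effect of the cache is visible (illustration only).
CALLS = {"count": 0}


@lru_cache(maxsize=None)
def resolve_integration_info(integrations: frozenset, stack_version: str) -> str:
    """Hypothetical stand-in for the expensive per-rule integration resolution."""
    CALLS["count"] += 1
    names = ",".join(sorted(f"{pkg}:{stream}" for pkg, stream in integrations))
    return f"{names}@{stack_version}"


def cache_key(pairs) -> frozenset:
    """Build a hashable key from a rule's (package, data stream) pairs."""
    return frozenset(pairs)


# Two different rules that pass the exact same integration info.
rule_a = cache_key([("azure", "aadgraphactivitylogs")])
rule_b = cache_key([("azure", "aadgraphactivitylogs")])

first = resolve_integration_info(rule_a, "8.19.10")
second = resolve_integration_info(rule_b, "8.19.10")  # served from the cache
```

Rules whose integration info differs in any way get their own cache entry, so per-rule evaluation is preserved; only exact repeats are skipped.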
++ Agreed. We should revisit these optimisations in the future; they will greatly improve execution times.
# the data streams present per package version, so use them to skip versions that predate the
# integration. Only filter when schema data exists for a version, otherwise fall back to kibana
# compatibility alone (e.g. for synthetic manifests in tests).
package_schemas = load_integrations_schemas().get(package, {})
Yes, we are tracking this 👍
++ I'm going to update the other PR after this lands
)
mappings_lookup[version] = combined_mappings

for version, mapping in mappings_lookup.items():
The AAD Graph rules (if not min-stacked) should still fail here somehow?
In pseudocode it would be along the lines of:
- If a patch floor exists, get M.M
  - If M.M is in the `get_stack_versions()` lookup
    - Find the min-stack floor for the rule
    - If there is no min-stack, or the min-stack floor < M.M
      - Error and suggest min-stacking to the earliest M.M.0 available.
We still don't want to push a rule to an 8.19.4 stack that cannot pull integration version 1.37.0 because it's gated at 8.19.10, do we? We still have to min-stack the rule to the earliest M.M.0, just based on our release versioning.
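The pseudocode above could be sketched in Python roughly as follows. This is a sketch of the proposal only, not code from the PR: `check_min_stack` and `parse` are hypothetical helpers, versions are assumed to be plain `MAJOR.MINOR.PATCH` strings, and `stack_versions` stands in for whatever `get_stack_versions()` returns.

```python
from typing import List, Optional, Tuple


def parse(version: str) -> Tuple[int, ...]:
    """Parse a plain MAJOR.MINOR.PATCH string into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def check_min_stack(
    patch_floor: Optional[str],
    stack_versions: List[str],
    min_stack: Optional[str],
) -> Optional[str]:
    """Return an error message if the rule should be min-stacked, else None."""
    if patch_floor is None:
        return None  # no integration gates on a later patch
    major_minor = parse(patch_floor)[:2]
    supported = {parse(v)[:2] for v in stack_versions}
    if major_minor not in supported:
        return None  # that minor is not part of the sweep
    if min_stack is None or parse(min_stack)[:2] < major_minor:
        suggestion = f"{major_minor[0]}.{major_minor[1]}.0"
        return f"min_stack_version should be at least {suggestion}"
    return None


error = check_min_stack("8.19.10", ["8.19.0", "9.4.0"], None)
```

With no min-stack set and a patch floor of `8.19.10`, the check suggests min-stacking to `8.19.0`, the earliest `M.M.0` for that minor.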
I think we do want to push the rule to 8.19.4 even though the integration version requires 8.19.10. 8.19.* is still supported (including lower patch versions), so customers will still get updates, and their stack does support the rule if they choose to update.
In this way, if a customer is on 8.19.9 for example and they have the azure integration with version 1.34.1 installed, they will see the available update in the stack.

They will see a similar info/warning in the rule that the datastream/integration is either not installed (but available) or that it needs to be updated (only if it is installed), which is the desired flow for rules with integration updates.

As this warning shows there is no data (which is the intent)

Not shipping the rule would be a change to how we ship rules, as we do ship rules that require customers to update their stack patch versions and/or integration versions.
Makes sense if the warning is that explicit.
Pull request overview
This PR fixes ES|QL remote validation and related_integrations version resolution by ensuring integration package selection accounts for whether a referenced data stream actually exists in a given package version, and by inferring an appropriate stack patch per minor when integrations gate features behind later patches.
Changes:
- Infer the correct stack patch per minor based on the rule’s referenced integrations during ES|QL stack-version sweeps.
- Add `find_latest_integration_patch_for_minor()` to derive the patch floor implied by integration manifests’ Kibana version conditions.
- Update `find_least_compatible_version()` to skip package versions that predate the requested integration/data stream when schema data is available.
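As a rough illustration of the patch-floor idea, the sketch below scans only the rule's packages and, for each, takes the patch that the newest package version gates on for a given minor. It is not the PR's implementation: `latest_patch_for_minor` is a hypothetical name, the manifest shape (`conditions.kibana.version` per package version) is assumed, the sample conditions are invented, and the range string is only scanned for version numbers rather than parsed as a real semver range.

```python
import re
from typing import Dict, Iterable

SEMVER = re.compile(r"(\d+)\.(\d+)\.(\d+)")


def latest_patch_for_minor(
    manifests: Dict[str, dict], packages: Iterable[str], major: int, minor: int
) -> str:
    """Return MAJOR.MINOR.PATCH where PATCH is the highest patch the packages gate on."""
    floor = 0
    for package in packages:
        newest = None  # (package version tuple, patch) of the newest matching manifest
        for pkg_version, manifest in manifests.get(package, {}).items():
            condition = manifest.get("conditions", {}).get("kibana", {}).get("version", "")
            for match in SEMVER.finditer(condition):
                found_major, found_minor, patch = map(int, match.groups())
                if (found_major, found_minor) != (major, minor):
                    continue
                key = tuple(int(part) for part in pkg_version.split("."))
                if newest is None or key > newest[0]:
                    newest = (key, patch)
        if newest is not None:
            floor = max(floor, newest[1])
    return f"{major}.{minor}.{floor}"


# Invented sample data, shaped like the assumed manifest structure.
sample_manifests = {
    "azure": {
        "1.34.1": {"conditions": {"kibana": {"version": "^8.19.0 || ^9.1.0"}}},
        "1.37.0": {"conditions": {"kibana": {"version": "~8.19.10 || ^9.4.0"}}},
    }
}

inferred_8_19 = latest_patch_for_minor(sample_manifests, ["azure"], 8, 19)
inferred_9_4 = latest_patch_for_minor(sample_manifests, ["azure"], 9, 4)
```

On this sample data the 8.19 minor is lifted to patch 10 while 9.4 stays at patch 0. Packages the rule does not reference are never inspected.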
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| pyproject.toml | Bumps project version to 1.6.48. |
| detection_rules/rule_validators.py | Updates ES |
| detection_rules/integrations.py | Adds patch-floor inference helper and enhances least-compatible version selection using per-version integration schema presence. |
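The schema-presence filtering described for `detection_rules/integrations.py` can be pictured with the simplified sketch below. It assumes the Kibana compatibility check has already produced a list of compatible package versions, and that schemas map each package version to the data streams it contains. `least_version_with_data_stream` is a hypothetical helper, and the sample data is invented.

```python
from typing import Dict, List


def version_key(version: str) -> tuple:
    """Sort key for plain MAJOR.MINOR.PATCH strings."""
    return tuple(int(part) for part in version.split("."))


def least_version_with_data_stream(
    compatible_versions: List[str],
    package_schemas: Dict[str, dict],
    integration: str,
) -> str:
    """Pick the lowest compatible package version that contains the data stream."""
    for version in sorted(compatible_versions, key=version_key):
        schema = package_schemas.get(version)
        # Only filter when schema data exists for this version; otherwise
        # fall back to Kibana compatibility alone.
        if schema is not None and integration not in schema:
            continue  # this package version predates the data stream
        return f"^{version}"
    raise ValueError(f"no compatible package version contains {integration}")


# Invented sample data for illustration.
compatible = ["1.0.0", "1.34.1", "1.37.0"]
schemas = {
    "1.0.0": {"activitylogs": {}},
    "1.34.1": {"activitylogs": {}},
    "1.37.0": {"activitylogs": {}, "aadgraphactivitylogs": {}},
}

resolved = least_version_with_data_stream(compatible, schemas, "aadgraphactivitylogs")
fallback = least_version_with_data_stream(compatible, {}, "aadgraphactivitylogs")
```

With schema data present, the older versions are skipped and the result is `^1.37.0`; with no schema data at all, the selection falls back to the lowest compatible version.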
shashank-elastic left a comment
The fix is correct and well-motivated with good evidence. Concerns around PR #6208 are being addressed. This is good to go 🚀
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
…detection-rules into patch_version_to_latest
…ranges Merge main (#6251) and port data-stream schema filtering into find_compatible_version_range instead of find_least_compatible_version. Keep find_latest_integration_patch_for_minor for ES|QL validation. Integration-specific export uses schema walk plus schema-floor fallback for streams gated behind newer package versions (e.g. aadgraphactivitylogs).
Pull Request
Issue link(s):
Summary - What I changed
Fixes integration version resolution that ignored whether a referenced data stream actually exists in the resolved package, causing ES|QL remote validation and `related_integrations` to use the wrong package version. This surfaced on rules using `logs-azure.aadgraphactivitylogs-*` (added in azure `1.37.0`), which resolved against older packages that predate the data stream. Please see the related issue for more details.
Changes by file:
- `detection_rules/integrations.py` / `find_latest_integration_patch_for_minor()`: The `stack-schema-map.yaml` keys stacks at `MAJOR.MINOR.0`, but an integration may gate its latest package behind a later patch (e.g. azure `~8.19.10`). Validating at the literal `.0` resolves an older package that predates the data stream. This helper reads the integration manifests and returns the latest patch the rule's own integrations gate on for a given minor, so the inferred stack version reflects what a customer on that minor would actually receive. Only the rule's packages are inspected, not the full manifest.
- `detection_rules/rule_validators.py` / `ESQLValidator.remote_validate_rule()`: The stack-version sweep now infers the patch per minor from the rule's integrations (e.g. `8.19.0` -> `8.19.10`, `9.4.0` stays `9.4.0`) instead of using the literal `.0`, so each minor resolves the up-to-date package based on whether or not this is needed for the integration versions.
- `detection_rules/integrations.py` / `find_least_compatible_version()`: Previously this code decided compatibility from the package-level `conditions.kibana.version` only and never checked whether the requested integration/data stream existed in that version (the `integration` argument was used only in the error message). It now consults the integration schemas and skips versions that predate the data stream, so e.g. `azure:aadgraphactivitylogs @ 8.19.10` resolves to `^1.37.0` instead of `^1.0.0`. It falls back to kibana-compatibility alone when no schema data exists for a version.
How To Test
- `new-rule/azure-ad-graph-potential-roadrecon-enum`
- `python -m detection_rules view-rule rules/integrations/azure/discovery_aad_graph_roadrecon_aiohttp_enumeration.toml --esql-remote-validation`
- `8.19.0`, resolves azure `1.34.1` (no `aadgraphactivitylogs`), and fails with:
After the fix: the related integrations populate correctly from the datastream, with the correct version of the package that is validated on the correct patch version of the stack.
ES|QL Rule testing results
fr_esql_validation.txt
Backport tests
Note: if you reproduce these manually, you will need to set an `env GITHUB_EVENT_NAME=push` to properly duplicate the results.
Note: When reviewing it will be necessary to have a remote stack setup for ES|QL validation. One must also verify that unit tests pass with this remote validation, as unit tests on the PR will not have the remote validation set since there are no ES|QL rule changes here.
Remote testing passed
Checklist
- `bug`, `enhancement`, `schema`, `maintenance`, `Rule: New`, `Rule: Deprecation`, `Rule: Tuning`, `Hunt: New`, or `Hunt: Tuning` so guidelines can be generated
- `meta:rapid-merge` label if planning to merge within 24 hours
Contributor checklist