planner: support pushing Limit and TopN to individual partial paths of IndexMerge#68772
Conversation
…f IndexMerge using SortItemsHints Introduce SortItemsHints on PhysicalProperty so that when TopN sits directly above a DataSource, the ORDER BY columns can be passed as soft hints. DataSource uses these hints when selecting alternatives for OR-type IndexMerge, preferring partial paths that can satisfy the sort order. When not all partial paths satisfy the hints, Limit is pushed to the ordered ones while TopN is pushed to the rest. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
@time-and-fate I've received your pull request and will start the review. I'll conduct a thorough review covering code quality, potential issues, and implementation details. ⏳ This process typically takes 10-30 minutes depending on the complexity of the changes. ℹ️ Learn more details on Pantheon AI. |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds a soft sort-hints flow: TopN extracts simple-column sort hints into PhysicalProperty.SortItemsHints; the index-merge matcher records per-partial-path hint satisfaction on candidatePath and CopTask; attach2Task4PhysicalTopN uses that info to push Limit to hint-satisfying partial readers and TopN to others. Tests and expected outputs updated. ChangesIndex-Merge Soft Sort Hints
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/planner/core/operator/physicalop/task_base.go (1)
462-495:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winUpdate MemoryUsage to account for the new IdxMergePartPlansSatisfySortHints field.
The
MemoryUsage()method does not account for the newly addedIdxMergePartPlansSatisfySortHintsslice field. This causes the memory usage to be underreported.📊 Proposed fix to update memory calculation
- sum = size.SizeOfInterface*(2+int64(cap(t.IdxMergePartPlans)+cap(t.RootTaskConds))) + size.SizeOfBool*3 + size.SizeOfUint64 + - size.SizeOfPointer*(3+int64(cap(t.CommonHandleCols)+cap(t.TblCols))) + size.SizeOfSlice*4 + t.PhysPlanPartInfo.MemoryUsage() + sum = size.SizeOfInterface*(2+int64(cap(t.IdxMergePartPlans)+cap(t.RootTaskConds))) + size.SizeOfBool*3 + size.SizeOfUint64 + + size.SizeOfPointer*(3+int64(cap(t.CommonHandleCols)+cap(t.TblCols))) + size.SizeOfSlice*5 + + size.SizeOfBool*int64(cap(t.IdxMergePartPlansSatisfySortHints)) + t.PhysPlanPartInfo.MemoryUsage()🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/planner/core/operator/physicalop/task_base.go` around lines 462 - 495, The MemoryUsage method on CopTask omits the new IdxMergePartPlansSatisfySortHints slice; update CopTask.MemoryUsage to include the slice header and elements: add the slice header size (size.SizeOfSlice) and account for its capacity in the initial capacity sum (similar to how IdxMergePartPlans or RootTaskConds are counted), and then iterate over t.IdxMergePartPlansSatisfySortHints to add each element's memory (call element.MemoryUsage() if elements are structs with MemoryUsage, otherwise add the appropriate primitive size such as size.SizeOfBool). Ensure the field name IdxMergePartPlansSatisfySortHints and function CopTask.MemoryUsage are the referenced locations to modify.
🧹 Nitpick comments (1)
pkg/planner/core/operator/physicalop/task_base.go (1)
415-418: 💤 Low valueConsider enforcing the documented length invariant.
The documentation states "Length equals len(IdxMergePartPlans)," but downstream code in
task.gouses defensive bounds checking (i < len(copTask.IdxMergePartPlansSatisfySortHints)) before accessing elements. This suggests the invariant may not always hold or is not enforced at construction time.Consider either:
- Adding a check when the field is set to ensure lengths match, or
- Updating the documentation to clarify that the lengths may differ in edge cases
Based on context snippet 3 from pkg/planner/core/task.go:1458-1494, which uses bounds-checked access patterns.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/planner/core/operator/physicalop/task_base.go` around lines 415 - 418, The field IdxMergePartPlansSatisfySortHints claims its length equals len(IdxMergePartPlans) but callers (e.g., task.go using copTask.IdxMergePartPlansSatisfySortHints) defensively check bounds; enforce the invariant at assignment: in convertToIndexMergeScan where IdxMergePartPlansSatisfySortHints is set, ensure its slice is resized to exactly len(IdxMergePartPlans) (truncate if longer, append false values if shorter) so downstream code can assume equal length; alternatively, if a resize is inappropriate, update the documentation comment on IdxMergePartPlansSatisfySortHints to state lengths may differ and callers must bounds-check.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/planner/core/task.go`:
- Around line 1459-1472: When creating the pushedDownLimit
(physicalop.PhysicalLimit) for hint-satisfied partial plans in the loop over
copTask.IdxMergePartPlans, preserve the partitioning by copying the PartitionBy
from the source partialPlan into pushedDownLimit (e.g. call the appropriate
setter or assign PartitionBy using
partialPlan.PartitionBy()/partialPlan.SchemaPartition info) before assigning it
into copTask.IdxMergePartPlans[i]; apply the same copy to the other similar
branch around the lines handling root TopN for partitioned flows (the other
pushedDownLimit creation at lines ~1488-1490) so partition semantics are not
lost.
---
Outside diff comments:
In `@pkg/planner/core/operator/physicalop/task_base.go`:
- Around line 462-495: The MemoryUsage method on CopTask omits the new
IdxMergePartPlansSatisfySortHints slice; update CopTask.MemoryUsage to include
the slice header and elements: add the slice header size (size.SizeOfSlice) and
account for its capacity in the initial capacity sum (similar to how
IdxMergePartPlans or RootTaskConds are counted), and then iterate over
t.IdxMergePartPlansSatisfySortHints to add each element's memory (call
element.MemoryUsage() if elements are structs with MemoryUsage, otherwise add
the appropriate primitive size such as size.SizeOfBool). Ensure the field name
IdxMergePartPlansSatisfySortHints and function CopTask.MemoryUsage are the
referenced locations to modify.
---
Nitpick comments:
In `@pkg/planner/core/operator/physicalop/task_base.go`:
- Around line 415-418: The field IdxMergePartPlansSatisfySortHints claims its
length equals len(IdxMergePartPlans) but callers (e.g., task.go using
copTask.IdxMergePartPlansSatisfySortHints) defensively check bounds; enforce the
invariant at assignment: in convertToIndexMergeScan where
IdxMergePartPlansSatisfySortHints is set, ensure its slice is resized to exactly
len(IdxMergePartPlans) (truncate if longer, append false values if shorter) so
downstream code can assume equal length; alternatively, if a resize is
inappropriate, update the documentation comment on
IdxMergePartPlansSatisfySortHints to state lengths may differ and callers must
bounds-check.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: d923f04a-b5d2-4843-805b-26c9d1562ad3
📒 Files selected for processing (7)
pkg/planner/core/find_best_task.gopkg/planner/core/operator/physicalop/physical_topn.gopkg/planner/core/operator/physicalop/task_base.gopkg/planner/core/task.gopkg/planner/property/physical_property.gotests/integrationtest/r/planner/core/indexmerge_path.resulttests/integrationtest/t/planner/core/indexmerge_path.test
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #68772 +/- ##
================================================
- Coverage 76.3085% 75.3037% -1.0049%
================================================
Files 2041 2025 -16
Lines 563262 567579 +4317
================================================
- Hits 429817 427408 -2409
- Misses 132529 140138 +7609
+ Partials 916 33 -883
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
winoros
left a comment
There was a problem hiding this comment.
No more comments, except for the mentioned naming issue.
…rtialPathMatchResults
|
/retest |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: qw4990, winoros The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
In response to a cherrypick label: new pull request created to branch |
What problem does this PR solve?
Issue Number: ref #65712 close #68773
Problem Summary:
For queries like
SELECT * FROM t WHERE (a = 1 OR b > 5) ORDER BY c LIMIT 5, where indexesidx_ac(a, c)andidx_bc(b, c)are available:a = 1partial path onidx_accan satisfyORDER BY c(single value on the leading columna).b > 5partial path onidx_bccannot satisfyORDER BY c(range scan on the leading columnb).Previously, if not all partial paths of an IndexMerge could satisfy the
ORDER BY, the plan would fall back to a globalTopNwithout any pushdown, scanning all matching rows before sorting.This is Solution 2 described in the issue.
What changed and how does it work?
SortItemsHintsis added toPhysicalProperty. WhenTopNsits directly above aDataSource, theORDER BYcolumns are passed as advisory sort items through the property.DataSourceuses these advisory sort items when selecting alternatives for OR-type IndexMerge, preferring partial paths that can satisfy the sort order.Limitis pushed to the matching ones whileTopNis pushed to the rest.Key changes:
PhysicalProperty: newSortItemsHintsfield for advisory sort preferences (pkg/planner/property/physical_property.go)getPhysTopN: setsSortItemsHintswhen the child is aDataSource(pkg/planner/core/operator/physicalop/physical_topn.go)candidatePath: newmatchWithAdvisorySortItemsflag (true when matching used SortItemsHints) andpartialPathMatchResultsfield storing each partial path'smatchPropertyresult, replacing the simpler boolean satisfaction slice (pkg/planner/core/find_best_task.go)CopTask: newIdxMergeMatchWithAdvisorySortItemsflag andIdxMergePartPlansMatchResultsfield storing each partial path'smatchPropertyresult (pkg/planner/core/operator/physicalop/task_base.go)matchPropForIndexMergeAlternatives/isMatchPropForIndexMerge: collect per-pathmatchPropertyresults inline during matching rather than recomputing afterwards (pkg/planner/core/find_best_task.go)convertToIndexMergeScan: passes each partial path's own match result (and the effective prop — hints-based or original) when building partial scans (pkg/planner/core/find_best_task.go)attach2Task4PhysicalTopN: newhandleAdvisorySortItemsForIndexMergethat pushesLimitto satisfying partial paths andTopNto others. Only applies when some (but not all) paths satisfy, since the existingLimitpushdown handles the all-satisfying case. (pkg/planner/core/task.go)Check List
Tests
Side effects
Documentation
Release note