planner, executor: support merge sort for IN conditions in IndexMerge partial paths (#67771)#68753
Conversation
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
|
@time-and-fate This PR has conflicts, I have hold it. |
|
@ti-chi-bot: ## If you want to know how to resolve it, please read the guide in TiDB Dev Guide. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository. |
ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (3)
✅ Files skipped from review due to trivial changes (1)
📝 WalkthroughWalkthroughThis PR centralizes per-worker KV ranges into ChangesIndexMerge merge-sort support for IN conditions
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.12.2)Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
tests/integrationtest/r/planner/core/casetest/physicalplantest/physical_plan.result (1)
3566-3627:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winResolve both unresolved merge-conflict blocks before merge.
Line 3566 and Line 3641 still contain
<<<<<<<,=======, and>>>>>>>blocks. This leaves the golden result file invalid and will break integration-result verification. Keep one expected branch (the newplan_treeexpectations) and remove conflict markers plus discarded branch content in both regions.🧩 Minimal conflict-resolution shape
-<<<<<<< HEAD -...old expected output... -======= ...chosen expected output... ->>>>>>> 173eaf24799 (planner, executor: support merge sort for IN conditions in IndexMerge partial paths (`#67771`))Also applies to: 3641-3681
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/integrationtest/r/planner/core/casetest/physicalplantest/physical_plan.result` around lines 3566 - 3627, The file still contains unresolved git conflict markers (<<<<<<<, =======, >>>>>>>) around two blocks; open the region containing the alternative "explain format = 'brief' ..." vs "explain format = 'plan_tree' ..." outputs and remove the conflict markers and the discarded 'brief' branch content, keeping the new plan_tree expectations (the blocks that show "explain format = 'plan_tree' ...", "Projection", "IndexMerge", "Limit(Build)", "IndexRangeScan", "TableRowIDScan") for both conflict regions so the golden result contains only the plan_tree expectations and no leftover markers.pkg/executor/index_merge_reader.go (2)
253-282:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winSecond unresolved git merge conflict must be resolved.
Another merge conflict block exists here. The conflict appears to be between the old table-scan range building logic and the new unified approach with grouped ranges support.
🐛 Proposed resolution
Remove the conflict markers and keep only the new implementation:
-<<<<<<< HEAD - _, ok := plan[0].(*plannercore.PhysicalIndexScan) - if !ok { - firstPartRanges, secondPartRanges := distsql.SplitRangesAcrossInt64Boundary(e.ranges[i], false, e.descs[i], tbl.Meta().IsCommonHandle) - firstKeyRanges, err := distsql.TableHandleRangesToKVRanges(dctx, []int64{getPhysicalTableID(tbl)}, tbl.Meta().IsCommonHandle, firstPartRanges) - if err != nil { - return nil, err - } - secondKeyRanges, err := distsql.TableHandleRangesToKVRanges(dctx, []int64{getPhysicalTableID(tbl)}, tbl.Meta().IsCommonHandle, secondPartRanges) - if err != nil { - return nil, err - } - keyRanges := append(firstKeyRanges.FirstPartitionRange(), secondKeyRanges.FirstPartitionRange()...) - ranges = append(ranges, keyRanges) - continue -======= + // Determine grouped ranges: use GroupedRanges from the physical plan if available, + // otherwise wrap the flat ranges as a single group. + var ( + groupedRanges [][]*ranger.Range + isIdxScan bool + ) + switch x := plan[0].(type) { + case *physicalop.PhysicalIndexScan: + isIdxScan = true + groupedRanges = x.GroupedRanges + case *physicalop.PhysicalTableScan: + groupedRanges = x.GroupedRanges ->>>>>>> 173eaf24799 (planner, executor: support merge sort for IN conditions in IndexMerge partial paths (`#67771`)) + }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/executor/index_merge_reader.go` around lines 253 - 282, Remove the leftover git conflict markers and keep the new unified implementation that declares groupedRanges and isIdxScan and switches on plan[0] to handle *physicalop.PhysicalIndexScan (set isIdxScan=true and groupedRanges = x.GroupedRanges) and *physicalop.PhysicalTableScan (groupedRanges = x.GroupedRanges); delete the old branch that used distsql.SplitRangesAcrossInt64Boundary/TableHandleRangesToKVRanges and ensure subsequent code uses the groupedRanges variable produced by this switch (look for symbols groupedRanges, isIdxScan, and plan[0] in index_merge_reader.go).
246-335:⚠️ Potential issue | 🔴 CriticalFix missing
physicalopimport inpkg/executor/index_merge_reader.go
This file uses*physicalop.PhysicalIndexScan/*physicalop.PhysicalTableScan, but the import block contains nogithub.com/pingcap/tidb/pkg/planner/core/operator/physicaloppackage, so thephysicalopselector won’t compile.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/executor/index_merge_reader.go` around lines 246 - 335, The file missing the physicalop import causes unresolved references to *physicalop.PhysicalIndexScan and *physicalop.PhysicalTableScan in IndexMergeReaderExecutor.buildPartialWorkerKVRanges; add the physicalop package to the import block (import "github.com/pingcap/tidb/pkg/planner/core/operator/physicalop" or the repository's equivalent path) so the symbols PhysicalIndexScan and PhysicalTableScan resolve, then run go build to verify no further missing imports.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/executor/index_merge_reader.go`:
- Around line 211-234: Resolve the unresolved git conflict markers by removing
the lines "<<<<<<< HEAD", "=======", and ">>>>>>> 173eaf24799" and merge the two
alternate case arms into a single switch handling block that covers the intended
scan types; ensure you keep the shared logic that calls ResolveCorrelatedColumns
and sets GroupedRanges (the code that calls e.ranges[i], err =
x.ResolveCorrelatedColumns() and the subsequent GroupRangesByCols call), and
include both case labels for the correct types (e.g., case
*plannercore.PhysicalTableScan: and case *physicalop.PhysicalTableScan: or
whichever single type is correct after refactor) so the switch in the function
containing this block compiles cleanly and preserves error handling and grouping
behavior.
In `@pkg/planner/core/find_best_task.go`:
- Around line 1119-1130: Remove the cherry-pick conflict markers and keep the
intended merged logic: replace the conflict block with the loop-based check that
iterates over oneAlternative and uses matchProperty(...).Matched() (i.e.,
declare match := true; for _, oneAccessPath := range oneAlternative { if
!noSortItem && !matchProperty(ds, oneAccessPath, prop).Matched() { match =
false; break } } if !match { ... } ), ensuring you reference noSortItem,
matchProperty, oneAlternative/oneAccessPath and prop exactly and remove any
remaining <<<<<<< / ======= / >>>>>>> lines so the file compiles.
In `@pkg/planner/core/operator/physicalop/physical_index_scan.go`:
- Around line 697-701: The code references prop.NeedKeepOrder() and
prop.GetSortDescForKeepOrder() on PhysicalProperty but those methods don't
exist; either add those methods to PhysicalProperty in
pkg/planner/property/physical_property.go (implement NeedKeepOrder() to return
the current keep-order predicate and GetSortDescForKeepOrder() to produce the
sort descriptor from the existing SortItems/PartialOrderInfo fields), or update
physical_index_scan.go to use the existing keep-order APIs on PhysicalProperty
(replace NeedKeepOrder()/GetSortDescForKeepOrder() calls with the actual
existing methods/fields that represent "keep order" and sort descriptors),
ensuring is.Desc and is.KeepOrder are set from the correct existing APIs.
---
Outside diff comments:
In `@pkg/executor/index_merge_reader.go`:
- Around line 253-282: Remove the leftover git conflict markers and keep the new
unified implementation that declares groupedRanges and isIdxScan and switches on
plan[0] to handle *physicalop.PhysicalIndexScan (set isIdxScan=true and
groupedRanges = x.GroupedRanges) and *physicalop.PhysicalTableScan
(groupedRanges = x.GroupedRanges); delete the old branch that used
distsql.SplitRangesAcrossInt64Boundary/TableHandleRangesToKVRanges and ensure
subsequent code uses the groupedRanges variable produced by this switch (look
for symbols groupedRanges, isIdxScan, and plan[0] in index_merge_reader.go).
- Around line 246-335: The file missing the physicalop import causes unresolved
references to *physicalop.PhysicalIndexScan and *physicalop.PhysicalTableScan in
IndexMergeReaderExecutor.buildPartialWorkerKVRanges; add the physicalop package
to the import block (import
"github.com/pingcap/tidb/pkg/planner/core/operator/physicalop" or the
repository's equivalent path) so the symbols PhysicalIndexScan and
PhysicalTableScan resolve, then run go build to verify no further missing
imports.
In
`@tests/integrationtest/r/planner/core/casetest/physicalplantest/physical_plan.result`:
- Around line 3566-3627: The file still contains unresolved git conflict markers
(<<<<<<<, =======, >>>>>>>) around two blocks; open the region containing the
alternative "explain format = 'brief' ..." vs "explain format = 'plan_tree' ..."
outputs and remove the conflict markers and the discarded 'brief' branch
content, keeping the new plan_tree expectations (the blocks that show "explain
format = 'plan_tree' ...", "Projection", "IndexMerge", "Limit(Build)",
"IndexRangeScan", "TableRowIDScan") for both conflict regions so the golden
result contains only the plan_tree expectations and no leftover markers.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: dcf3ca02-1e86-4070-963e-c975de870863
📒 Files selected for processing (9)
pkg/executor/index_merge_reader.gopkg/executor/mem_reader.gopkg/planner/core/find_best_task.gopkg/planner/core/operator/physicalop/physical_index_scan.gotests/integrationtest/r/index_merge.resulttests/integrationtest/r/planner/core/casetest/physicalplantest/physical_plan.resulttests/integrationtest/r/planner/core/grouped_ranges_order_by.resulttests/integrationtest/t/index_merge.testtests/integrationtest/t/planner/core/grouped_ranges_order_by.test
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## release-8.5 #68753 +/- ##
================================================
Coverage ? 55.1171%
================================================
Files ? 1826
Lines ? 658024
Branches ? 0
================================================
Hits ? 362684
Misses ? 268258
Partials ? 27082
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
…bot/cherry-pick-67771-to-release-8.5
|
/unhold |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: qw4990, terry1purcell The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
[LGTM Timeline notifier]Timeline:
|
…bot/cherry-pick-67771-to-release-8.5
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/planner/core/find_best_task.go`:
- Line 1377: Restore the Fix52869 gate when deciding preferMerge: the boolean
should require both the presence of index-merge hints and that the Fix52869
planner gate/feature flag is enabled, not just the hint alone; update the
preferMerge assignment (the line using ds.IndexMergeHints and variable
preferMerge in find_best_task.go) to combine the hint check with the existing
Fix52869 gate check (e.g., preferMerge := len(ds.IndexMergeHints) > 0 &&
Fix52869Enabled() or the equivalent project-specific gate function/flag) so the
behavior only changes when the kill‑switch is not set.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 2f21c556-321c-455f-8bba-06f2a22b3aa7
📒 Files selected for processing (3)
pkg/planner/core/find_best_task.gotests/integrationtest/r/index_merge.resulttests/integrationtest/r/planner/core/grouped_ranges_order_by.result
✅ Files skipped from review due to trivial changes (1)
- tests/integrationtest/r/planner/core/grouped_ranges_order_by.result
There was a problem hiding this comment.
Caution
Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/planner/core/find_best_task.go`:
- Line 1377: Restore the Fix52869 gate when deciding preferMerge: the boolean
should require both the presence of index-merge hints and that the Fix52869
planner gate/feature flag is enabled, not just the hint alone; update the
preferMerge assignment (the line using ds.IndexMergeHints and variable
preferMerge in find_best_task.go) to combine the hint check with the existing
Fix52869 gate check (e.g., preferMerge := len(ds.IndexMergeHints) > 0 &&
Fix52869Enabled() or the equivalent project-specific gate function/flag) so the
behavior only changes when the kill‑switch is not set.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 2f21c556-321c-455f-8bba-06f2a22b3aa7
📒 Files selected for processing (3)
pkg/planner/core/find_best_task.gotests/integrationtest/r/index_merge.resulttests/integrationtest/r/planner/core/grouped_ranges_order_by.result
✅ Files skipped from review due to trivial changes (1)
- tests/integrationtest/r/planner/core/grouped_ranges_order_by.result
🛑 Comments failed to post (1)
pkg/planner/core/find_best_task.go (1)
1377-1377:
⚠️ Potential issue | 🟠 Major | ⚡ Quick winRestore the
Fix52869gate on hinted IndexMerge preference.This line changes
preferMergefrom a hint-plus-fix-control decision into a hint-only decision. That widens skyline-pruning behavior for everyUSE_INDEX_MERGEquery, not just the new ordered-INpartial-path cases in this PR, and removes the existing release-branch kill switch for regressions. Please keep the fix-control check here or split this planner-behavior change out.Suggested fix
- preferMerge := len(ds.IndexMergeHints) > 0 + preferMerge := len(ds.IndexMergeHints) > 0 && + fixcontrol.GetBoolWithDefault(ds.SCtx().GetSessionVars().OptimizerFixControl, fixcontrol.Fix52869, true)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/planner/core/find_best_task.go` at line 1377, Restore the Fix52869 gate when deciding preferMerge: the boolean should require both the presence of index-merge hints and that the Fix52869 planner gate/feature flag is enabled, not just the hint alone; update the preferMerge assignment (the line using ds.IndexMergeHints and variable preferMerge in find_best_task.go) to combine the hint check with the existing Fix52869 gate check (e.g., preferMerge := len(ds.IndexMergeHints) > 0 && Fix52869Enabled() or the equivalent project-specific gate function/flag) so the behavior only changes when the kill‑switch is not set.
|
/retest |
1 similar comment
|
/retest |
This is an automated cherry-pick of #67771
What problem does this PR solve?
Issue Number: ref #65712 close #67775
Problem Summary:
For queries like
SELECT * FROM t WHERE (a = 1 OR b IN (1,2)) ORDER BY c LIMIT 5, where indexesidx_a_c(a, c)andidx_b_c(b, c)are available:a = 1partial path onidx_a_ccan directly satisfyORDER BY c(PropMatched).b IN (1,2)partial path onidx_b_ccannot directly satisfyORDER BY c, because the ranges[1,1]and[2,2]are individually ordered bycbut not globally ordered.Previously,
PropMatchedNeedMergeSort(introduced in #62694) was rejected for IndexMerge partial paths, causing the plan to fall back to a globalTopNwithoutLimitpushdown. This could scan all matching rows (e.g., ~750K) before sorting, resulting in much worse performance.This is Solution 3 described in the issue.
What changed and how does it work?
This PR leverages the merge sort infrastructure from #62694 to support
PropMatchedNeedMergeSortfor IndexMerge partial paths. The IN condition ranges are split into groups, each producing a sorted stream, which are then merged.After this fix, the plan becomes:
matchPropForIndexMergeAlternatives()andisMatchPropForIndexMerge(): AcceptPropMatchedNeedMergeSortfrom individual partial paths by using.Matched()instead of== PropMatched. The IndexMerge-level result remainsPropMatched(the IndexMerge itself satisfies the order via its own inter-path merge sort).ConvertToPartialIndexScan()andconvertToPartialTableScan(): Remove the assertion that blockedPropMatchedNeedMergeSort, and copyGroupedRanges/GroupByColIdxsfrom theAccessPathto the physical scan when the path needs merge sort.keyRangesandpartitionKeyRangesfields with a unifiedpartialWorkerKVRanges [][]*kvRangesWithPhysicalTblID. Each entry carries aPhysicalTableIDalongside its kv ranges, unifying partition mode, grouped ranges (from IN conditions), and the combination of both. This follows the same pattern asIndexLookUpExecutor.groupedKVRangesfrom planner, executor: support access pathkeep orderwithINconditions using merge sort #62694.buildPartialWorkerKVRanges()method inOpen()builds kv ranges for each partial plan considering both partitions and grouped ranges.rebuildRangeForCorCol(): After rebuilding ranges for correlated columns, also rebuildGroupedRangesusingGroupRangesByCols().startPartialTableWorker(): PassGroupedRanges/GroupByColIdxsto theTableReaderExecutor, which already has full support for grouped ranges.memIndexMergeReader(UnionScan): Updated to use the unifiedpartialWorkerKVRangesfield, replacing the previouspartitionKVRanges/partitionTablesand directkeyRangesaccess. ThePhysicalTableIDfrom each entry is used for partition handle conversion.Check List
Tests
Side effects
Documentation
Release note
Summary by CodeRabbit
Bug Fixes / Improvements
Tests