planner: backport partial order TopN optimization#68995
Conversation
|
Skipping CI for Draft Pull Request. |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: ⛔ Files ignored due to path filters (1)
📒 Files selected for processing (22)
✅ Files skipped from review due to trivial changes (2)
🚧 Files skipped from review as they are similar to previous changes (19)
📝 WalkthroughWalkthroughAdds partial-ordered-index TopN support: converts the session/sysvar to ChangesPartial Ordered Index TopN Optimization
Dependency Updates and Test Infrastructure
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.12.2)Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Skipping CI for Draft Pull Request. |
813b81e to
c0b9059
Compare
There was a problem hiding this comment.
Actionable comments posted: 7
🧹 Nitpick comments (5)
DEPS.bzl (1)
5974-5980: Update tipb dependency: go.mod matches and Bazel metadata updates are present
DEPS.bzlupdatesgithub.com/pingcap/tipbtov0.0.0-20260605083900-f9f651ef5fbc, andgo.modcontains the same version.- The same change set includes
BUILD.bazelupdates alongsideDEPS.bzl, consistent with generated Bazel metadata (i.e.,make bazel_prepareoutput).- Ensure compilation/tests cover any
tipbAPI changes from commitf9f651ef5fbc(e.g.,ExchangeTypeusage and the referenced “Limit executor field”).🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@DEPS.bzl` around lines 5974 - 5980, The tipb version bump may change generated protobuf types (e.g., enum/field renames like ExchangeType or addition of a Limit executor field); compile and run unit/integration tests (and run the Bazel prep step used by this repo) to surface errors, then search the codebase for references to tipb.ExchangeType and any places constructing/examining executor types or limit fields and update call sites to the new enum names/field accessors or message layout produced by the updated proto; ensure any BUILD/DEPS metadata is consistent with the new generated code and re-run tests to verify all API changes (ExchangeType usages and any Limit executor fields) are handled.pkg/planner/core/exhaust_physical_plans.go (2)
2269-2272: 💤 Low valueConsider validating ByItems earlier to avoid unnecessary pattern checks.
When
getPhysTopNWithPartialOrderPropertyreturnsnil(because ByItems contain non-column expressions at lines 2351-2356), the earlier pattern validation incanUsePartialOrder4TopNbecomes wasted work. Consider adding a ByItems column check tocanUsePartialOrder4TopN:♻️ Suggested refactor
func canUsePartialOrder4TopN(lt *logicalop.LogicalTopN) bool { if !lt.SCtx().GetSessionVars().IsPartialOrderedIndexForTopNEnabled() { return false } if len(lt.ByItems) == 0 { return false } + // Partial order only supports column-based ordering + for _, item := range lt.ByItems { + if _, ok := item.Expr.(*expression.Column); !ok { + return false + } + } return checkPartialOrderPattern(lt.Children()[0]) }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/planner/core/exhaust_physical_plans.go` around lines 2269 - 2272, canUsePartialOrder4TopN currently runs expensive pattern checks even when ByItems contain non-column expressions that will make getPhysTopNWithPartialOrderProperty return nil; update canUsePartialOrder4TopN to first validate lt.ByItems (or the ByItems slice used) contains only column expressions (e.g., ColumnExpr) and return false early if any non-column expression is found, so the later call to getPhysTopNWithPartialOrderProperty(lt, prop) is avoided when it would inevitably return nil.
2335-2346: 💤 Low valueAdd documentation explaining the supported pattern.
The supported pattern (DataSource → Selection → Projection chain with single children) is not obvious from the code. Per coding guidelines, comments should explain non-obvious intent and constraints.
📝 Suggested documentation
+// checkPartialOrderPattern validates that the logical plan tree matches the pattern +// supported by partial-order TopN optimization: a chain of DataSource, Selection, +// and/or Projection operators, where each operator has at most one child. func checkPartialOrderPattern(plan base.LogicalPlan) bool { switch p := plan.(type) {🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/planner/core/exhaust_physical_plans.go` around lines 2335 - 2346, Add a short documentation comment above the checkPartialOrderPattern function that describes the exact supported pattern: a chain starting from logicalop.DataSource optionally wrapped by a single-child logicalop.LogicalSelection and/or a single-child logicalop.LogicalProjection (i.e., DataSource → Selection → Projection where each wrapper must have exactly one child). Mention the constraint that each LogicalSelection/LogicalProjection must have len(Children()) == 1 and that any other node type returns false; reference the function name checkPartialOrderPattern and the node types logicalop.DataSource, logicalop.LogicalSelection, and logicalop.LogicalProjection to make intent and limits clear.pkg/planner/core/operator/logicalop/logical_projection.go (2)
601-613: ⚡ Quick winAdd documentation and defensive bounds checking.
Same issues as
tryTransformSortItems: silently skips non-column/non-scalar-function expressions and lacks bounds checking.🛡️ Suggested improvements
+// tryTransformSortItemPtrs transforms sort item pointers by mapping their columns +// through the projection's expression list. Returns failure if any expression is a +// ScalarFunction (cannot push down). Constants and other non-column expressions +// are silently removed (sorting by constants is meaningless). func (p *LogicalProjection) tryTransformSortItemPtrs(items []*property.SortItem) ([]*property.SortItem, bool) { newItems := make([]*property.SortItem, 0, len(items)) for _, item := range items { idx := p.Schema().ColumnIndex(item.Col) + if idx < 0 { + // Column not found in schema; should not happen with valid properties + return nil, false + } switch expr := p.Exprs[idx].(type) { case *expression.Column: newItems = append(newItems, &property.SortItem{Col: expr, Desc: item.Desc}) case *expression.ScalarFunction: return nil, false + // Other types (Constant, CorrelatedColumn, etc.) are silently skipped } } return newItems, true }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/planner/core/operator/logicalop/logical_projection.go` around lines 601 - 613, In tryTransformSortItemPtrs, add defensive bounds checking on idx from p.Schema().ColumnIndex(item.Col) (ensure 0 <= idx < len(p.Exprs)) and treat any expression types other than *expression.Column and *expression.ScalarFunction as a failure instead of silently skipping them; if the index is out of range or an unexpected expr type is encountered, return nil, false—otherwise, for a Column create the new SortItem as now and for a ScalarFunction return nil, false to match tryTransformSortItems behavior.
587-599: ⚡ Quick winAdd documentation and defensive bounds checking.
The function silently skips non-column, non-scalar-function expressions (e.g., constants), which is non-obvious behavior. Additionally,
ColumnIndexcan return -1 if the column is not found, causing a panic.🛡️ Suggested improvements
+// tryTransformSortItems transforms sort items by mapping their columns through +// the projection's expression list. Returns failure if any expression is a +// ScalarFunction (cannot push down). Constants and other non-column expressions +// are silently removed (sorting by constants is meaningless). func (p *LogicalProjection) tryTransformSortItems(items []property.SortItem) ([]property.SortItem, bool) { newItems := make([]property.SortItem, 0, len(items)) for _, item := range items { idx := p.Schema().ColumnIndex(item.Col) + if idx < 0 { + // Column not found in schema; should not happen with valid properties + return nil, false + } switch expr := p.Exprs[idx].(type) { case *expression.Column: newItems = append(newItems, property.SortItem{Col: expr, Desc: item.Desc}) case *expression.ScalarFunction: return nil, false + // Other types (Constant, CorrelatedColumn, etc.) are silently skipped } } return newItems, true }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/planner/core/operator/logicalop/logical_projection.go` around lines 587 - 599, The tryTransformSortItems function currently can panic when p.Schema().ColumnIndex returns -1 and silently ignores non-column/non-scalar expressions; update tryTransformSortItems to first check idx := p.Schema().ColumnIndex(item.Col) and if idx < 0 return nil, false to avoid panic, and treat any expression that is not *expression.Column as a non-transformable case (i.e., return nil, false) instead of silently skipping (handle the default case explicitly), and add a brief comment above tryTransformSortItems describing that it only succeeds when every sort item maps to a plain Column in p.Exprs.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/planner/core/find_best_task.go`:
- Around line 1401-1415: The review points out that skylinePruning()/the loop is
mutating the shared AccessPath by setting path.ForcePartialOrder = true which
later causes convertToIndexScan() to reject paths when prop.PartialOrderInfo ==
nil; to fix, avoid persisting ForcePartialOrder on the shared
ds.PossibleAccessPaths: either operate on a shallow copy of the AccessPath or
use a candidate-local boolean (e.g., localForcePartialOrder) when
evaluating/constructing currentCandidate in getTableCandidate/skylinePruning
rather than writing to path.ForcePartialOrder, and if you must modify path,
ensure you clear/reset path.ForcePartialOrder before finishing each pruning pass
so convertToIndexScan() sees the original state.
In `@pkg/planner/core/physical_plans.go`:
- Around line 1331-1332: PhysicalTopN.MemoryUsage() and
PhysicalLimit.MemoryUsage() compute scalar sizes for new prefix metadata but
omit the memory cost of the PrefixCol object itself; update both methods to,
when PrefixCol is non-nil, add the result of PrefixCol.MemoryUsage() to the sum
(e.g., after computing sum use "if lt.PrefixCol != nil { sum +=
lt.PrefixCol.MemoryUsage() }" or the equivalent using the local
receiver/variable name) so the PrefixCol object's memory is included in total
accounting.
In `@pkg/planner/core/task_base.go`:
- Around line 264-265: CopTask.MemoryUsage currently omits the memory held by
the new partialOrderMatchResult field and still uses the old pointer count;
update the CopTask.MemoryUsage method to include the size of
partialOrderMatchResult (handle nil check) and account for its nested payload by
adding the estimated memory for property.PartialOrderMatchResult (including any
slices/maps/inner structs it owns) to the total, and increment the
pointer/object count accordingly so the reported memory reflects the nested
match-result payload.
In `@pkg/planner/core/task.go`:
- Around line 1112-1115: estimateMaxXForPartialOrder currently returns 0 causing
COST mode to under-estimate partial-order cop limits; update the block that
computes maxX/estimatedRows (where estimateMaxXForPartialOrder, maxX,
partialOrderedLimit, estimatedRows, copTask and childProfile are used) to use a
conservative fallback instead of zero: if estimateMaxXForPartialOrder yields a
non-positive or placeholder value, set maxX to the child plan's row count from
childProfile (e.g., the stats row count) or another conservative upper bound,
then compute estimatedRows = float64(partialOrderedLimit) + float64(maxX);
ensure the same change is applied at the other occurrence around lines 1130-1132
so DeriveLimitStats gets a conservative estimate.
In `@pkg/planner/property/physical_property.go`:
- Around line 291-295: Update PhysicalProperty.MemoryUsage to account for the
newly retained PartialOrderInfo: inside the MemoryUsage implementation
(function/method PhysicalProperty.MemoryUsage) check if p.PartialOrderInfo is
non-nil and include its memory footprint and the footprints of each entry in
p.PartialOrderInfo.SortItems (iterate SortItems and add per-item memory cost
similar to how other slice/field entries are counted). Ensure you include the
pointer/struct overhead for PartialOrderInfo and any per-sort-item allocations
so the total memory reflects the partial-order property.
In `@pkg/sessionctx/variable/sysvar.go`:
- Around line 2904-2913: Add an upgrade migration that rewrites legacy
boolean-stored values ("ON"/"OFF"/"1"/"0") for the
TiDBOptPartialOrderedIndexForTopN sysvar into the new enum values
("COST"/"DISABLE") before global variables are loaded; implement it alongside
other migrations (e.g., in the upgradeToVer228-style bootstrap step) so
mysql.global_variables rows are materialized/updated prior to sysvar validation,
update the migration to target the TiDBOptPartialOrderedIndexForTopN name, and
add an upgrade test that simulates an upgraded cluster with legacy persisted
values to assert the value is rewritten and accepted by the
Validation/SetSession logic.
In `@tests/integrationtest/t/planner/core/partial_order_topn.test`:
- Line 406: The test uses FORCE_INDEX(t_prefix_len, long_prefix) but
"long_prefix" is a column, not an index; update the hint to reference the real
index name (e.g., FORCE_INDEX(t_prefix_len, <actual_index_name>)) or
create/rename the index on t_prefix_len to match the hint before the query, and
then re-run and record the regenerated deterministic results; locate the
FORCE_INDEX usage in the test and replace the second argument with the actual
index identifier used by the table (or add an index creation step for that
identifier) so the planner path is reliably forced.
---
Nitpick comments:
In `@DEPS.bzl`:
- Around line 5974-5980: The tipb version bump may change generated protobuf
types (e.g., enum/field renames like ExchangeType or addition of a Limit
executor field); compile and run unit/integration tests (and run the Bazel prep
step used by this repo) to surface errors, then search the codebase for
references to tipb.ExchangeType and any places constructing/examining executor
types or limit fields and update call sites to the new enum names/field
accessors or message layout produced by the updated proto; ensure any BUILD/DEPS
metadata is consistent with the new generated code and re-run tests to verify
all API changes (ExchangeType usages and any Limit executor fields) are handled.
In `@pkg/planner/core/exhaust_physical_plans.go`:
- Around line 2269-2272: canUsePartialOrder4TopN currently runs expensive
pattern checks even when ByItems contain non-column expressions that will make
getPhysTopNWithPartialOrderProperty return nil; update canUsePartialOrder4TopN
to first validate lt.ByItems (or the ByItems slice used) contains only column
expressions (e.g., ColumnExpr) and return false early if any non-column
expression is found, so the later call to
getPhysTopNWithPartialOrderProperty(lt, prop) is avoided when it would
inevitably return nil.
- Around line 2335-2346: Add a short documentation comment above the
checkPartialOrderPattern function that describes the exact supported pattern: a
chain starting from logicalop.DataSource optionally wrapped by a single-child
logicalop.LogicalSelection and/or a single-child logicalop.LogicalProjection
(i.e., DataSource → Selection → Projection where each wrapper must have exactly
one child). Mention the constraint that each LogicalSelection/LogicalProjection
must have len(Children()) == 1 and that any other node type returns false;
reference the function name checkPartialOrderPattern and the node types
logicalop.DataSource, logicalop.LogicalSelection, and
logicalop.LogicalProjection to make intent and limits clear.
In `@pkg/planner/core/operator/logicalop/logical_projection.go`:
- Around line 601-613: In tryTransformSortItemPtrs, add defensive bounds
checking on idx from p.Schema().ColumnIndex(item.Col) (ensure 0 <= idx <
len(p.Exprs)) and treat any expression types other than *expression.Column and
*expression.ScalarFunction as a failure instead of silently skipping them; if
the index is out of range or an unexpected expr type is encountered, return nil,
false—otherwise, for a Column create the new SortItem as now and for a
ScalarFunction return nil, false to match tryTransformSortItems behavior.
- Around line 587-599: The tryTransformSortItems function currently can panic
when p.Schema().ColumnIndex returns -1 and silently ignores
non-column/non-scalar expressions; update tryTransformSortItems to first check
idx := p.Schema().ColumnIndex(item.Col) and if idx < 0 return nil, false to
avoid panic, and treat any expression that is not *expression.Column as a
non-transformable case (i.e., return nil, false) instead of silently skipping
(handle the default case explicitly), and add a brief comment above
tryTransformSortItems describing that it only succeeds when every sort item maps
to a plain Column in p.Exprs.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 2989e3e2-3f0d-416f-88f2-9ce3b26f05ac
⛔ Files ignored due to path filters (1)
go.sumis excluded by!**/*.sum
📒 Files selected for processing (22)
DEPS.bzlgo.modpkg/planner/core/casetest/rule/BUILD.bazelpkg/planner/core/exhaust_physical_plans.gopkg/planner/core/explain.gopkg/planner/core/find_best_task.gopkg/planner/core/hint_test.gopkg/planner/core/operator/logicalop/logical_projection.gopkg/planner/core/physical_plans.gopkg/planner/core/plan_clone_generated.gopkg/planner/core/plan_to_pb.gopkg/planner/core/task.gopkg/planner/core/task_base.gopkg/planner/property/physical_property.gopkg/planner/util/path.gopkg/sessionctx/variable/session.gopkg/sessionctx/variable/session_test.gopkg/sessionctx/variable/sysvar.gopkg/sessionctx/variable/sysvar_test.gopkg/sessionctx/variable/tidb_vars.gotests/integrationtest/r/planner/core/partial_order_topn.resulttests/integrationtest/t/planner/core/partial_order_topn.test
| if prop.PartialOrderInfo != nil { | ||
| continue | ||
| } | ||
| currentCandidate = getTableCandidate(ds, path, prop) | ||
| } else { | ||
| if !(len(path.AccessConds) > 0 || !prop.IsSortItemEmpty() || path.Forced || path.IsSingleScan) { | ||
| var matchPartialOrderIndex bool | ||
| if ds.SCtx().GetSessionVars().IsPartialOrderedIndexForTopNEnabled() && | ||
| prop.PartialOrderInfo != nil { | ||
| if !matchPartialOrderProperty(path, prop.PartialOrderInfo).Matched { | ||
| continue | ||
| } | ||
| matchPartialOrderIndex = true | ||
| if path.Forced && !path.ForceNoKeepOrder { | ||
| path.ForcePartialOrder = true | ||
| } |
There was a problem hiding this comment.
Don't persist ForcePartialOrder on the shared AccessPath.
skylinePruning() sets path.ForcePartialOrder = true in-place, but ds.PossibleAccessPaths are reused across later property explorations. After one partial-order pass marks a forced path here, a later normal pass can hit the prop.PartialOrderInfo == nil rejection in convertToIndexScan() and incorrectly discard that hinted path. Keep this flag candidate-local, or at least clear it before each pruning pass.
Minimal safe reset
for _, path := range ds.PossibleAccessPaths {
+ path.ForcePartialOrder = false
// We should check whether the possible access path is valid first.
if path.StoreType != kv.TiFlash && prop.IsFlashProp() {
continue
}Also applies to: 2401-2404
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/planner/core/find_best_task.go` around lines 1401 - 1415, The review
points out that skylinePruning()/the loop is mutating the shared AccessPath by
setting path.ForcePartialOrder = true which later causes convertToIndexScan() to
reject paths when prop.PartialOrderInfo == nil; to fix, avoid persisting
ForcePartialOrder on the shared ds.PossibleAccessPaths: either operate on a
shallow copy of the AccessPath or use a candidate-local boolean (e.g.,
localForcePartialOrder) when evaluating/constructing currentCandidate in
getTableCandidate/skylinePruning rather than writing to path.ForcePartialOrder,
and if you must modify path, ensure you clear/reset path.ForcePartialOrder
before finishing each pruning pass so convertToIndexScan() sees the original
state.
| sum = lt.BasePhysicalPlan.MemoryUsage() + size.SizeOfSlice + int64(cap(lt.ByItems))*size.SizeOfPointer + | ||
| size.SizeOfUint64*2 + size.SizeOfInt64 + size.SizeOfInt |
There was a problem hiding this comment.
Include PrefixCol object cost in memory accounting.
Both PhysicalTopN.MemoryUsage() and PhysicalLimit.MemoryUsage() include scalar field sizes for the new prefix metadata, but they do not include PrefixCol.MemoryUsage() when set.
Suggested patch
func (lt *PhysicalTopN) MemoryUsage() (sum int64) {
@@
- sum = lt.BasePhysicalPlan.MemoryUsage() + size.SizeOfSlice + int64(cap(lt.ByItems))*size.SizeOfPointer +
- size.SizeOfUint64*2 + size.SizeOfInt64 + size.SizeOfInt
+ sum = lt.BasePhysicalPlan.MemoryUsage() + size.SizeOfSlice + int64(cap(lt.ByItems))*size.SizeOfPointer +
+ size.SizeOfUint64*2 + size.SizeOfPointer + size.SizeOfInt
@@
for _, item := range lt.PartitionBy {
sum += item.MemoryUsage()
}
+ if lt.PrefixCol != nil {
+ sum += lt.PrefixCol.MemoryUsage()
+ }
return
}
@@
func (p *PhysicalLimit) MemoryUsage() (sum int64) {
@@
- sum = p.physicalSchemaProducer.MemoryUsage() + size.SizeOfUint64*2 + size.SizeOfInt64 + size.SizeOfInt
+ sum = p.physicalSchemaProducer.MemoryUsage() + size.SizeOfUint64*2 + size.SizeOfPointer + size.SizeOfInt
+ if p.PrefixCol != nil {
+ sum += p.PrefixCol.MemoryUsage()
+ }
return
}Also applies to: 2118-2118
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/planner/core/physical_plans.go` around lines 1331 - 1332,
PhysicalTopN.MemoryUsage() and PhysicalLimit.MemoryUsage() compute scalar sizes
for new prefix metadata but omit the memory cost of the PrefixCol object itself;
update both methods to, when PrefixCol is non-nil, add the result of
PrefixCol.MemoryUsage() to the sum (e.g., after computing sum use "if
lt.PrefixCol != nil { sum += lt.PrefixCol.MemoryUsage() }" or the equivalent
using the local receiver/variable name) so the PrefixCol object's memory is
included in total accounting.
| // partialOrderMatchResult stores the match result for partial order optimization. | ||
| partialOrderMatchResult *property.PartialOrderMatchResult |
There was a problem hiding this comment.
Update CopTask.MemoryUsage for partialOrderMatchResult.
CopTask now retains partialOrderMatchResult, but MemoryUsage() still uses the old pointer count and does not include the nested match-result payload.
Suggested patch
func (t *CopTask) MemoryUsage() (sum int64) {
@@
- sum = size.SizeOfInterface*(2+int64(cap(t.idxMergePartPlans)+cap(t.rootTaskConds))) + size.SizeOfBool*3 + size.SizeOfUint64 +
- size.SizeOfPointer*(3+int64(cap(t.commonHandleCols)+cap(t.tblCols))) + size.SizeOfSlice*4 + t.physPlanPartInfo.MemoryUsage()
+ sum = size.SizeOfInterface*(2+int64(cap(t.idxMergePartPlans)+cap(t.rootTaskConds))) + size.SizeOfBool*3 + size.SizeOfUint64 +
+ size.SizeOfPointer*(4+int64(cap(t.commonHandleCols)+cap(t.tblCols))) + size.SizeOfSlice*4 + t.physPlanPartInfo.MemoryUsage()
@@
for _, expr := range t.rootTaskConds {
sum += expr.MemoryUsage()
}
+ if t.partialOrderMatchResult != nil {
+ sum += size.SizeOfBool + size.SizeOfPointer + size.SizeOfInt
+ if t.partialOrderMatchResult.PrefixCol != nil {
+ sum += t.partialOrderMatchResult.PrefixCol.MemoryUsage()
+ }
+ }
return
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // partialOrderMatchResult stores the match result for partial order optimization. | |
| partialOrderMatchResult *property.PartialOrderMatchResult | |
| func (t *CopTask) MemoryUsage() (sum int64) { | |
| sum = size.SizeOfInterface*(2+int64(cap(t.idxMergePartPlans)+cap(t.rootTaskConds))) + size.SizeOfBool*3 + size.SizeOfUint64 + | |
| size.SizeOfPointer*(4+int64(cap(t.commonHandleCols)+cap(t.tblCols))) + size.SizeOfSlice*4 + t.physPlanPartInfo.MemoryUsage() | |
| for _, expr := range t.rootTaskConds { | |
| sum += expr.MemoryUsage() | |
| } | |
| if t.partialOrderMatchResult != nil { | |
| sum += size.SizeOfBool + size.SizeOfPointer + size.SizeOfInt | |
| if t.partialOrderMatchResult.PrefixCol != nil { | |
| sum += t.partialOrderMatchResult.PrefixCol.MemoryUsage() | |
| } | |
| } | |
| return | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/planner/core/task_base.go` around lines 264 - 265, CopTask.MemoryUsage
currently omits the memory held by the new partialOrderMatchResult field and
still uses the old pointer count; update the CopTask.MemoryUsage method to
include the size of partialOrderMatchResult (handle nil check) and account for
its nested payload by adding the estimated memory for
property.PartialOrderMatchResult (including any slices/maps/inner structs it
owns) to the total, and increment the pointer/object count accordingly so the
reported memory reflects the nested match-result payload.
| maxX := estimateMaxXForPartialOrder(p.SCtx(), copTask) | ||
| estimatedRows := float64(partialOrderedLimit) + float64(maxX) | ||
| childProfile := copTask.indexPlan.StatsInfo() | ||
| limitStats := util.DeriveLimitStats(childProfile, estimatedRows) |
There was a problem hiding this comment.
estimateMaxXForPartialOrder() is still a stub, so COST mode underprices this plan.
The partial-order cop limit can read more than Count+Offset rows from the last prefix bucket, but estimateMaxXForPartialOrder() always returns 0. That makes limitStats and the downstream lookup/network work look cheaper than they are, so tidb_opt_partial_ordered_index_for_topn=COST can pick this path too aggressively. Please compute a conservative extra-row bound here, or fall back to the child row count until the estimator exists.
Temporary conservative fallback
func estimateMaxXForPartialOrder(_ base.PlanContext, _ *CopTask) uint64 {
- return 0
+ if copTask == nil || copTask.indexPlan == nil || copTask.indexPlan.StatsInfo() == nil {
+ return 0
+ }
+ return uint64(math.Ceil(copTask.indexPlan.StatsInfo().RowCount))
}Also applies to: 1130-1132
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/planner/core/task.go` around lines 1112 - 1115,
estimateMaxXForPartialOrder currently returns 0 causing COST mode to
under-estimate partial-order cop limits; update the block that computes
maxX/estimatedRows (where estimateMaxXForPartialOrder, maxX,
partialOrderedLimit, estimatedRows, copTask and childProfile are used) to use a
conservative fallback instead of zero: if estimateMaxXForPartialOrder yields a
non-positive or placeholder value, set maxX to the child plan's row count from
childProfile (e.g., the stats row count) or another conservative upper bound,
then compute estimatedRows = float64(partialOrderedLimit) + float64(maxX);
ensure the same change is applied at the other occurrence around lines 1130-1132
so DeriveLimitStats gets a conservative estimate.
| // PartialOrderInfo is used for TopN's partial order optimization. | ||
| // When this field is not nil, it indicates that prefix index can be used | ||
| // to provide partial order for TopN. | ||
| PartialOrderInfo *PartialOrderInfo | ||
| } |
There was a problem hiding this comment.
Account for PartialOrderInfo in PhysicalProperty.MemoryUsage.
PartialOrderInfo is newly retained on PhysicalProperty, but MemoryUsage() does not include it (or its SortItems), so memory tracking undercounts partial-order properties.
Suggested patch
func (p *PhysicalProperty) MemoryUsage() (sum int64) {
@@
for _, mppCol := range p.MPPPartitionCols {
sum += mppCol.MemoryUsage()
}
+ if p.PartialOrderInfo != nil {
+ sum += size.SizeOfPointer + size.SizeOfSlice + int64(cap(p.PartialOrderInfo.SortItems))*size.SizeOfPointer
+ for _, item := range p.PartialOrderInfo.SortItems {
+ if item != nil {
+ sum += item.MemoryUsage()
+ }
+ }
+ }
return
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/planner/property/physical_property.go` around lines 291 - 295, Update
PhysicalProperty.MemoryUsage to account for the newly retained PartialOrderInfo:
inside the MemoryUsage implementation (function/method
PhysicalProperty.MemoryUsage) check if p.PartialOrderInfo is non-nil and include
its memory footprint and the footprints of each entry in
p.PartialOrderInfo.SortItems (iterate SortItems and add per-item memory cost
similar to how other slice/field entries are counted). Ensure you include the
pointer/struct overhead for PartialOrderInfo and any per-sort-item allocations
so the total memory reflects the partial-order property.
| explain format='brief' select /*+ force_index(t_prefix_len, idx_short) */ * from t_prefix_len order by short_prefix limit 3; | ||
|
|
||
| # Long prefix (50 bytes) | ||
| explain format='brief' select /*+ force_index(t_prefix_len, long_prefix) */ * from t_prefix_len order by long_prefix limit 3; |
There was a problem hiding this comment.
Fix the FORCE_INDEX target to an actual index.
Line 406 uses force_index(t_prefix_len, long_prefix), but long_prefix is a column name, not an index name. This means the test is not actually forcing the intended index path and can become non-deterministic.
🔧 Proposed fix
--- a/tests/integrationtest/t/planner/core/partial_order_topn.test
+++ b/tests/integrationtest/t/planner/core/partial_order_topn.test
@@
-explain format='brief' select /*+ force_index(t_prefix_len, long_prefix) */ * from t_prefix_len order by long_prefix limit 3;
+explain format='brief' select /*+ force_index(t_prefix_len, idx_long) */ * from t_prefix_len order by long_prefix limit 3;--- a/tests/integrationtest/r/planner/core/partial_order_topn.result
+++ b/tests/integrationtest/r/planner/core/partial_order_topn.result
@@
-explain format='brief' select /*+ force_index(t_prefix_len, long_prefix) */ * from t_prefix_len order by long_prefix limit 3;
+explain format='brief' select /*+ force_index(t_prefix_len, idx_long) */ * from t_prefix_len order by long_prefix limit 3;As per coding guidelines, test files should keep changes deterministic, and integration tests should record verified regenerated results.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/integrationtest/t/planner/core/partial_order_topn.test` at line 406,
The test uses FORCE_INDEX(t_prefix_len, long_prefix) but "long_prefix" is a
column, not an index; update the hint to reference the real index name (e.g.,
FORCE_INDEX(t_prefix_len, <actual_index_name>)) or create/rename the index on
t_prefix_len to match the hint before the query, and then re-run and record the
regenerated deterministic results; locate the FORCE_INDEX usage in the test and
replace the second argument with the actual index identifier used by the table
(or add an index creation step for that identifier) so the planner path is
reliably forced.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## release-8.5 #68995 +/- ##
================================================
Coverage ? 55.1381%
================================================
Files ? 1826
Lines ? 659095
Branches ? 0
================================================
Hits ? 363413
Misses ? 268611
Partials ? 27071
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
|
/retest |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: qw4990 The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
[LGTM Timeline notifier]Timeline:
|
What problem does this PR solve?
Issue Number: ref #63280, ref #65813
Problem Summary:
Backport the partial ordered index / TopN optimizer changes to
release-8.5.Source PRs:
This also updates
github.com/pingcap/tipbtof9f651ef5fbc4dcf31c452ab701ca3f67c6c99a8, which contains theLimitexecutor field needed by the optimizer backport.What changed and how does it work?
This is a manual reconstruction from the source PRs instead of a direct cherry-pick, because
release-8.5still uses the older planner physical operator layout and the merge commits conflict heavily with the current master layout.CopMultiReadTaskTypeTopN alternative carrying partial-order requirements.CopTask.Limitto the cop index plan when possible, and keep the rootTopN.ORDER_INDEX/NO_ORDER_INDEXbehavior with partial order.tidb_opt_partial_ordered_index_for_topnto enum valuesDISABLE/COST.Check List
Tests
Commands:
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.
Summary by CodeRabbit
New Features
Configuration Changes
Tests