planner, statistics: use selected partition stats for dynamic pruning#66920
planner, statistics: use selected partition stats for dynamic pruning#66920hawkingrei wants to merge 13 commits into
Conversation
|
Review failed due to infrastructure/execution failure after retries. Please re-trigger review. Note: 1 P1 issue was found and posted by the review_code agent before the failure (see inline comment on ℹ️ Learn more details on Pantheon AI. |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughDetects when planning can determine selected partitions, records static-pruned partition IDs, skips cached plans that depend on partition-specific pruning, and merges selected-partition statistics at runtime to improve cardinality estimates while keeping dynamic partition pruning. Changes
Sequence Diagram(s)sequenceDiagram
participant Client as rgba(66,133,244,0.5) Client
participant Planner as rgba(52,168,83,0.5) Planner
participant PlanCache as rgba(251,188,5,0.5) PlanCache
participant Stats as rgba(234,67,53,0.5) Stats
participant Executor as rgba(153,102,255,0.5) Executor
Client->>Planner: Submit query
Planner->>PlanCache: Check for cached plan
alt Cached plan would rely on static-prune
PlanCache-->>Planner: Indicate skip
Planner->>Planner: Run partition processor → set StaticPrunedPartitionIDs
Planner->>Stats: Request merged stats for selected partitions
Stats-->>Planner: Return merged runtime stats
Planner->>Executor: Build/emit plan using merged stats (dynamic prune)
else Cached plan usable
PlanCache-->>Planner: Return cached plan
Planner->>Executor: Execute cached plan
end
Executor->>Client: Return results
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 6
🧹 Nitpick comments (4)
pkg/planner/core/planbuilder.go (1)
1493-1496: Update the pruning comment to describe the new dynamic-path exception.The code now keeps global index paths when dynamic partition pruning is active, but the comment still says they must be ignored whenever the partition processor is enabled. That makes this branch look accidental and is easy to regress later. As per coding guidelines, "Comments SHOULD explain non-obvious intent, constraints, invariants, concurrency guarantees, SQL/compatibility contracts, or important performance trade-offs, and SHOULD NOT restate what the code already makes clear".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/planner/core/planbuilder.go` around lines 1493 - 1496, The existing comment by the branch that uses hasFlagPartitionProcessor and ctx.GetSessionVars().StmtCtx.UseDynamicPartitionPrune() is stale—update it to state that global index paths are normally excluded because LogicalPartitionAll is not suitable for global indexes, but there is an explicit exception: when dynamic partition pruning (StmtCtx.UseDynamicPartitionPrune()) is enabled we retain dynamic (global index) paths instead of removing them; keep the reference to removeGlobalIndexPaths(available) and explain this dynamic-path exception and the rationale so future readers understand why the branch condition keeps global index paths when dynamic pruning is active.pkg/planner/core/casetest/planstats/plan_stats_test.go (1)
639-690: Assert a histogram-driven estimate, not onlyRealtimeCount.
reader.StatsInfo().HistColl.RealtimeCount == 200proves the selected partitions were counted, but it doesn't prove the column/index stats were actually merged. A regression that still sums partition counts while reusing the global histogram would pass here. Please add one assertion on an estimate/NDV driven by a skewed predicate so this test guards the cardinality-estimation path the PR is fixing. As per coding guidelines, "MUST add a regression test and verify it fails before fix and passes after fix for bug fixes".🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/planner/core/casetest/planstats/plan_stats_test.go` around lines 639 - 690, Test currently only checks RealtimeCount; additionally parse/optimize a query with a skewed predicate (e.g., "select * from pt where a = 0" or "a < 1") inside TestDynamicPartitionPruneUsesMergedPartitionStats and assert a histogram-driven estimate/NDV from the produced plan (use reader.StatsInfo().HistColl estimate/NDV APIs on the PhysicalTableReader returned by planner.Optimize) that would differ from a naive summed-partition/global-only estimate; this verifies the merged histogram is actually used rather than just RealtimeCount.pkg/planner/core/stats/stats.go (1)
54-60: Consider updating the function documentation.The function signature now includes
partitionIDs []int64, but the documentation comment doesn't explain its purpose. Adding a brief description would help future maintainers understand when and why to pass partition IDs versus nil.Suggested documentation addition
// GetStatsTable gets statistics information for a table specified by "tableID". // A pseudo statistics table is returned in any of the following scenario: // 1. tidb-server started and statistics handle has not been initialized. // 2. table row count from statistics is zero. // 3. statistics is outdated. +// When partitionIDs is non-empty, statistics from those partitions are merged +// for cardinality estimation instead of using global partition table stats. // Note: please also update getLatestVersionFromStatsTable() when logic in this function changes. func GetStatsTable(ctx base.PlanContext, tblInfo *model.TableInfo, pid int64, partitionIDs []int64) *statistics.Table {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/planner/core/stats/stats.go` around lines 54 - 60, The GetStatsTable comment is out of sync with the new partitionIDs parameter; update the function docstring for GetStatsTable to briefly explain the partitionIDs []int64 argument (what it represents, when to pass non-nil vs nil, and how it affects the returned pseudo statistics), and also add a short note reminding maintainers to update getLatestVersionFromStatsTable() if this logic changes, keeping the style/placement consistent with the existing comment block.pkg/planner/core/tests/prepare/prepare_test.go (1)
1597-1597: Loosen theEXPLAINassertions to operator kinds only.These checks still hardcode planner-generated node IDs and tree glyphs, so benign plan renumbering will break the regression without changing behavior.
♻️ Suggested assertion shape
- require.Equal(t, "TableDual_8", explain.Rows()[0][0]) + flatExplain := fmt.Sprint(explain.Rows()) + require.Contains(t, flatExplain, "TableDual") tk.MustExec(`set `@a`=-5, `@b`=112`) tk.MustQuery(`execute stmt using `@a`,`@b``).Check(testkit.Rows("-5 7")) explain = tkExplain.MustQuery(fmt.Sprintf("explain for connection %d", tkProcess.ID)) if pruneMode == string(variable.Dynamic) { - explain.CheckAt([]int{0}, - [][]any{ - {"Selection_9"}, - {"└─Point_Get_8"}, - }) + flatExplain = fmt.Sprint(explain.Rows()) + require.Contains(t, flatExplain, "Selection") + require.Contains(t, flatExplain, "Point_Get")As per coding guidelines,
**/{*_test.go,testdata/**}: Keep test changes minimal and deterministic; avoid broad golden/testdata churn unless required.Also applies to: 1603-1607
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/planner/core/tests/prepare/prepare_test.go` at line 1597, The test is asserting planner node IDs/glyphs verbatim (e.g. require.Equal(t, "TableDual_8", explain.Rows()[0][0])) which breaks on benign renumbering; change these assertions to check only the operator kind instead of the full planner-generated identifier: use the explain.Rows() result for the relevant rows (the same locations around the current assertions and the other cases at lines 1603-1607), parse or match the cell string to extract the operator kind (e.g., via strings.HasPrefix, strings.Contains, or a small regex) and assert that it equals or contains the expected kind (like "TableDual") using require.Contains/require.Equal on that extracted kind so the test becomes deterministic and tolerant of ID/glyph changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/planner/core/casetest/plancache/plan_cache_partition_table_test.go`:
- Around line 333-345: The test only accepts the static partition-prune warning
when a prepared PointGet skips plan cache; update the assertion so it also
accepts the prepared PointGet over-optimized warning ("Batch/PointGet plans may
be over-optimized"). Specifically, in the branch that checks
!tk.Session().GetSessionVars().FoundInPlanCache (where
requireStaticPartitionPruneWarning is called), change the check to allow either
requireStaticPartitionPruneWarning OR an over-optimized warning match (e.g.,
test the string returned by tk.MustQuery(`show warnings ` +
comment).Rows()[0][2].(string) for either message). Keep the existing else
branch logic (tkProcess, sessmgr.ProcessInfo, res.MultiCheckContain /
res.CheckNotContain) unchanged.
In `@pkg/planner/core/plan_cache_utils.go`:
- Around line 192-196: The warning message is misleading because the branch
handles non-prepared statements (checks !isPrepStmt); update the warning text in
the block that uses cacheable, isPrepStmt, staticPartitionPrune and reason so it
accurately states it's skipping the non-prepared-plan cache (e.g., "skip
non-prepared plan-cache: "+reason or "skip plan-cache for non-prepared
statement: "+reason) when calling sctx.GetSessionVars().StmtCtx.AppendWarning;
keep the rest of the logic unchanged.
In `@pkg/planner/core/plan_cache.go`:
- Around line 333-341: The current shouldSkipCachedPlanForStaticPartitionPruning
only checks the top-level plan type and misses DML roots (UPDATE/DELETE) that
embed the partitioned scan in their SelectPlan; update
shouldSkipCachedPlanForStaticPartitionPruning to unwrap DML roots (e.g., detect
and extract the inner SelectPlan from UPDATE/DELETE plan nodes) before asserting
base.PhysicalPlan and calling cachedPlanUsesStaticPartitionPruning so prepared
DML cache entries are checked the same way as plain SELECT plans.
- Around line 310-315: The miss-path must mirror the hit-path by marking the
statement to skip plan cache when static partition pruning is detected: before
writing a plan into the cache (the code path where PartitionProcessor finishes
optimization and the cache entry is created), run the same detection used on
hits (or call shouldSkipCachedPlanForStaticPartitionPruning with the session
context and the plan) and if it returns skip==true call
stmtCtx.SetSkipPlanCache("Static partition pruning mode") and avoid writing the
cache entry; ensure you reference PartitionProcessor’s miss-path cache-write
location and the same symbols shouldSkipCachedPlanForStaticPartitionPruning and
stmtCtx.SetSkipPlanCache("Static partition pruning mode") so the check is
symmetric with the hit-path.
In `@pkg/statistics/table.go`:
- Around line 745-757: The merge loop over partitionStats must reject pseudo
partitions so they don't contaminate merged runtime stats: inside the for _,
partStats := range partitionStats loop (in pkg/statistics/table.go) check the
partition's pseudo marker (e.g., a field on partStats such as IsPseudo,
IsPseudoTable, or similar) and if any partition is pseudo, abort by returning
nil, false before accumulating counts/versions; do not fold pseudo placeholder
counts or LastAnalyzeVersion into the merged result. Ensure the early-return
uses the same return shape as the current function so callers treat the result
as absent.
- Around line 771-791: The ColAndIdxExistenceMap updates are incorrectly gated
on successful in-memory merge causing Has=false for stats that exist on disk;
always mark existence regardless of the merge helper's boolean result. In the
loop over tblInfo.Columns and tblInfo.Indices, call
merged.ColAndIdxExistenceMap.InsertCol(colInfo.ID, true) and
merged.ColAndIdxExistenceMap.InsertIndex(idxInfo.ID, true) unconditionally
(i.e., before or independent of the boolean check returned by
mergePartitionColumnStatsForRuntime and mergePartitionIndexStatsForRuntime), but
only call merged.SetCol(colInfo.ID, colStats) / merged.SetIdx(idxInfo.ID,
idxStats) when ok is true so in-memory merge behavior remains the same.
---
Nitpick comments:
In `@pkg/planner/core/casetest/planstats/plan_stats_test.go`:
- Around line 639-690: Test currently only checks RealtimeCount; additionally
parse/optimize a query with a skewed predicate (e.g., "select * from pt where a
= 0" or "a < 1") inside TestDynamicPartitionPruneUsesMergedPartitionStats and
assert a histogram-driven estimate/NDV from the produced plan (use
reader.StatsInfo().HistColl estimate/NDV APIs on the PhysicalTableReader
returned by planner.Optimize) that would differ from a naive
summed-partition/global-only estimate; this verifies the merged histogram is
actually used rather than just RealtimeCount.
In `@pkg/planner/core/planbuilder.go`:
- Around line 1493-1496: The existing comment by the branch that uses
hasFlagPartitionProcessor and
ctx.GetSessionVars().StmtCtx.UseDynamicPartitionPrune() is stale—update it to
state that global index paths are normally excluded because LogicalPartitionAll
is not suitable for global indexes, but there is an explicit exception: when
dynamic partition pruning (StmtCtx.UseDynamicPartitionPrune()) is enabled we
retain dynamic (global index) paths instead of removing them; keep the reference
to removeGlobalIndexPaths(available) and explain this dynamic-path exception and
the rationale so future readers understand why the branch condition keeps global
index paths when dynamic pruning is active.
In `@pkg/planner/core/stats/stats.go`:
- Around line 54-60: The GetStatsTable comment is out of sync with the new
partitionIDs parameter; update the function docstring for GetStatsTable to
briefly explain the partitionIDs []int64 argument (what it represents, when to
pass non-nil vs nil, and how it affects the returned pseudo statistics), and
also add a short note reminding maintainers to update
getLatestVersionFromStatsTable() if this logic changes, keeping the
style/placement consistent with the existing comment block.
In `@pkg/planner/core/tests/prepare/prepare_test.go`:
- Line 1597: The test is asserting planner node IDs/glyphs verbatim (e.g.
require.Equal(t, "TableDual_8", explain.Rows()[0][0])) which breaks on benign
renumbering; change these assertions to check only the operator kind instead of
the full planner-generated identifier: use the explain.Rows() result for the
relevant rows (the same locations around the current assertions and the other
cases at lines 1603-1607), parse or match the cell string to extract the
operator kind (e.g., via strings.HasPrefix, strings.Contains, or a small regex)
and assert that it equals or contains the expected kind (like "TableDual") using
require.Contains/require.Equal on that extracted kind so the test becomes
deterministic and tolerant of ID/glyph changes.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: b0d1d88b-6c1d-45b5-96f9-5deef36a3132
📒 Files selected for processing (15)
pkg/planner/core/casetest/plancache/plan_cache_partition_table_test.gopkg/planner/core/casetest/plancache/plan_cache_partition_test.gopkg/planner/core/casetest/planstats/plan_stats_test.gopkg/planner/core/logical_plan_builder.gopkg/planner/core/operator/logicalop/logical_datasource.gopkg/planner/core/plan_cache.gopkg/planner/core/plan_cache_utils.gopkg/planner/core/planbuilder.gopkg/planner/core/rule/collect_column_stats_usage.gopkg/planner/core/rule/rule_partition_processor.gopkg/planner/core/stats.gopkg/planner/core/stats/stats.gopkg/planner/core/tests/prepare/prepare_test.gopkg/sessionctx/stmtctx/stmtctx.gopkg/statistics/table.go
| if cacheable && !isPrepStmt && staticPartitionPrune { | ||
| cacheable = false | ||
| reason = "static partition prune mode used" | ||
| reason = "Static partition pruning mode" | ||
| sctx.GetSessionVars().StmtCtx.AppendWarning(errors.NewNoStackError("skip prepared plan-cache: " + reason)) | ||
| } |
There was a problem hiding this comment.
Minor: Misleading warning message for non-prepared statements.
The condition on line 192 explicitly checks !isPrepStmt, meaning this branch handles non-prepared statements. However, the warning message on line 195 says "skip prepared plan-cache", which is misleading since it's actually skipping the non-prepared plan cache.
Proposed fix
if cacheable && !isPrepStmt && staticPartitionPrune {
cacheable = false
reason = "Static partition pruning mode"
- sctx.GetSessionVars().StmtCtx.AppendWarning(errors.NewNoStackError("skip prepared plan-cache: " + reason))
+ sctx.GetSessionVars().StmtCtx.AppendWarning(errors.NewNoStackError("skip plan-cache: " + reason))
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if cacheable && !isPrepStmt && staticPartitionPrune { | |
| cacheable = false | |
| reason = "static partition prune mode used" | |
| reason = "Static partition pruning mode" | |
| sctx.GetSessionVars().StmtCtx.AppendWarning(errors.NewNoStackError("skip prepared plan-cache: " + reason)) | |
| } | |
| if cacheable && !isPrepStmt && staticPartitionPrune { | |
| cacheable = false | |
| reason = "Static partition pruning mode" | |
| sctx.GetSessionVars().StmtCtx.AppendWarning(errors.NewNoStackError("skip plan-cache: " + reason)) | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/planner/core/plan_cache_utils.go` around lines 192 - 196, The warning
message is misleading because the branch handles non-prepared statements (checks
!isPrepStmt); update the warning text in the block that uses cacheable,
isPrepStmt, staticPartitionPrune and reason so it accurately states it's
skipping the non-prepared-plan cache (e.g., "skip non-prepared plan-cache:
"+reason or "skip plan-cache for non-prepared statement: "+reason) when calling
sctx.GetSessionVars().StmtCtx.AppendWarning; keep the rest of the logic
unchanged.
| for _, colInfo := range tblInfo.Columns { | ||
| if colInfo.State != model.StatePublic { | ||
| continue | ||
| } | ||
| colStats, ok := mergePartitionColumnStatsForRuntime(sc, tblInfo.ID, colInfo, partitionStats, realtimeCount, analyzeVersion) | ||
| if !ok { | ||
| continue | ||
| } | ||
| merged.SetCol(colInfo.ID, colStats) | ||
| merged.ColAndIdxExistenceMap.InsertCol(colInfo.ID, true) | ||
| } | ||
| for _, idxInfo := range tblInfo.Indices { | ||
| if idxInfo.State != model.StatePublic { | ||
| continue | ||
| } | ||
| idxStats, ok := mergePartitionIndexStatsForRuntime(sc, tblInfo.ID, idxInfo, partitionStats, realtimeCount, analyzeVersion) | ||
| if !ok { | ||
| continue | ||
| } | ||
| merged.SetIdx(idxInfo.ID, idxStats) | ||
| merged.ColAndIdxExistenceMap.InsertIndex(idxInfo.ID, true) |
There was a problem hiding this comment.
Keep ColAndIdxExistenceMap independent from in-memory merge success.
The merge helpers return false when one selected partition is not full-loaded in memory. With InsertCol / InsertIndex gated on successful merge, the runtime table reports Has=false for stats that do exist in storage, so later ColumnIsLoadNeeded / IndexIsLoadNeeded cannot trigger the sync load needed to finish the merge.
🧩 Suggested direction
for _, colInfo := range tblInfo.Columns {
if colInfo.State != model.StatePublic {
continue
}
+ hasStatsInAllParts := true
+ analyzedInAllParts := true
+ for _, partStats := range partitionStats {
+ hasStatsInAllParts = hasStatsInAllParts && partStats.ColAndIdxExistenceMap.Has(colInfo.ID, false)
+ analyzedInAllParts = analyzedInAllParts && partStats.ColAndIdxExistenceMap.HasAnalyzed(colInfo.ID, false)
+ }
+ if hasStatsInAllParts {
+ merged.ColAndIdxExistenceMap.InsertCol(colInfo.ID, analyzedInAllParts)
+ }
colStats, ok := mergePartitionColumnStatsForRuntime(sc, tblInfo.ID, colInfo, partitionStats, realtimeCount, analyzeVersion)
if !ok {
continue
}
merged.SetCol(colInfo.ID, colStats)
- merged.ColAndIdxExistenceMap.InsertCol(colInfo.ID, true)
}Apply the same pattern for indices.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/statistics/table.go` around lines 771 - 791, The ColAndIdxExistenceMap
updates are incorrectly gated on successful in-memory merge causing Has=false
for stats that exist on disk; always mark existence regardless of the merge
helper's boolean result. In the loop over tblInfo.Columns and tblInfo.Indices,
call merged.ColAndIdxExistenceMap.InsertCol(colInfo.ID, true) and
merged.ColAndIdxExistenceMap.InsertIndex(idxInfo.ID, true) unconditionally
(i.e., before or independent of the boolean check returned by
mergePartitionColumnStatsForRuntime and mergePartitionIndexStatsForRuntime), but
only call merged.SetCol(colInfo.ID, colStats) / merged.SetIdx(idxInfo.ID,
idxStats) when ok is true so in-memory merge behavior remains the same.
| tableDual.SetSchema(ds.Schema()) | ||
| return tableDual, nil | ||
| } | ||
| if ds.SCtx().GetSessionVars().StmtCtx.UseDynamicPartitionPrune() { |
There was a problem hiding this comment.
[P1] Wasted per-partition DataSource cloning and access-path resolution in dynamic pruning mode
Why: The PR enables FlagPartitionProcessor even when dynamic partition pruning remains active (no global index). Inside makeUnionAllChildren, the rule allocates a per-partition DataSource clone and runs resolveOptimizeHint() (recomputing access paths) for every selected/all partition, but then discards all that work by returning the original ds at line 1955 when UseDynamicPartitionPrune() is true. This makes optimizer CPU, allocations, and GC pressure scale with partition count under default settings—without changing the output plan shape—creating a latency/memory regression risk on highly-partitioned tables.
This behavior is first-exposed by this PR: on master, PartitionProcessor was not run in dynamic mode and makeUnionAllChildren had no dynamic early-return, so the wasted loop was unreachable.
Evidence:
pkg/planner/core/logical_plan_builder.go:4475-4476: setsFlagPartitionProcessoreven when dynamic pruning stays onpkg/planner/core/rule/rule_partition_processor.go:1917:newDataSource := *ds(clone per partition)pkg/planner/core/rule/rule_partition_processor.go:1929:s.resolveOptimizeHint(&newDataSource, ...)(access-path resolution per partition)pkg/planner/core/rule/rule_partition_processor.go:1955: returns originalds, discarding all per-partition work
Suggested fix direction: move the UseDynamicPartitionPrune() early-return check before the per-partition materialization loop, so resolveOptimizeHint() and clone allocation are skipped when the plan stays dynamic, while still computing StaticPrunedPartitionIDs and warnings as needed.
|
/retest |
1 similar comment
|
/retest |
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (1)
pkg/planner/core/tests/prepare/prepare_test.go (1)
1601-1609: Search the fullEXPLAINoutput instead of pinning row indexes.These assertions still assume
Selectionis row 0 andPoint_Getis row 1. A harmless extra root likeProjectionwill make this test flap again even though the intended shape is unchanged.
As per coding guidelines, "Keep test changes minimal and deterministic; avoid broad golden/testdata churn unless required."♻️ Suggested tweak
explain = tkExplain.MustQuery(fmt.Sprintf("explain for connection %d", tkProcess.ID)) + ops := make([]string, 0, len(explain.Rows())) + for _, row := range explain.Rows() { + ops = append(ops, row[0].(string)) + } + explainShape := strings.Join(ops, "\n") if pruneMode == string(variable.Dynamic) { - require.Contains(t, explain.Rows()[0][0].(string), "Selection") - require.Contains(t, explain.Rows()[1][0].(string), "Point_Get") + require.Contains(t, explainShape, "Selection") + require.Contains(t, explainShape, "Point_Get") require.False(t, tk.Session().GetSessionVars().FoundInPlanCache) tk.MustQuery(`show warnings`).Check(testkit.Rows()) } else { - require.Contains(t, explain.Rows()[0][0].(string), "Selection") - require.Contains(t, explain.Rows()[1][0].(string), "Point_Get") + require.Contains(t, explainShape, "Selection") + require.Contains(t, explainShape, "Point_Get") require.False(t, tk.Session().GetSessionVars().FoundInPlanCache) tk.MustQuery(`show warnings`).Check(testkit.Rows("Warning 1105 skip prepared plan-cache: query accesses partitioned tables is un-cacheable if tidb_partition_pruning_mode = 'static'")) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/planner/core/tests/prepare/prepare_test.go` around lines 1601 - 1609, The test currently asserts `Selection` and `Point_Get` by fixed row indexes from `explain.Rows()` (via `tkExplain.MustQuery(fmt.Sprintf("explain for connection %d", tkProcess.ID))`), which is fragile; update the assertions to scan all rows of `explain.Rows()` and assert that at least one row contains "Selection" and at least one contains "Point_Get" (instead of indexing [0] and [1]), keeping the existing checks of `tk.Session().GetSessionVars().FoundInPlanCache` and `tk.MustQuery("show warnings")` behavior intact; make sure to apply this change in both branches that check `pruneMode == string(variable.Dynamic)` and the else branch.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/planner/core/casetest/plancache/plan_cache_partition_table_test.go`:
- Around line 473-479: The test currently treats an empty SHOW WARNINGS result
as acceptable on cache-miss paths; update the assertions around the
tk.MustQuery("show warnings "+comment).Rows() call so that an empty warnings
slice fails the test explicitly (e.g. require.Greater(t, len(warnings), 0,
"expected a warning on cache miss") or require.False(t, len(warnings) == 0,
...)) before checking warnings[0][0], warnings[0][1], and calling
requireStaticPartitionPruneOrUnsafeRangeWarning; apply the same explicit
non-empty assertion to the second occurrence at the block that covers lines
560-563 so silent warning-less misses no longer pass.
In `@pkg/planner/core/casetest/plancache/plan_cache_partition_test.go`:
- Around line 47-55: The helper
requireStaticPartitionPruneOrOverOptimizedWarning in function
requireStaticPartitionPruneOrOverOptimizedWarning currently only accepts the
"skip prepared plan-cache: Batch/PointGet plans may be over-optimized" message
for over-optimized warnings; update the conditional to also accept the "skip
non-prepared plan-cache: Batch/PointGet plans may be over-optimized" and the
generic "skip plan-cache: Batch/PointGet plans may be over-optimized" variants
so it mirrors the three accepted static partition pruning messages (the checks
using strings.Contains in this function).
In `@pkg/planner/core/plan_cache.go`:
- Around line 365-396: The code in cachedPlanUsesStaticPartitionPruning
currently inspects TablePlans[0]/IndexPlans[0] for a
PhysicalTableScan/PhysicalIndexScan inside PhysicalTableReader,
PhysicalIndexReader, PhysicalIndexLookUpReader and PhysicalIndexMergeReader, but
pushed-down plans may have Projection/Selection at index 0 with the actual scan
at the end; change each branch to walk the corresponding p.TablePlans or
p.IndexPlans slice (preferably from the end toward the front) to find the first
element that type-asserts to *physicalop.PhysicalTableScan or
*physicalop.PhysicalIndexScan and then call
dynamicPartitionAccessUsesSubset(sctx.GetPlanCtx(), <foundScan>.Table,
p.PlanPartInfo); keep the same error/skip handling semantics once the real scan
node is located.
In `@pkg/statistics/table.go`:
- Around line 984-985: The code passes the slice topNs (which may contain nil
entries) directly to MergeTopN causing crashes when some partitions lack TopN;
before calling MergeTopN, filter out nil entries from topNs (e.g., build a new
slice containing only non-nil *statistics.TopN objects) and pass that filtered
slice into MergeTopN, then use the resulting poppedTopN when calling
MergePartitionHist2GlobalHist (ensure any downstream logic expecting the
original length is updated to use the filtered/popped results).
---
Nitpick comments:
In `@pkg/planner/core/tests/prepare/prepare_test.go`:
- Around line 1601-1609: The test currently asserts `Selection` and `Point_Get`
by fixed row indexes from `explain.Rows()` (via
`tkExplain.MustQuery(fmt.Sprintf("explain for connection %d", tkProcess.ID))`),
which is fragile; update the assertions to scan all rows of `explain.Rows()` and
assert that at least one row contains "Selection" and at least one contains
"Point_Get" (instead of indexing [0] and [1]), keeping the existing checks of
`tk.Session().GetSessionVars().FoundInPlanCache` and `tk.MustQuery("show
warnings")` behavior intact; make sure to apply this change in both branches
that check `pruneMode == string(variable.Dynamic)` and the else branch.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 690d3058-71c3-4104-ab03-8f1f7a6c19df
📒 Files selected for processing (8)
pkg/planner/core/casetest/plancache/plan_cache_partition_table_test.gopkg/planner/core/casetest/plancache/plan_cache_partition_test.gopkg/planner/core/plan_cache.gopkg/planner/core/planbuilder.gopkg/planner/core/rule/rule_partition_processor.gopkg/planner/core/stats/stats.gopkg/planner/core/tests/prepare/prepare_test.gopkg/statistics/table.go
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
pkg/planner/core/rule/rule_partition_processor.go (2)
1921-1940: Redundant length check and early return structure looks good.The restructuring correctly places the dynamic-mode early return (lines 1936-1941) before the per-partition child construction loop, which addresses the concern from the previous review about wasted cloning work in dynamic mode.
One minor observation: the condition
len(prunedPartitionIDs) > 0on line 1937 is redundant since thelen(prunedPartitionIDs) == 0case already returns a TableDual at line 1930. Consider simplifying:if ds.SCtx().GetSessionVars().StmtCtx.UseDynamicPartitionPrune() { - if ds.SCtx().GetSessionVars().EnableSelectedPartitionStats && staticPruned && len(prunedPartitionIDs) > 0 { + if ds.SCtx().GetSessionVars().EnableSelectedPartitionStats && staticPruned { ds.StaticPrunedPartitionIDs = prunedPartitionIDs } return ds, nil }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/planner/core/rule/rule_partition_processor.go` around lines 1921 - 1940, The conditional inside the dynamic-prune branch redundantly checks len(prunedPartitionIDs) > 0 even though the zero-length case already returns earlier; update the condition in the block that sets ds.StaticPrunedPartitionIDs (inside the if ds.SCtx().GetSessionVars().StmtCtx.UseDynamicPartitionPrune() branch) to only test ds.SCtx().GetSessionVars().EnableSelectedPartitionStats && staticPruned so the redundant length check is removed while preserving the existing assignment to ds.StaticPrunedPartitionIDs and the subsequent early return of ds.
1943-1950: O(n²) complexity from repeated linear search in static pruning path.The
slices.IndexFunccall inside the loop performs a linear search throughpi.Definitionsfor each pruned partition, resulting in O(n²) complexity when many partitions are pruned. While dynamic pruning (the common case) returns early before this loop, the static pruning path could be slow for tables with many partitions.Consider building an index map once before the loop:
♻️ Suggested optimization
children := make([]base.LogicalPlan, 0, len(prunedPartitionIDs)) +partIdxByID := make(map[int64]int, len(pi.Definitions)) +for i, def := range pi.Definitions { + partIdxByID[def.ID] = i +} for _, partitionID := range prunedPartitionIDs { definition := usedDefinition[partitionID] - partIdx := slices.IndexFunc(pi.Definitions, func(def model.PartitionDefinition) bool { - return def.ID == partitionID - }) - if partIdx < 0 { + partIdx, found := partIdxByID[partitionID] + if !found { continue }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/planner/core/rule/rule_partition_processor.go` around lines 1943 - 1950, The loop over prunedPartitionIDs currently calls slices.IndexFunc on pi.Definitions for each partition (via partIdx := slices.IndexFunc(pi.Definitions, ...)), producing O(n²) behavior; instead, before the loop build a map from partition ID to its index (e.g., iterate pi.Definitions once to populate map[string]int or map[model.PartitionID]int), then inside the loop look up partIdx = indexMap[partitionID] and continue if missing; update uses of usedDefinition and any subsequent logic to use the precomputed partIdx to avoid repeated linear searches.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/sessionctx/variable/session.go`:
- Line 2350: The mapping includes an undefined vardef constant
vardef.DefTiDBOptEnableAdvancedJoinReorder causing typecheck to fail; either (A)
correct the key to the existing vardef constant (e.g., replace
vardef.DefTiDBOptEnableAdvancedJoinReorder with the actual defined constant such
as vardef.DefTiDBOptEnableJoinReorder or the project's canonical name) or (B)
add the missing constant definition to the vardef package with the appropriate
type/value; update the mapping in session.go where
TiDBOptEnableAdvancedJoinReorder is referenced to use the corrected vardef
symbol so the build passes.
- Around line 1227-1229: The codebase was updated to rename/remove
EnhanceIndexJoinBuildV2 from SessionVars but sysvar.go still references
s.EnhanceIndexJoinBuildV2; update the setter/getter in
pkg/sessionctx/variable/sysvar.go to read/write the new SessionVars field
TiDBOptEnableAdvancedJoinReorder (or add a temporary compatibility field
EnhanceIndexJoinBuildV2 on the SessionVars struct that maps to
TiDBOptEnableAdvancedJoinReorder) so the typecheck error is resolved; ensure you
update both the setter function and any reads/assignments that use
EnhanceIndexJoinBuildV2 to the new symbol TiDBOptEnableAdvancedJoinReorder (or
the compatibility field) and run a build to verify.
---
Nitpick comments:
In `@pkg/planner/core/rule/rule_partition_processor.go`:
- Around line 1921-1940: The conditional inside the dynamic-prune branch
redundantly checks len(prunedPartitionIDs) > 0 even though the zero-length case
already returns earlier; update the condition in the block that sets
ds.StaticPrunedPartitionIDs (inside the if
ds.SCtx().GetSessionVars().StmtCtx.UseDynamicPartitionPrune() branch) to only
test ds.SCtx().GetSessionVars().EnableSelectedPartitionStats && staticPruned so
the redundant length check is removed while preserving the existing assignment
to ds.StaticPrunedPartitionIDs and the subsequent early return of ds.
- Around line 1943-1950: The loop over prunedPartitionIDs currently calls
slices.IndexFunc on pi.Definitions for each partition (via partIdx :=
slices.IndexFunc(pi.Definitions, ...)), producing O(n²) behavior; instead,
before the loop build a map from partition ID to its index (e.g., iterate
pi.Definitions once to populate map[string]int or map[model.PartitionID]int),
then inside the loop look up partIdx = indexMap[partitionID] and continue if
missing; update uses of usedDefinition and any subsequent logic to use the
precomputed partIdx to avoid repeated linear searches.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 6c55db53-3e47-49a5-87b9-9938a48f3df1
📒 Files selected for processing (7)
pkg/planner/core/casetest/planstats/plan_stats_test.gopkg/planner/core/rule/rule_partition_processor.gopkg/sessionctx/vardef/tidb_vars.gopkg/sessionctx/variable/session.gopkg/sessionctx/variable/setvar_affect.gopkg/sessionctx/variable/sysvar.gopkg/statistics/handle/globalstats/global_stats_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- pkg/statistics/handle/globalstats/global_stats_test.go
There was a problem hiding this comment.
🧹 Nitpick comments (1)
pkg/planner/cardinality/ndv_test.go (1)
32-46: Test covers core paths but misses edge cases.The test validates the happy path (aggregating two valid partition stats) and the pseudo-histogram failure path, which aligns with the implementation. Consider adding coverage for:
- Nil partition entry: Verify
AggregateSelectedPartitionCounts([]*statistics.Table{validStats, nil})returnsok=false.- Empty slice: Verify behavior when
partitionStatsis empty (currently returns0, 0, trueper the implementation—confirm this is intended).🧪 Suggested additional test cases
+ // Test nil partition entry + _, _, ok = cardinality.AggregateSelectedPartitionCounts([]*statistics.Table{ + {HistColl: *statistics.NewHistColl(1, 10, 2, 0, 0)}, + nil, + }) + require.False(t, ok) + + // Test empty slice - returns ok=true with zero counts + realtimeCount, modifyCount, ok = cardinality.AggregateSelectedPartitionCounts([]*statistics.Table{}) + require.True(t, ok) + require.Equal(t, int64(0), realtimeCount) + require.Equal(t, int64(0), modifyCount)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/planner/cardinality/ndv_test.go` around lines 32 - 46, Add two unit tests for cardinality.AggregateSelectedPartitionCounts: one that passes a slice containing a valid *statistics.Table and a nil entry (e.g., []*statistics.Table{{HistColl: *statistics.NewHistColl(...)} , nil}) and asserts the function returns ok == false, and another that passes an empty slice and asserts the returned realtimeCount and modifyCount are 0 and ok is true (or adjust the expectation if implementation should change). Reference the AggregateSelectedPartitionCounts function and statistics.Table / statistics.NewHistColl / statistics.PseudoHistColl to locate where to add these cases.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@pkg/planner/cardinality/ndv_test.go`:
- Around line 32-46: Add two unit tests for
cardinality.AggregateSelectedPartitionCounts: one that passes a slice containing
a valid *statistics.Table and a nil entry (e.g., []*statistics.Table{{HistColl:
*statistics.NewHistColl(...)} , nil}) and asserts the function returns ok ==
false, and another that passes an empty slice and asserts the returned
realtimeCount and modifyCount are 0 and ok is true (or adjust the expectation if
implementation should change). Reference the AggregateSelectedPartitionCounts
function and statistics.Table / statistics.NewHistColl /
statistics.PseudoHistColl to locate where to add these cases.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: b862f215-27a2-467c-bedc-a5d91f662f92
📒 Files selected for processing (3)
pkg/planner/cardinality/ndv_test.gopkg/planner/cardinality/pseudo.gopkg/planner/core/stats/stats.go
Codecov Report✅ All modified and coverable lines are covered by tests. Please upload reports for the commit 2bc059a to get more accurate results. Additional details and impacted files@@ Coverage Diff @@
## master #66920 +/- ##
================================================
- Coverage 77.6943% 77.4526% -0.2417%
================================================
Files 2016 1944 -72
Lines 552129 538622 -13507
================================================
- Hits 428973 417177 -11796
+ Misses 121414 121016 -398
+ Partials 1742 429 -1313
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
/retest |
1 similar comment
|
/retest |
7d4be78 to
2bc059a
Compare
|
@hawkingrei: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
@hawkingrei: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
What problem does this PR solve?
Issue Number: close #66919
Problem Summary:
TiDB uses dynamic partition pruning by default. In some queries that only touch a subset of partitions, cardinality estimation still falls back to the partition table stats path, which is less accurate than using stats merged from the selected partitions. Reusing such partition-specific pruning information through plan cache is also unsafe.
What changed and how does it work?
Check List
Tests
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.
Summary by CodeRabbit
New Features
Bug Fixes
Tests