planner: correlate subquery rule (#66206) by ti-chi-bot · Pull Request #68752 · pingcap/tidb

ti-chi-bot · 2026-05-29T07:26:19Z

This is an automated cherry-pick of #66206

What problem does this PR solve?

Issue Number: close #66320

Problem Summary:

What changed and how does it work?

Check List

Tests

Unit test
Integration test
Manual test (add detailed scripts or steps below)
No need to test
- I checked and no code files have been changed.

Side effects

Performance regression: Consumes more CPU
Performance regression: Consumes more Memory
Breaking backward compatibility

Documentation

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

Summary by CodeRabbit

New Features
- Alternative logical-planning rounds to explore correlated Apply transformations, improving plans for correlated subqueries and IN/EXISTS patterns.
- Automated generation of candidate plan variants and binding-aware plan suggestions for more robust query plans.
Tests
- Expanded test coverage: plan generation, binding verification, correlate-rule behavior (including parallel apply), and EXPLAIN/EXPLAIN ANALYZE/EXPLORE output validations.

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>

ti-chi-bot · 2026-05-29T07:26:24Z

@terry1purcell This PR has conflicts, I have hold it.
Please resolve them or ask others to resolve them, then comment /unhold to remove the hold label.

ti-chi-bot · 2026-05-29T07:26:26Z

@ti-chi-bot: ## If you want to know how to resolve it, please read the guide in TiDB Dev Guide.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

coderabbitai · 2026-05-29T07:26:40Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 1e7c8fc7-98d5-45a0-b928-942e0fc6f5f7

📥 Commits

Reviewing files that changed from the base of the PR and between 3f03dc3 and 7c206d7.

📒 Files selected for processing (3)

pkg/planner/optimize.go
pkg/sessionctx/stmtctx/stmtctx.go
pkg/util/context/plancache.go

🚧 Files skipped from review as they are similar to previous changes (2)

pkg/planner/optimize.go
pkg/sessionctx/stmtctx/stmtctx.go

📝 Walkthrough

Walkthrough

This PR implements alternative logical-plan rounds (including a correlate round), a CorrelateSolver rule to convert eligible semi-joins into correlated Apply plans, plan-cloning utilities, a BFS binding-plan generator, and comprehensive tests and testdata covering correlate and binding behaviors.

Changes

Correlate alternatives: non-correlated subquery optimization

Layer / File(s)	Summary
Alternative logical plan framework `pkg/planner/optimize.go`, `pkg/sessionctx/stmtctx/stmtctx.go`, `pkg/sessionctx/variable/session.go`	Post-logical-build alternative-round orchestration, session toggle for correlate round, statement-context save/restore, and per-round alternative-plan signals.
Expression rewriting and CorrelateSolver rule `pkg/planner/core/expression_rewriter.go`, `pkg/planner/core/rule_correlate.go`, `pkg/planner/core/operator/logicalop/logical_join.go`, `pkg/planner/core/rule/logical_rules.go`, `pkg/planner/core/optimizer.go`	Honor correlate-round signals in subquery rewriting (prevent decorrelation or mark PreferCorrelate), add CorrelateSolver to transform eligible semi-joins into LogicalApply by cloning/lifting inner subtrees, resetting stats, and enforcing inner Limit 1; add PreferCorrelate field and adjust rule ordering/flags.
Plan cloning and optimizer support `pkg/planner/core/plan_clone_utils.go`, `pkg/planner/core/optimizer.go`	Fast clone for PointGet and shallow cloning utilities for logical plans (DataSource/Join/Selection/Projection/Agg/Limit/Sort/TopN) with fresh identities and isolated access paths; update pruning/flag logic.
Binding plan generation `pkg/bindinfo/binding_plan_generation.go`	BFS-based exploration of optimizer variable/fix combinations, leading-table/index-hint/NO_DECORRELATE enumeration, hint injection, brief-plan generation, and BindingPlanInfo/BindSQL synthesis.
Test infrastructure and correlate tests `pkg/planner/core/BUILD.bazel`, `pkg/planner/core/casetest/rule/BUILD.bazel`, `pkg/planner/core/casetest/rule/main_test.go`, `pkg/planner/core/casetest/rule/rule_correlate_test.go`, `pkg/planner/core/casetest/rule/testdata/correlate_suite_*.json`	Register new test suites and files, increase test shard count, add correlate-rule tests (NULL semantics, alternative Apply selection, parallel Apply checks, cost-factor tuning) and supporting JSON fixtures.
Binding auto tests `pkg/bindinfo/binding_auto_test.go`	New/expanded tests for plan generation, EXPLAIN EXPLORE outputs (digests, index hints, no_decorrelate markers), relevant-opt-vars/fixes recording, ANALYZE metrics, and verify-and-bind end-to-end flows.
Optimizer test updates `pkg/planner/core/optimizer_test.go` (⚠️ merge conflicts)	Adds TiFlash hash-join validation and opt-rule-list flag-count alignment tests. This block contains unresolved merge-conflict markers and must be resolved before merge.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

pingcap/tidb#66206: Extends binding auto tests to cover correlate-subquery scenarios and alternative-logical-plan signals (TiDBOptEnableAlternativeLogicalPlans).
pingcap/tidb#66677: Adds plan-build state isolation used by multi-round alternative planning (save/restore PlanCacheTracker and StatementContext).
pingcap/tidb#68796: Overlapping correlate-round and CorrelateSolver work (similar session-flag, PreferCorrelate, and rule additions).

Suggested labels

approved, lgtm

Suggested reviewers

terry1purcell
guo-shaoge
qw4990

"🐰 I hopped through plans with curious eyes,
Tied semi-joins into Apply surprise.
I cloned each branch, then nudged hints right—
Now correlated lookups bloom in light. ✨"

🚥 Pre-merge checks | ✅ 2 | ❌ 3

❌ Failed checks (1 warning, 2 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 25.76% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Description check	❓ Inconclusive	The PR description is largely empty, only containing template boilerplate and a minimal issue reference (close `#66320`) without explaining what problem is solved, how the solution works, or rationale.	Fill in the 'Problem Summary' and 'What changed and how does it work?' sections with details explaining the correlate subquery rule implementation and its purpose.
Out of Scope Changes check	❓ Inconclusive	While most changes (CorrelateSolver, expression_rewriter updates, logical_join PreferCorrelate field) align with the stated goal of correlating subqueries, additional broader infrastructure changes (binding auto-plan generation, plan cloning, optimizer alternative-round logic) lack clear justification in the minimal description.	Clarify in the PR description why binding auto-plan generation, plan_clone_utils, and alternative-round infrastructure are necessary prerequisites for the correlate subquery feature.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'planner: correlate subquery rule (`#66206`)' clearly identifies the main change: adding a correlate subquery rule to the planner component, directly addressing the linked issue `#66320`.
Linked Issues check	✅ Passed	The PR addresses the core requirements of issue `#66320`: implementing a CorrelateSolver rule that converts non-correlated semi-joins into correlated Apply plans with index lookups, transforming the subquery from an uncorrelated execution plan to a correlated one as specified.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.12.2)

Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions
The command is terminated due to an error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 20

Note

Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.

🟡 Minor comments (1)

pkg/bindinfo/binding_auto_test.go-341-345 (1)

341-345: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix unsafe output[i].Plan[rowID] write in record mode.

This branch can index out of range if it is taken (len < rowID cannot satisfy direct indexing at rowID). Use overwrite-else-append semantics instead.

Proposed fix

 			testdata.OnRecord(func() {
 				output[i].SQL = sql
-				if len(output[i].Plan) < rowID {
-					output[i].Plan[rowID] = plan
-				} else {
+				if rowID < len(output[i].Plan) {
+					output[i].Plan[rowID] = plan
+				} else {
 					output[i].Plan = append(output[i].Plan, plan)
 				}
 			})

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/bindinfo/binding_auto_test.go` around lines 341 - 345, The current branch
in pkg/bindinfo/binding_auto_test.go writes to output[i].Plan[rowID] when
len(output[i].Plan) < rowID which can index out of range; change the logic in
that block (the code that assigns to output[i].Plan based on rowID and plan) to
use overwrite-else-append semantics: if rowID is less than len(output[i].Plan)
then overwrite output[i].Plan[rowID] with plan, otherwise append plan to
output[i].Plan. Ensure you update the branch that currently checks the length
and performs the unsafe write to use this corrected condition around
output[i].Plan and rowID.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/bindinfo/binding_plan_generation.go`:
- Around line 755-775: extractSelectTableNames currently walks the entire SELECT
AST (via selStmt.Accept(extractor)) and thus collects table names from nested
subqueries; change the tableNameExtractor so it only collects tables from the
current/top-level query block by not descending into nested SelectStmt nodes
(e.g. add a depth/flag on tableNameExtractor or make its Visit/Enter methods
return Skip for child traversal when encountering a *ast.SelectStmt that is not
the root), so extractSelectTableNames (and its call selStmt.Accept(extractor))
will only produce table names in scope for the outer SELECT and exclude subquery
tables from possibleLeading2.
- Around line 462-514: The sel.TableHints slice is not restored after every
explored state because the noDecorrelate hint (state.noDecorrelateQB) is
appended outside the rollback/defer that restores sel.TableHints, causing hint
leakage across BFS states; fix it in binding_plan_generation.go by moving the
originalHintsLen capture and the defer restoration (that slices sel.TableHints
back to originalHintsLen) to cover both the leading/index-hint block and the
noDecorrelate append, or alternatively move the noDecorrelate hint append inside
the existing defer scope so that any addition to sel.TableHints (including the
TableOptimizerHint created for state.noDecorrelateQB) is rolled back after
exploring the state; reference sel.TableHints, state.noDecorrelateQB,
state.leading2 and state.indexHints to locate the changes.

In `@pkg/planner/core/BUILD.bazel`:
- Around line 56-61: Remove the leftover merge conflict markers (<<<<<<<,
=======, >>>>>>>) in the BUILD.bazel srcs list and reconcile the entries so the
list contains the intended source files (include "rule_collect_plan_stats.go",
"rule_column_pruning.go" and "rule_correlate.go" as appropriate), ensuring the
srcs array is a valid, comma-separated string list with no conflict markers
remaining.

In `@pkg/planner/core/casetest/rule/BUILD.bazel`:
- Around line 8-13: The BUILD.bazel target contains unresolved Git merge
conflict markers (<<<<<<<, =======, >>>>>>>) around the test file list; remove
these markers and merge the two variants into a single, valid list (e.g.,
include "rule_cdc_join_reorder_test.go", "rule_common_handle_ordering_test.go",
"rule_correlate_test.go" along with the other files) so the target is a proper
Bazel srcs array and the rule tests (the files named in the conflict) can build.

In `@pkg/planner/core/casetest/rule/main_test.go`:
- Around line 32-49: Remove the leftover merge conflict markers and fix TestMain
to a single coherent block: delete the <<<<<<<, =======, and >>>>>>> markers and
keep the intended LoadTestSuiteData calls (use the variant with the boolean
flag, e.g. LoadTestSuiteData("testdata", "outer2inner", true) and the additional
suites like "outer_to_semi_join_suite", "correlate_suite",
"cdc_join_reorder_suite", "order_aware_join_reorder_suite"); ensure the TestMain
function and any suite helper loops use the correct LoadTestSuiteData signature
(the version accepting the bool) and remove duplicated or conflicting lines so
the file compiles.

In `@pkg/planner/core/casetest/rule/rule_correlate_test.go`:
- Around line 167-197: The test currently only yields one qualifying outer row,
making the Concurrency>1 check flaky; before running the explains, ensure
multiple outer rows qualify by inserting additional rows into t1 (e.g. via
tk.MustExec("insert into t1 ...") or similar) or by relaxing the query predicate
(modify sql to use a less selective condition such as b IN (...)); update the
setup immediately before the explain/analyze block (affecting the sql variable
and the analyzeRows/foundConcurrency check) so the Apply operator has multiple
outer rows and the EXPLAIN ANALYZE concurrency assertion becomes deterministic.

In `@pkg/planner/core/expression_rewriter.go`:
- Around line 1058-1063: The code uses
vardef.TiDBOptEnableAlternativeLogicalPlans but the package isn’t imported; add
the import "github.com/pingcap/tidb/pkg/sessionctx/vardef" to the import block
in pkg/planner/core/expression_rewriter.go so references like
vardef.TiDBOptEnableAlternativeLogicalPlans (used alongside
b.ctx.GetSessionVars().RecordRelevantOptVar and EnableCorrelateSubquery) compile
successfully.

In `@pkg/planner/core/optimizer_test.go`:
- Around line 566-614: The test references wrong packages/types: replace
coretestsdk.MockContext() with the actual MockContext from the core package (use
core.MockContext()), import the joinversion package and use
joinversion.HashJoinVersionLegacy/HashJoinVersionOptimized where used, and use
the actual PhysicalHashJoin type as defined in
pkg/planner/core/physical_plans.go (replace physicalop.PhysicalHashJoin with the
PhysicalHashJoin type from the core/planner package or adjust the import alias
to point to that file) so imports and type references match the real symbols.
- Around line 563-630: Remove the leftover merge-conflict markers (<<<<<<< HEAD,
=======, >>>>>>>) in pkg/planner/core/optimizer_test.go and keep the incoming
test code block (the tests TestCanTiFlashUseHashJoinV2 and
TestOptRuleListFlagAlignment) as a single, contiguous section; ensure no
conflict markers remain so the file compiles and the functions
TestCanTiFlashUseHashJoinV2 and TestOptRuleListFlagAlignment are declared
normally.

In `@pkg/planner/core/optimizer.go`:
- Around line 101-107: Resolve the merge conflict in the optRuleList by removing
the conflict markers (<<<<<<<, =======, >>>>>>>) and ensure the final rule
entries include the intended rules in order: &rule.OuterJoinToSemiJoin{},
&CorrelateSolver{}, and &rule.ColumnPruner{} (remove the duplicated
&ColumnPruner{} entry if present), so optRuleList contains the correct rule
instances without inline conflict markers.

In `@pkg/planner/core/plan_clone_utils.go`:
- Around line 122-150: The compiler error arises because cloneDataSource and
rule_correlate.go reference DataSource.AllPossibleAccessPaths which doesn't
exist on release-8.5; to fix it, add a new field AllPossibleAccessPaths
[]*util.AccessPath to the DataSource type (logical_datasource.go) and ensure any
constructors/initializers set it (e.g., copy from PossibleAccessPaths where
appropriate), then keep the clone logic in cloneDataSource and any uses in
rule_correlate.go that call ap.Clone() working against that new field; ensure
the new field is included in any schema/clone operations and initialized to a
nil or empty slice to preserve backward compatibility.
- Around line 215-261: cloneTopN currently calls
clone.SetSchema(tn.Schema().Clone()) while cloneSort (and LogicalSort) rely on
schema inheritance from BaseLogicalPlan; remove the inconsistent schema override
in cloneTopN to match cloneSort: delete the clone.SetSchema(...) call in
cloneTopN (function cloneTopN and type LogicalTopN) so schema is inherited from
its child via BaseLogicalPlan, or alternatively, if LogicalTopN truly requires
an explicit schema, replace the removal with a short comment in cloneTopN
explaining why LogicalTopN deviates and keep the SetSchema; ensure the change is
limited to the cloneTopN function.

In `@pkg/planner/core/rule_correlate.go`:
- Around line 168-178: The call to PredicatePushDown in rule_correlate.go is
using the old signature and expecting an error; update the call on sel to match
the current signature PredicatePushDown(predicates []expression.Expression, opt
*optimizetrace.LogicalOptimizeOp) which returns only (pushedPredicates, plan).
Replace "_, innerPlan, err := sel.PredicatePushDown(nil)" with a two-value
assignment that supplies both arguments (e.g., nil and the appropriate optimize
op variable or nil for the second arg) and remove the subsequent "if err != nil"
branch and its error handling; ensure you use the returned plan value
(innerPlan) and handle the pushedPredicates value as needed by the surrounding
logic.
- Around line 110-126: Replace the incorrect references and wrong
PredicatePushDown call in rule_correlate.go: use logicalop.LeftOuterSemiJoin and
logicalop.AntiLeftOuterSemiJoin (the join-type constants live in
pkg/planner/core/operator/logicalop) and update imports accordingly; for the
selection pushdown use the correct LogicalSelection.PredicatePushDown signature
— pass the second opt argument (e.g., nil) and capture the two return values it
actually returns (predicates and innerPlan) instead of expecting three returns
and handling an err; adjust the call site (sel.PredicatePushDown) to accept two
results and remove the erroneous err handling.

In `@pkg/planner/core/rule/logical_rules.go`:
- Around line 42-46: Remove the leftover merge markers (<<<<<<<, =======,
>>>>>>>) from the optimizer flags const block and merge the two branches by
ensuring both FlagOuterJoinToSemiJoin and FlagCorrelate appear as consecutive
entries in the const declaration (preserving surrounding constants and
ordering), e.g., include FlagOuterJoinToSemiJoin followed by FlagCorrelate in
the list used by the const block that defines the optimizer flags so the file is
valid Go without conflict markers.

In `@pkg/planner/optimize.go`:
- Around line 472-532: The file contains unresolved Git conflict markers
(<<<<<<<, =======, >>>>>>>) around the new alternative-round helpers and the
alternativeRounds variable; remove the conflict markers and keep the intended
merged code: the functions shouldTryNonDecorrelationRound,
shouldTryOrderAwareReorderRound, shouldTryCorrelateRound, the alternativeRound
type, the savedEnableCorrelateSubquery var, and the alternativeRounds array
(including the "non-decorrelate", "order-aware-reorder", and "correlate" entries
with their adjustFlag/enabled/setup/cleanup closures). Ensure there is only one
definition of alternativeRounds and no leftover conflict markers so the file
compiles.
- Around line 592-656: Resolve the leftover git conflict markers in
pkg/planner/optimize.go by merging the two branches: remove the <<<<<<<,
=======, >>>>>>> markers and produce a single control flow that preserves both
behaviors — keep the existing cascades early-return (check
sessVars.GetEnableCascadesPlanner() and call
cascades.DefaultOptimizer.FindBestPlan(sctx, logic)) before running the
alternativeRounds logic, and retain the pre-computation of enabledRounds,
restoreLogicalPlanBuildCtx, the per-round closure that calls
buildAndOptimizeLogicalPlanRound, and the nonLogical handling and logging;
ensure core.RecheckCTE(logic) remains prior to these checks and that variables
returned (p, names, cost/0, err) follow the same semantics as originally
intended.
- Around line 502-528: The package-level savedEnableCorrelateSubquery causes
races across concurrent sessions; remove the global savedEnableCorrelateSubquery
and instead store the saved value per-round using a closure or a new field on
alternativeRound so setup/cleanup operate on session-local state. Concretely, in
the correlate alternativeRound literal (the element with name "correlate")
change its setup to capture a local saved := sv.EnableCorrelateSubquery (or
assign saved into a new alternativeRound.saved bool field) before mutating
sv.EnableCorrelateSubquery, and make cleanup restore that captured/field value;
update the alternativeRound type if needed so setup/cleanup signatures still
reference the same session variable type (variable.SessionVars) and no
package-level state remains.

In `@pkg/sessionctx/stmtctx/stmtctx.go`:
- Around line 449-465: The file contains unresolved Git conflict markers (e.g.,
"<<<<<<<", "=======", ">>>>>>>") around the stmtctx struct fields, which
prevents compilation; remove the conflict markers and reconcile the fields so
the struct contains the intended boolean flags
(AlternativeLogicalPlanDecorrelatedApply,
AlternativeLogicalPlanSameOrderIndexJoin,
AlternativeLogicalPlanOrderAwareJoinReorder,
AlternativeLogicalPlanPreferCorrelate) with proper comments and punctuation,
preserving the correct ordering and formatting of the declarations (locate the
block around the stmtctx struct where those symbol names appear).
- Around line 592-666: Remove the Git conflict markers and merge the duplicated
planner-state block so the file contains a single, consistent implementation of
SaveLogicalPlanBuildState, RestoreLogicalPlanBuildState and the various
Mark/Reset methods; then add a concrete definition for the referenced type
LogicalPlanBuildState (a struct) that includes fields referenced in
Restore/Save: warnings, extraWarnings, tables, tableStats, lockTableIDs,
tblInfo2UnionScan, useDynamicPruneMode, viewDepth, colRefFromUpdatePlan
(matching the Copy/CopyFrom methods used), and the plan-cache booleans/values
(planCacheUseCache, planCacheType, planCacheUnqualified, planCacheForce,
planCacheAlwaysWarn); place the type definition near other stmtctx types in the
same file and ensure imports/usages (slices.Clone, maps.Clone,
contextutil.NewRangeFallbackHandler, PlanCacheTracker methods) remain correct so
the package compiles.

---

Minor comments:
In `@pkg/bindinfo/binding_auto_test.go`:
- Around line 341-345: The current branch in pkg/bindinfo/binding_auto_test.go
writes to output[i].Plan[rowID] when len(output[i].Plan) < rowID which can index
out of range; change the logic in that block (the code that assigns to
output[i].Plan based on rowID and plan) to use overwrite-else-append semantics:
if rowID is less than len(output[i].Plan) then overwrite output[i].Plan[rowID]
with plan, otherwise append plan to output[i].Plan. Ensure you update the branch
that currently checks the length and performs the unsafe write to use this
corrected condition around output[i].Plan and rowID.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: ce49602f-db52-4e22-b92a-6398d49933ef

📥 Commits

Reviewing files that changed from the base of the PR and between 651272b and 3f03dc3.

📒 Files selected for processing (19)

pkg/bindinfo/binding_auto_test.go
pkg/bindinfo/binding_plan_generation.go
pkg/planner/core/BUILD.bazel
pkg/planner/core/casetest/rule/BUILD.bazel
pkg/planner/core/casetest/rule/main_test.go
pkg/planner/core/casetest/rule/rule_correlate_test.go
pkg/planner/core/casetest/rule/testdata/correlate_suite_in.json
pkg/planner/core/casetest/rule/testdata/correlate_suite_out.json
pkg/planner/core/casetest/rule/testdata/correlate_suite_xut.json
pkg/planner/core/expression_rewriter.go
pkg/planner/core/operator/logicalop/logical_join.go
pkg/planner/core/optimizer.go
pkg/planner/core/optimizer_test.go
pkg/planner/core/plan_clone_utils.go
pkg/planner/core/rule/logical_rules.go
pkg/planner/core/rule_correlate.go
pkg/planner/optimize.go
pkg/sessionctx/stmtctx/stmtctx.go
pkg/sessionctx/variable/session.go

coderabbitai · 2026-05-29T07:42:01Z

+	if sel, isSel := stmt.(*ast.SelectStmt); isSel {
+		hasIndexHint := false
+		for _, indexHint := range state.indexHints {
+			if indexHint != nil {
+				hasIndexHint = true
+				break
+			}
+		}
+		if (state.leading2[0] != nil && state.leading2[1] != nil) || hasIndexHint {
+			originalHintsLen := len(sel.TableHints)
+			defer func() {
+				sel.TableHints = sel.TableHints[:originalHintsLen]
+			}()
+			if state.leading2[0] != nil && state.leading2[1] != nil {
+				leadingHint := &ast.TableOptimizerHint{
+					HintName: ast.NewCIStr(hint.HintLeading),
+					Tables: []ast.HintTable{
+						{
+							DBName:    ast.NewCIStr(state.leading2[0].schema),
+							TableName: ast.NewCIStr(state.leading2[0].HintName()),
+						},
+						{
+							DBName:    ast.NewCIStr(state.leading2[1].schema),
+							TableName: ast.NewCIStr(state.leading2[1].HintName()),
+						},
+					},
+				}
+				sel.TableHints = append(sel.TableHints, leadingHint)
+			}
+			for _, indexHint := range state.indexHints {
+				if indexHint == nil {
+					continue
+				}
+				hintNode := &ast.TableOptimizerHint{
+					HintName: ast.NewCIStr(hint.HintUseIndex),
+					Tables: []ast.HintTable{
+						{
+							DBName:    ast.NewCIStr(indexHint.table.schema),
+							TableName: ast.NewCIStr(indexHint.table.HintName()),
+						},
+					},
+					Indexes: []ast.CIStr{ast.NewCIStr(indexHint.index)},
+				}
+				sel.TableHints = append(sel.TableHints, hintNode)
+			}
+		}
+		if state.noDecorrelateQB.L != "" {
+			noDecorrelateHint := &ast.TableOptimizerHint{
+				HintName: ast.NewCIStr(hint.HintNoDecorrelate),
+				QBName:   state.noDecorrelateQB,
+			}
+			sel.TableHints = append(sel.TableHints, noDecorrelateHint)
+		}


⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Restore sel.TableHints after every explored state.

noDecorrelateQB is appended outside the rollback block, so a state that only sets NO_DECORRELATE() leaves that hint on the shared stmt and contaminates later BFS states. That makes later candidates observe hints from previous states instead of their own state only.

Suggested fix

if sel, isSel := stmt.(*ast.SelectStmt); isSel { + originalHintsLen := len(sel.TableHints) + defer func() { + sel.TableHints = sel.TableHints[:originalHintsLen] + }() + hasIndexHint := false for _, indexHint := range state.indexHints { if indexHint != nil { hasIndexHint = true break } } if (state.leading2[0] != nil && state.leading2[1] != nil) || hasIndexHint { - originalHintsLen := len(sel.TableHints) - defer func() { - sel.TableHints = sel.TableHints[:originalHintsLen] - }() if state.leading2[0] != nil && state.leading2[1] != nil { leadingHint := &ast.TableOptimizerHint{ HintName: ast.NewCIStr(hint.HintLeading),

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

if sel, isSel := stmt.(*ast.SelectStmt); isSel {

hasIndexHint := false

for _, indexHint := range state.indexHints {

if indexHint != nil {

hasIndexHint = true

break

}

}

if (state.leading2[0] != nil && state.leading2[1] != nil) || hasIndexHint {

originalHintsLen := len(sel.TableHints)

defer func() {

sel.TableHints = sel.TableHints[:originalHintsLen]

}()

if state.leading2[0] != nil && state.leading2[1] != nil {

leadingHint := &ast.TableOptimizerHint{

HintName: ast.NewCIStr(hint.HintLeading),

Tables: []ast.HintTable{

{

DBName: ast.NewCIStr(state.leading2[0].schema),

TableName: ast.NewCIStr(state.leading2[0].HintName()),

},

{

DBName: ast.NewCIStr(state.leading2[1].schema),

TableName: ast.NewCIStr(state.leading2[1].HintName()),

},

},

}

sel.TableHints = append(sel.TableHints, leadingHint)

}

for _, indexHint := range state.indexHints {

if indexHint == nil {

continue

}

hintNode := &ast.TableOptimizerHint{

HintName: ast.NewCIStr(hint.HintUseIndex),

Tables: []ast.HintTable{

{

DBName: ast.NewCIStr(indexHint.table.schema),

TableName: ast.NewCIStr(indexHint.table.HintName()),

},

},

Indexes: []ast.CIStr{ast.NewCIStr(indexHint.index)},

}

sel.TableHints = append(sel.TableHints, hintNode)

}

}

if state.noDecorrelateQB.L != "" {

noDecorrelateHint := &ast.TableOptimizerHint{

HintName: ast.NewCIStr(hint.HintNoDecorrelate),

QBName: state.noDecorrelateQB,

}

sel.TableHints = append(sel.TableHints, noDecorrelateHint)

}

if sel, isSel := stmt.(*ast.SelectStmt); isSel {

originalHintsLen := len(sel.TableHints)

defer func() {

sel.TableHints = sel.TableHints[:originalHintsLen]

}()

hasIndexHint := false

for _, indexHint := range state.indexHints {

if indexHint != nil {

hasIndexHint = true

break

}

}

if (state.leading2[0] != nil && state.leading2[1] != nil) || hasIndexHint {

if state.leading2[0] != nil && state.leading2[1] != nil {

leadingHint := &ast.TableOptimizerHint{

HintName: ast.NewCIStr(hint.HintLeading),

Tables: []ast.HintTable{

{

DBName: ast.NewCIStr(state.leading2[0].schema),

TableName: ast.NewCIStr(state.leading2[0].HintName()),

},

{

DBName: ast.NewCIStr(state.leading2[1].schema),

TableName: ast.NewCIStr(state.leading2[1].HintName()),

},

},

}

sel.TableHints = append(sel.TableHints, leadingHint)

}

for _, indexHint := range state.indexHints {

if indexHint == nil {

continue

}

hintNode := &ast.TableOptimizerHint{

HintName: ast.NewCIStr(hint.HintUseIndex),

Tables: []ast.HintTable{

{

DBName: ast.NewCIStr(indexHint.table.schema),

TableName: ast.NewCIStr(indexHint.table.HintName()),

},

},

Indexes: []ast.CIStr{ast.NewCIStr(indexHint.index)},

}

sel.TableHints = append(sel.TableHints, hintNode)

}

}

if state.noDecorrelateQB.L != "" {

noDecorrelateHint := &ast.TableOptimizerHint{

HintName: ast.NewCIStr(hint.HintNoDecorrelate),

QBName: state.noDecorrelateQB,

}

sel.TableHints = append(sel.TableHints, noDecorrelateHint)

}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/bindinfo/binding_plan_generation.go` around lines 462 - 514, The sel.TableHints slice is not restored after every explored state because the noDecorrelate hint (state.noDecorrelateQB) is appended outside the rollback/defer that restores sel.TableHints, causing hint leakage across BFS states; fix it in binding_plan_generation.go by moving the originalHintsLen capture and the defer restoration (that slices sel.TableHints back to originalHintsLen) to cover both the leading/index-hint block and the noDecorrelate append, or alternatively move the noDecorrelate hint append inside the existing defer scope so that any addition to sel.TableHints (including the TableOptimizerHint created for state.noDecorrelateQB) is rolled back after exploring the state; reference sel.TableHints, state.noDecorrelateQB, state.leading2 and state.indexHints to locate the changes.

coderabbitai · 2026-05-29T07:42:01Z

+// extractSelectTableNames returns the table names in the SELECT statement.
+func extractSelectTableNames(defaultSchema string, node ast.StmtNode) []*tableName {
+	selStmt, isSel := node.(*ast.SelectStmt)
+	if !isSel {
+		return nil // only support SELECT statement for now
+	}
+	extractor := &tableNameExtractor{
+		defaultSchema: defaultSchema,
+		tableNames:    make(map[string]*tableName),
+	}
+	selStmt.Accept(extractor)
+
+	names := make([]*tableName, 0, len(extractor.tableNames))
+	for _, name := range extractor.tableNames {
+		names = append(names, name)
+	}
+	sort.Slice(names, func(i, j int) bool {
+		return names[i].String() < names[j].String()
+	})
+	return names
+}


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Limit LEADING() candidates to the current query block.

extractSelectTableNames walks the full AST, so subquery tables are included in possibleLeading2. The generated LEADING() hint is then attached to the outer SELECT, where those inner tables are out of scope. For the subquery-heavy statements this generator targets, that burns BFS states on invalid/no-op hints and can crowd out useful candidates before hitting the 10k-state cap.

Suggested direction

-func extractSelectTableNames(defaultSchema string, node ast.StmtNode) []*tableName { - selStmt, isSel := node.(*ast.SelectStmt) - if !isSel { - return nil // only support SELECT statement for now - } - extractor := &tableNameExtractor{ - defaultSchema: defaultSchema, - tableNames: make(map[string]*tableName), - } - selStmt.Accept(extractor) - - names := make([]*tableName, 0, len(extractor.tableNames)) - for _, name := range extractor.tableNames { - names = append(names, name) - } - sort.Slice(names, func(i, j int) bool { - return names[i].String() < names[j].String() - }) - return names +func extractSelectTableNames(defaultSchema string, node ast.StmtNode) []*tableName { + selStmt, isSel := node.(*ast.SelectStmt) + if !isSel { + return nil + } + return extractSelectTableNamesWithAlias(defaultSchema, selStmt) }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/bindinfo/binding_plan_generation.go` around lines 755 - 775, extractSelectTableNames currently walks the entire SELECT AST (via selStmt.Accept(extractor)) and thus collects table names from nested subqueries; change the tableNameExtractor so it only collects tables from the current/top-level query block by not descending into nested SelectStmt nodes (e.g. add a depth/flag on tableNameExtractor or make its Visit/Enter methods return Skip for child traversal when encountering a *ast.SelectStmt that is not the root), so extractSelectTableNames (and its call selStmt.Accept(extractor)) will only produce table names in scope for the outer SELECT and exclude subquery tables from possibleLeading2.

coderabbitai · 2026-05-29T07:42:01Z

+<<<<<<< HEAD
        "rule_collect_plan_stats.go",
        "rule_column_pruning.go",
+=======
+        "rule_correlate.go",
+>>>>>>> 7357a2e2f90 (planner: correlate subquery rule (#66206))


⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Resolve the leftover merge conflict in the srcs list.

The <<<<<<< / ======= / >>>>>>> markers are still present here, so this Bazel file will not parse and the cherry-pick cannot build.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/planner/core/BUILD.bazel` around lines 56 - 61, Remove the leftover merge conflict markers (<<<<<<<, =======, >>>>>>>) in the BUILD.bazel srcs list and reconcile the entries so the list contains the intended source files (include "rule_collect_plan_stats.go", "rule_column_pruning.go" and "rule_correlate.go" as appropriate), ensuring the srcs array is a valid, comma-separated string list with no conflict markers remaining.

coderabbitai · 2026-05-29T07:42:01Z

+<<<<<<< HEAD
+=======
+        "rule_cdc_join_reorder_test.go",
+        "rule_common_handle_ordering_test.go",
+        "rule_correlate_test.go",
+>>>>>>> 7357a2e2f90 (planner: correlate subquery rule (#66206))


⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Resolve the merge conflict markers before merging.

This target still has unresolved <<<<<<< / ======= / >>>>>>> blocks, so Bazel cannot load it and the rule tests will not build.

Also applies to: 24-28

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/planner/core/casetest/rule/BUILD.bazel` around lines 8 - 13, The BUILD.bazel target contains unresolved Git merge conflict markers (<<<<<<<, =======, >>>>>>>) around the test file list; remove these markers and merge the two variants into a single, valid list (e.g., include "rule_cdc_join_reorder_test.go", "rule_common_handle_ordering_test.go", "rule_correlate_test.go" along with the other files) so the target is a proper Bazel srcs array and the rule tests (the files named in the conflict) can build.

coderabbitai · 2026-05-29T07:42:01Z

+<<<<<<< HEAD
 	testDataMap.LoadTestSuiteData("testdata", "outer2inner")
 	testDataMap.LoadTestSuiteData("testdata", "derive_topn_from_window")
 	testDataMap.LoadTestSuiteData("testdata", "join_reorder_suite")
 	testDataMap.LoadTestSuiteData("testdata", "predicate_pushdown_suite")
 	testDataMap.LoadTestSuiteData("testdata", "predicate_simplification")
+=======
+	testDataMap.LoadTestSuiteData("testdata", "outer2inner", true)
+	testDataMap.LoadTestSuiteData("testdata", "derive_topn_from_window", true)
+	testDataMap.LoadTestSuiteData("testdata", "join_reorder_suite", true)
+	testDataMap.LoadTestSuiteData("testdata", "predicate_pushdown_suite", true)
+	testDataMap.LoadTestSuiteData("testdata", "predicate_simplification", true)
+	testDataMap.LoadTestSuiteData("testdata", "outer_to_semi_join_suite", true)
+	testDataMap.LoadTestSuiteData("testdata", "correlate_suite", true)
+	testDataMap.LoadTestSuiteData("testdata", "cdc_join_reorder_suite", true)
+	testDataMap.LoadTestSuiteData("testdata", "order_aware_join_reorder_suite", true)
+
+>>>>>>> 7357a2e2f90 (planner: correlate subquery rule (#66206))


⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Resolve the merge conflict markers in TestMain and the suite helpers.

The <<<<<<< / ======= / >>>>>>> markers are still in this Go file, so the test package will not compile.

Also applies to: 87-105

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/planner/core/casetest/rule/main_test.go` around lines 32 - 49, Remove the leftover merge conflict markers and fix TestMain to a single coherent block: delete the <<<<<<<, =======, and >>>>>>> markers and keep the intended LoadTestSuiteData calls (use the variant with the boolean flag, e.g. LoadTestSuiteData("testdata", "outer2inner", true) and the additional suites like "outer_to_semi_join_suite", "correlate_suite", "cdc_join_reorder_suite", "order_aware_join_reorder_suite"); ensure the TestMain function and any suite helper loops use the correct LoadTestSuiteData signature (the version accepting the bool) and remove duplicated or conflicting lines so the file compiles.

ti-chi-bot · 2026-06-03T14:53:03Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from terry1purcell. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

ti-chi-bot · 2026-06-03T14:57:34Z

@ti-chi-bot: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
idc-jenkins-ci-tidb/check_dev_2	`7c206d7`	link	true	`/test check-dev2`
idc-jenkins-ci-tidb/unit-test	`7c206d7`	link	true	`/test unit-test`
idc-jenkins-ci-tidb/check_dev	`7c206d7`	link	true	`/test check-dev`
idc-jenkins-ci-tidb/mysql-test	`7c206d7`	link	true	`/test mysql-test`
idc-jenkins-ci-tidb/build	`7c206d7`	link	true	`/test build`

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

tiprow · 2026-06-03T15:31:44Z

@ti-chi-bot: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
fast_test_tiprow_for_release	`7c206d7`	link	true	`/test fast_test_tiprow_for_release`

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

This is an automated cherry-pick of pingcap#66206

3f03dc3

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>

ti-chi-bot mentioned this pull request May 29, 2026

planner: correlate subquery rule #66206

Merged

13 tasks

ti-chi-bot Bot added the do-not-merge/cherry-pick-not-approved label May 29, 2026

ti-chi-bot assigned terry1purcell May 29, 2026

coderabbitai Bot reviewed May 29, 2026

View reviewed changes

ti-chi-bot Bot added cherry-pick-approved Cherry pick PR approved by release team. and removed do-not-merge/cherry-pick-not-approved labels May 29, 2026

merge conflicts"

7c206d7

Conversation

ti-chi-bot commented May 29, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What problem does this PR solve?

What changed and how does it work?

Check List

Release note

Summary by CodeRabbit

Uh oh!

ti-chi-bot commented May 29, 2026

Uh oh!

ti-chi-bot Bot commented May 29, 2026

Uh oh!

coderabbitai Bot commented May 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

❌ Failed checks (1 warning, 2 inconclusive)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 29, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 29, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 29, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 29, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 29, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ti-chi-bot Bot commented Jun 3, 2026

Uh oh!

ti-chi-bot Bot commented Jun 3, 2026

Uh oh!

tiprow Bot commented Jun 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ti-chi-bot commented May 29, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 29, 2026 •

edited

Loading