planner: correlate subquery rule by terry1purcell · Pull Request #66206 · pingcap/tidb

terry1purcell · 2026-02-11T05:07:37Z

What problem does this PR solve?

Issue Number: close #66320

Problem Summary:

What changed and how does it work?

Check List

Tests

Unit test
Integration test
Manual test (add detailed scripts or steps below)
No need to test
- I checked and no code files have been changed.

Side effects

Performance regression: Consumes more CPU
Performance regression: Consumes more Memory
Breaking backward compatibility

Documentation

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

Summary by CodeRabbit

New Features
- Introduces an alternative optimization round that can convert eligible semi-joins into correlated apply-style plans, improving planner choice between join vs apply and preserving correct NULL semantics for correlated subqueries.
Tests
- Adds comprehensive test suites and fixtures covering correlated subquery behaviors (IN/EXISTS/NOT IN), null semantics, plan shapes (EXPLAIN), parallel apply, and cost-factor-driven plan selection across planner modes.
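
For readers unfamiliar with the NULL semantics mentioned above, the following is a minimal sketch of SQL three-valued logic for `x NOT IN (subquery)`. It is an illustration only and is not taken from the PR; the `tri` type and the `notIn` and `intp` helpers are hypothetical names, and a nil pointer stands in for SQL NULL.

```go
package main

import "fmt"

// tri models SQL three-valued logic (hypothetical helper type).
type tri int

const (
	False tri = iota
	True
	Unknown
)

func (t tri) String() string {
	return [...]string{"FALSE", "TRUE", "UNKNOWN"}[t]
}

// notIn evaluates `x NOT IN (set)`; a nil pointer stands for SQL NULL.
func notIn(x *int, set []*int) tri {
	if len(set) == 0 {
		return True // empty subquery result: NOT IN is TRUE even when x is NULL
	}
	if x == nil {
		return Unknown
	}
	sawNull := false
	for _, v := range set {
		if v == nil {
			sawNull = true
			continue
		}
		if *v == *x {
			return False
		}
	}
	if sawNull {
		return Unknown // no match found, but a NULL might have matched
	}
	return True
}

func intp(v int) *int { return &v }

func main() {
	fmt.Println(notIn(intp(1), []*int{intp(2), nil})) // UNKNOWN
	fmt.Println(notIn(intp(1), []*int{intp(2)}))      // TRUE
	fmt.Println(notIn(nil, nil))                      // TRUE
}
```

A WHERE clause keeps a row only when the predicate is TRUE, so the UNKNOWN case filters the row out. Any rewrite between a semi-join and an Apply has to preserve that distinction.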

pantheon-ai · 2026-02-11T05:07:48Z

General (non-inline) review: systemic optimizer issues surfaced by the new correlate rule

This PR adds a new logical rule CorrelateSolver (semi-join -> correlated Apply) gated by tidb_opt_enable_correlate_subquery. While reviewing the optimizer plumbing around it, I verified two pre-existing issues in pkg/planner/core/optimizer.go that are not touched by this PR diff but become more important with another late-stage rewrite rule.

P1: Predicate push down cannot reliably participate in follow-up or interaction optimizations

Registration and order: pkg/planner/core/optimizer.go:97 (&PPDSolver{} runs before the new correlate rewrite)
Change signal bug: pkg/planner/core/rule_predicate_push_down.go:43 always returns planChanged=false even when PredicatePushDown mutates or rebuilds the plan
Why it matters here: CorrelateSolver introduces new inner Selection predicates after PPD has already run, but the framework has no reliable way to trigger a second PPD (or any interaction) pass based on PPD changes. This risks leaving the inner side with avoidable full scans or missing access condition derivation when the correlate rule is enabled.

Recommendation:

Make PPDSolver.Optimize return an accurate planChanged (or conservatively true), or extend LogicalPlan.PredicatePushDown to return a change flag.
Consider a follow-up pass when FlagCorrelate is enabled (for example rerun PPD on the rewritten subtree) so the new correlated predicates can be used for access path selection.
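
A minimal sketch of the first recommendation, assuming a toy plan tree rather than TiDB's real types: the `plan` struct and `pushDown` function below are hypothetical and do not mirror `PPDSolver.Optimize` or `LogicalPlan.PredicatePushDown`. The point is only that the push-down returns a change flag the framework could act on.

```go
package main

import "fmt"

// plan is a hypothetical stand-in for a logical plan node; it is not TiDB's LogicalPlan.
type plan struct {
	name  string
	conds []string // predicates attached to this node
	child *plan
}

// pushDown moves the predicates of every Selection node into its child,
// drops the Selection, and reports whether the tree was mutated.
func pushDown(p *plan) (*plan, bool) {
	if p == nil {
		return nil, false
	}
	if p.name == "Selection" && p.child != nil {
		p.child.conds = append(p.child.conds, p.conds...)
		rest, _ := pushDown(p.child)
		return rest, true // the Selection was removed, so the plan changed
	}
	newChild, changed := pushDown(p.child)
	p.child = newChild
	return p, changed
}

func main() {
	root := &plan{name: "Selection", conds: []string{"a > 1"}, child: &plan{name: "DataSource"}}
	root, changed := pushDown(root)
	fmt.Println(root.name, root.conds, changed)

	// A second pass finds nothing left to do and says so.
	_, changedAgain := pushDown(root)
	fmt.Println(changedAgain)
}
```

With an accurate flag, a later rule that introduces new Selection predicates could request another push-down pass only when one is actually needed.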

P2: FlagPruneColumnsAgain derivation is fragile and computed before conditional flags are added

pkg/planner/core/optimizer.go:347 computes FlagPruneColumnsAgain via flag-rule.FlagPruneColumns > rule.FlagPruneColumns early in adjustOptimizationFlags.
Why it matters here: this PR adds another conditional flag (FlagCorrelate) inside adjustOptimizationFlags. Any future conditional flag added after the prune-again check can make the final flag set inconsistent with the intended "something after prune columns" rule, and the arithmetic check is hard to reason about when maintaining flag order.

Recommendation:

Recompute the prune-again condition at the end of adjustOptimizationFlags using explicit bit operations, after all flag |= ... and flag &= ... mutations.
Add a small unit test that asserts len(optRuleList) matches the number and order of rule.Flag* bits to prevent silent misalignment when adding new rules.
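
Both recommendations can be sketched as follows, under stated assumptions: the flag constants, `ruleNames`, `adjustFlags`, and `flagCount` are hypothetical and deliberately smaller than TiDB's real `rule.Flag*` set and `optRuleList`. The sketch derives the prune-again bit last, with explicit bit operations, and exposes a count that a unit test can compare against the rule list length.

```go
package main

import (
	"fmt"
	"math/bits"
)

// Hypothetical flag bits; TiDB's real rule.Flag* constants differ in name, number, and order.
const (
	flagGcSubstitute uint64 = 1 << iota
	flagPruneColumns
	flagPredicatePushDown
	flagCorrelate
	flagPruneColumnsAgain
	flagEnd // sentinel: one past the last real flag
)

// ruleNames stands in for optRuleList: one entry per flag bit, in the same order.
var ruleNames = []string{"gc_substitute", "prune_columns", "predicate_push_down", "correlate", "prune_columns_again"}

// flagCount returns how many real flag bits are defined.
func flagCount() int { return bits.TrailingZeros64(flagEnd) }

// adjustFlags applies conditional flags first and derives prune-again at the very end.
func adjustFlags(flag uint64, enableCorrelate bool) uint64 {
	if enableCorrelate {
		flag |= flagCorrelate
	}
	flag &^= flagPruneColumnsAgain
	upToPrune := flagPruneColumns<<1 - 1 // mask of flagPruneColumns and every lower bit
	later := flag &^ upToPrune           // rules that run after column pruning
	if flag&flagPruneColumns != 0 && later != 0 {
		flag |= flagPruneColumnsAgain
	}
	return flag
}

func main() {
	fmt.Printf("%05b\n", adjustFlags(flagPruneColumns, false))
	fmt.Printf("%05b\n", adjustFlags(flagPruneColumns, true))
	fmt.Println(len(ruleNames) == flagCount())
}
```

Because the prune-again bit is cleared and recomputed after every other mutation, adding another conditional flag earlier in the function cannot leave it stale.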

tiprow · 2026-02-11T05:07:54Z

Hi @terry1purcell. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo, meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

codecov · 2026-02-11T14:56:25Z

Codecov Report

❌ Patch coverage is 79.33884% with 100 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.1728%. Comparing base (ca3c902) to head (be086e8).
⚠️ Report is 2 commits behind head on master.

Additional details and impacted files

@@               Coverage Diff                @@
##             master     #66206        +/-   ##
================================================
+ Coverage   77.7850%   78.1728%   +0.3877%     
================================================
  Files          1983       1976         -7     
  Lines        549122     550035       +913     
================================================
+ Hits         427135     429978      +2843     
+ Misses       121067     119627      -1440     
+ Partials        920        430       -490

Flag	Coverage Δ
integration	`43.6751% <6.6666%> (+3.8779%)`	⬆️
unit	`76.7157% <79.3388%> (+0.3746%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components	Coverage Δ
dumpling	`66.1193% <ø> (+4.6127%)`	⬆️
parser	`∅ <ø> (∅)`
br	`50.0681% <ø> (-13.0316%)`	⬇️

terry1purcell · 2026-02-11T15:42:23Z

/ok-to-test

Copilot

Pull request overview

Adds an (optional) logical-optimizer rule to “re-correlate” certain semi-joins back into correlated Apply, gated by a new session/global system variable. This aims to enable correlated index lookups when they’re expected to outperform a hash semi-join.

Changes:

Introduce CorrelateSolver logical rule and wire it into the logical optimizer behind a new FlagCorrelate.
Add sysvar + session var plumbing for tidb_opt_enable_correlate_subquery.
Add casetests for the rule and a unit test to ensure optRuleList and Flag* alignment.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.

Show a summary per file

| File | Description |
| --- | --- |
| pkg/sessionctx/variable/sysvar.go | Registers `tidb_opt_enable_correlate_subquery` sysvar and applies it to `SessionVars`. |
| pkg/sessionctx/variable/session.go | Adds `EnableCorrelateSubquery` to `SessionVars` and initializes it. |
| pkg/sessionctx/vardef/tidb_vars.go | Defines sysvar name constant and default value. |
| pkg/planner/core/rule_correlate.go | Implements the new semi-join → correlated `Apply` transformation rule. |
| pkg/planner/core/rule/logical_rules.go | Adds `FlagCorrelate` to the logical rule flag bitset. |
| pkg/planner/core/optimizer.go | Appends `CorrelateSolver` to `optRuleList` and conditionally enables `FlagCorrelate`; adjusts `FlagPruneColumnsAgain` computation. |
| pkg/planner/core/optimizer_test.go | Adds a test to detect misalignment between `optRuleList` and `Flag*` constants. |
| pkg/planner/core/casetest/rule/testdata/correlate_suite_in.json | Input SQLs for correlate rule casetests. |
| pkg/planner/core/casetest/rule/testdata/correlate_suite_out.json | Expected plans/results (non-xunit) for casetests. |
| pkg/planner/core/casetest/rule/testdata/correlate_suite_xut.json | Expected plans/results (xunit) for casetests. |
| pkg/planner/core/casetest/rule/rule_correlate_test.go | Adds correlate rule casetest runner and enables the sysvar during tests. |
| pkg/planner/core/casetest/rule/main_test.go | Loads the new correlate test suite data. |
| pkg/planner/core/casetest/rule/BUILD.bazel | Includes the new test file and updates shard count. |
| pkg/planner/core/BUILD.bazel | Adds `rule_correlate.go` to the core library build. |

terry1purcell · 2026-02-11T18:48:15Z

@pantheon-bot please review

pantheon-ai · 2026-02-11T18:48:22Z

Re-review note: this PR has 3 new commits since my last pass. I rechecked the correlate-subquery rewrite path (gated by tidb_opt_enable_correlate_subquery / EnableCorrelateSubquery). The issues below are verified on the current diff.

Severity note: all findings are P2 mainly because the feature is default OFF. For the executor-side memory risk, the statement memory tracker (default 1GB, action CANCEL) provides a safety net, but the amplification can still be severe for sessions that enable the flag.

P2-1: OOM/DoS risk - Apply materializes full inner result per outer row

What happens
CorrelateSolver rewrites eligible semi-joins to LogicalApply and does not introduce a Limit or other short-circuit for EXISTS/SEMI semantics. At execution time, NestedLoopApplyExec can read and materialize the entire inner side for each outer row (or outer cache key) into a chunk.List.

Evidence

pkg/planner/core/rule_correlate.go:116 (builds LogicalApply with the inner plan; no Limit/short-circuit introduced)
pkg/executor/join/hash_join_v1.go:1295 (NestedLoopApplyExec.fetchAllInners loops exec.Next and InnerList.AppendRow for all selected inner rows)

Why it matters
For non-selective correlation keys, duplicate outer keys, or missing/ineffective indexes on the inner side, this becomes O(outer_rows * inner_rows) work and can amplify memory usage per outer row/key. Even with the statement memory tracker, this is a CPU and memory DoS vector for workloads that enable the flag.

Recommendations

For EXISTS/SEMI semantics, stop reading inner after the first match and avoid materializing all inner rows.
Consider inserting/propagating a logical Limit 1 (or equivalent physical short-circuit) for EXISTS-style inner plans.
Add guardrails: avoid Apply when the chosen inner access path is a full scan or when the estimated number of inner rows per key is large; keep or fall back to the set-based semi-join.
Add stress/regression tests that cover duplicated outer keys and non-selective correlations.
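
The first recommendation (stop reading the inner side after the first match) can be sketched as below. This is a toy model, not `NestedLoopApplyExec`: `innerIter` and `existsMatch` are hypothetical names, and the inner side is a plain slice rather than an executor returning chunks.

```go
package main

import "fmt"

// innerIter is a hypothetical row source for the inner side of an Apply.
type innerIter struct {
	rows []int
	pos  int
	read int // number of rows actually pulled from the source
}

// next returns the next inner row, or false when the source is exhausted.
func (it *innerIter) next() (int, bool) {
	if it.pos >= len(it.rows) {
		return 0, false
	}
	v := it.rows[it.pos]
	it.pos++
	it.read++
	return v, true
}

// existsMatch implements semi-join semantics for one outer row: it stops at the
// first matching inner row and never buffers the inner result.
func existsMatch(it *innerIter, outerKey int) bool {
	for {
		v, ok := it.next()
		if !ok {
			return false
		}
		if v == outerKey {
			return true
		}
	}
}

func main() {
	it := &innerIter{rows: []int{5, 7, 7, 9, 7, 7}}
	fmt.Println(existsMatch(it, 7), it.read) // true 2
}
```

Only two of the six inner rows are read here, and none are stored, which is the behavior the recommendation asks for in the EXISTS case.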

P2-2: Unconditional rewrite without cost or access-path guard

What happens
Once the shape checks pass, the rule always returns the rewritten LogicalApply, without considering cost, selectivity, or whether the inner subtree has a viable access path driven by the correlated predicates.

Evidence

pkg/planner/core/rule_correlate.go:120 (returns ap, true unconditionally after checks)

Why it matters
This can regress decorrelatable EXISTS/NOT EXISTS from a set-based hash semi-join to per-row Apply, especially when the inner side cannot use the correlated predicate to pick an index/range access path. That is a large performance risk for any session that enables the knob.

Recommendations

Add a cost-based decision: only rewrite when estimated outer rows are small and/or the inner plan has an index/range access using the correlated columns.
If stats are missing/unknown, be conservative and keep the original semi-join.
Consider keeping both alternatives through logical/physical optimization and selecting by cost.
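
A conservative guard along these lines might look like the sketch below. Everything in it is an assumption made for illustration: the `estimate` struct, the `shouldRewriteToApply` function, and both thresholds are hypothetical and do not describe how the PR or TiDB's cost model decides.

```go
package main

import "fmt"

// estimate holds hypothetical planner estimates for one semi-join candidate.
type estimate struct {
	statsKnown       bool    // false when statistics are missing or pseudo
	outerRows        float64 // estimated rows on the outer side
	innerRowsPerKey  float64 // estimated inner rows per correlated key
	innerIndexOnCorr bool    // inner side can use an index on the correlated columns
}

// Illustrative thresholds only; real values would come from the cost model.
const (
	maxOuterRows       = 10000.0
	maxInnerRowsPerKey = 100.0
)

// shouldRewriteToApply keeps the semi-join unless the Apply plan looks clearly cheap.
func shouldRewriteToApply(e estimate) bool {
	if !e.statsKnown {
		return false // be conservative without statistics
	}
	if !e.innerIndexOnCorr {
		return false // per-row full scans of the inner side are the worst case
	}
	return e.outerRows <= maxOuterRows && e.innerRowsPerKey <= maxInnerRowsPerKey
}

func main() {
	good := estimate{statsKnown: true, outerRows: 50, innerRowsPerKey: 2, innerIndexOnCorr: true}
	noIndex := estimate{statsKnown: true, outerRows: 50, innerRowsPerKey: 2}
	fmt.Println(shouldRewriteToApply(good), shouldRewriteToApply(noIndex)) // true false
}
```

The third recommendation, keeping both alternatives and choosing by cost, would replace this kind of boolean guard with a comparison of the two physical plan costs.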

P2-3: Join hint / preference metadata dropped during transform

What happens
The transform constructs a fresh LogicalApply and copies only JoinType/CorCols/children/schema/output names. Join hint state and join-preference metadata from the original semi-join are not propagated.

Evidence

pkg/planner/core/rule_correlate.go:113 (new LogicalApply{}.Init(...) without copying hint fields)

Why it matters
When the correlate-subquery rewrite is enabled, user-provided join hints or join ordering preferences can be silently ignored, leading to surprising plans and making performance tuning harder.

Recommendations

Copy join hint related fields from the original join node to LogicalApply (for example HintInfo, PreferJoinType, PreferJoinOrder), or ensure they are stored in shared state that survives transforms.
Add a planner test asserting hints remain effective after the rewrite.
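
As a rough illustration of the first recommendation, the sketch below carries the hint state across the transform as one value. The `joinHints`, `semiJoin`, and `apply` types and the `rewriteToApply` function are hypothetical; the real `LogicalJoin` and `LogicalApply` structs have different fields and are built through `Init(...)`.

```go
package main

import "fmt"

// joinHints groups the hint state that should survive the rewrite.
// The field names echo those mentioned in the review but are simplified stand-ins.
type joinHints struct {
	hintInfo        string
	preferJoinType  uint
	preferJoinOrder bool
}

// semiJoin and apply are toy versions of the source and target operators.
type semiJoin struct {
	hints        joinHints
	outer, inner string
}

type apply struct {
	hints        joinHints
	outer, inner string
	corCols      []string
}

// rewriteToApply builds the Apply node and copies the hint state in one step,
// so a newly added hint field cannot be forgotten here.
func rewriteToApply(j semiJoin, corCols []string) apply {
	return apply{hints: j.hints, outer: j.outer, inner: j.inner, corCols: corCols}
}

func main() {
	j := semiJoin{
		hints: joinHints{hintInfo: "HASH_JOIN(t2)", preferJoinType: 4, preferJoinOrder: true},
		outer: "t1",
		inner: "t2",
	}
	ap := rewriteToApply(j, []string{"t1.a"})
	fmt.Println(ap.hints == j.hints) // true
}
```

A planner test for the second recommendation would assert the same equality on the real operators, or check the plan produced by EXPLAIN with a hint present.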

P2-4: Missing relevant optimizer variable tracking

What happens
The session variable tidb_opt_enable_correlate_subquery / EnableCorrelateSubquery affects optimization flags, but it is not recorded via RecordRelevantOptVar.

Evidence

pkg/planner/core/optimizer.go:363 (reads EnableCorrelateSubquery to set rule.FlagCorrelate, but does not record it)

Why it matters
Plan replayer dumps all sysvars, so the debuggability impact is reduced, but the relevant optimizer vars reporting becomes incomplete, which can hinder triage when comparing plans.

Recommendations

Add RecordRelevantOptVar for this variable when it influences optimization flags, consistent with other optimizer knobs.
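
A minimal sketch of that recommendation, assuming a simplified statement context: `stmtCtx`, `recordRelevantOptVar`, `sessionVars`, and `adjustFlags` below are hypothetical stand-ins and do not reproduce the signature of TiDB's `RecordRelevantOptVar`. The sketch records the variable whenever it is consulted, which is itself an assumption about the intended convention.

```go
package main

import "fmt"

// stmtCtx is a hypothetical statement context that tracks which optimizer
// variables were consulted while planning the current statement.
type stmtCtx struct {
	relevantOptVars map[string]struct{}
}

// recordRelevantOptVar notes that a variable influenced (or could have influenced) the plan.
func (sc *stmtCtx) recordRelevantOptVar(name string) {
	if sc.relevantOptVars == nil {
		sc.relevantOptVars = make(map[string]struct{})
	}
	sc.relevantOptVars[name] = struct{}{}
}

// sessionVars is a toy version of the session variable holder.
type sessionVars struct {
	enableCorrelateSubquery bool
	stmt                    *stmtCtx
}

// flagCorrelate uses an arbitrary bit for illustration.
const flagCorrelate uint64 = 1 << 5

// adjustFlags records the variable at the point where it is read, then applies it.
func adjustFlags(flag uint64, vars *sessionVars) uint64 {
	vars.stmt.recordRelevantOptVar("tidb_opt_enable_correlate_subquery")
	if vars.enableCorrelateSubquery {
		flag |= flagCorrelate
	}
	return flag
}

func main() {
	vars := &sessionVars{enableCorrelateSubquery: true, stmt: &stmtCtx{}}
	flag := adjustFlags(0, vars)
	_, recorded := vars.stmt.relevantOptVars["tidb_opt_enable_correlate_subquery"]
	fmt.Println(flag&flagCorrelate != 0, recorded) // true true
}
```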

If helpful, I can sketch a minimal fix set: (a) propagate hint fields in the rewrite, (b) add RecordRelevantOptVar, and (c) add a conservative cost/access-path check plus executor early-exit for EXISTS semantics to reduce worst-case amplification.

Copilot

Pull request overview

Copilot reviewed 22 out of 22 changed files in this pull request and generated no new comments.

yudongusa

Please open a document PR on this

ti-chi-bot · 2026-04-21T04:13:21Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: qw4990, yudongusa

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [qw4990]
~~pkg/bindinfo/OWNERS~~ [qw4990]
~~pkg/planner/OWNERS~~ [qw4990]
~~pkg/sessionctx/variable/OWNERS~~ [yudongusa]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

ti-chi-bot · 2026-04-21T04:13:26Z

[LGTM Timeline notifier]

Timeline:

2026-04-20 08:41:58.826073601 +0000 UTC m=+1982524.031433658: ☑️ agreed by qw4990.
2026-04-21 04:13:24.972335572 +0000 UTC m=+2052810.177695629: ☑️ agreed by yudongusa.

terry1purcell · 2026-04-21T05:11:26Z

/retest-required

terry1purcell · 2026-04-21T05:17:20Z

/retest-required

terry1purcell · 2026-04-21T05:51:36Z

/retest-required

terry1purcell · 2026-04-21T06:28:49Z

/retest-required

terry1purcell · 2026-04-21T08:12:39Z

/retest-required

terry1purcell · 2026-04-21T10:24:36Z

/retest-required

terry1purcell · 2026-04-21T10:42:08Z

/retest-required

ti-chi-bot · 2026-05-29T07:26:20Z

In response to a cherrypick label: new pull request created to branch release-8.5: #68752.
But this PR has conflicts, please resolve them!

planner: correlate subquery rule (wip)

da93b6f

ti-chi-bot Bot added do-not-merge/needs-linked-issue release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. sig/planner SIG: Planner labels Feb 11, 2026

review comments1

171333a

ti-chi-bot Bot added the ok-to-test Indicates a PR is ready to be tested. label Feb 11, 2026

terry1purcell requested a review from Copilot February 11, 2026 15:42

Copilot started reviewing on behalf of terry1purcell February 11, 2026 15:43

Copilot AI reviewed Feb 11, 2026

Comment thread pkg/planner/core/rule_correlate.go

Comment thread pkg/sessionctx/variable/sysvar.go Outdated

terry1purcell added 2 commits February 11, 2026 07:59

add testcases1

fca2997

add testcases2

1d7e38a

ti-chi-bot Bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Feb 11, 2026

terry1purcell and others added 4 commits February 19, 2026 09:01

Merge branch 'master' into correlate

0f6c877

refactor for order

62df8f8

refactor for cost based evaluation

5ba3cca

refactor for 2nd customer example

a0631a5

ti-chi-bot Bot added do-not-merge/needs-triage-completed and removed do-not-merge/needs-linked-issue labels Feb 21, 2026

terry1purcell changed the title ~~planner: correlate subquery rule (wip)~~ planner: correlate subquery rule Feb 21, 2026

terry1purcell requested a review from Copilot February 21, 2026 00:21

Copilot started reviewing on behalf of terry1purcell February 21, 2026 00:21

Copilot AI reviewed Feb 21, 2026

terry1purcell added 2 commits April 21, 2026 08:54

review comments2

dd0f84f

review comments3

448665b

yudongusa approved these changes Apr 21, 2026

ti-chi-bot Bot added approved lgtm and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Apr 21, 2026

terry1purcell and others added 2 commits April 21, 2026 14:23

Merge branch 'master' into correlate

34bbc9e

update bazel

be086e8

ti-chi-bot Bot merged commit 7357a2e into pingcap:master Apr 21, 2026
35 checks passed

terry1purcell deleted the correlate branch April 22, 2026 00:03

premal pushed a commit to premal/tidb that referenced this pull request Apr 30, 2026

planner: correlate subquery rule (pingcap#66206)

16efd21

close pingcap#66320

AilinKid pushed a commit to AilinKid/tidb that referenced this pull request May 14, 2026

planner: correlate subquery rule (pingcap#66206)

3f32965

close pingcap#66320

AilinKid pushed a commit to AilinKid/tidb that referenced this pull request May 14, 2026

planner: correlate subquery rule (pingcap#66206)

9275ad1

close pingcap#66320

AilinKid pushed a commit to AilinKid/tidb that referenced this pull request May 15, 2026

planner: correlate subquery rule (pingcap#66206)

940fbbd

close pingcap#66320

AilinKid mentioned this pull request May 15, 2026

planner: cherry-pick multi alternative framework and fts like rewrite back to fts branch | tidb-test=42c2474b6e2e1ef3430d07a1743ca7e17fd97acc tiflash=feature-fts tikv=feature-fts #68383

Merged

13 tasks

AilinKid pushed a commit to AilinKid/tidb that referenced this pull request May 15, 2026

planner: correlate subquery rule (pingcap#66206)

8301e22

close pingcap#66320

ti-chi-bot Bot added the needs-cherry-pick-release-8.5 Should cherry pick this PR to release-8.5 branch. label May 29, 2026

ti-chi-bot mentioned this pull request May 29, 2026

planner: correlate subquery rule (#66206) #68752

Open

13 tasks

AilinKid mentioned this pull request May 30, 2026

planner: backport alternative logical plan improvements to release-8.5.6 | tidb-test=pr/2749 pd=v8.5.6 tikv=v8.5.6 tiflash=v8.5.6 #68796

Merged

yongman pushed a commit to yongman/tidb that referenced this pull request Jun 3, 2026

planner: correlate subquery rule (pingcap#66206)

a12d623

close pingcap#66320
