Skip to content

planner: add opt-in greedy join reorder seed-by-cost switch#67784

Closed
Reminiscent wants to merge 1 commit into
pingcap:masterfrom
Reminiscent:greedy
Closed

planner: add opt-in greedy join reorder seed-by-cost switch#67784
Reminiscent wants to merge 1 commit into
pingcap:masterfrom
Reminiscent:greedy

Conversation

@Reminiscent
Copy link
Copy Markdown
Contributor

@Reminiscent Reminiscent commented Apr 15, 2026

What problem does this PR solve?

Issue Number: close #67774

Problem Summary:

Greedy join reorder currently seeds each connected join component from the smallest base relation.

That heuristic is not always good enough for outer-join groups. In a query like:

  select * from
    (select t1.id, t1.col1 from t1 left join t2 on t1.col1 = t2.col2) tt
  join t3 on t3.col1 = tt.col1;

when t2 < t3 << t1, the greedy path may start from t2, which effectively picks t1 left join t2 as the first join even though the outer join cannot reduce below t1's row
count.

The better first choice is to join t3 with t1 first and then left-join t2.

What changed and how does it work?

This PR adds an opt-in system variable:

  • tidb_opt_greedy_join_seed_by_cost

When this variable is ON, greedy join reorder no longer always starts from the smallest base node. Instead, it enumerates valid first join pairs inside each connected
component and uses the pair with the smallest cumulative join cost as the seed.

The change is applied to both:

  • the default advanced join reorder path in pkg/planner/core/joinorder
  • the legacy greedy fallback path in pkg/planner/core

A new integration regression test is added to show:

  • OFF: keep the current plan shape for compatibility
  • ON: choose the cheaper t3 + t1 seed first
  • legacy fallback (tidb_opt_enable_advanced_join_reorder = 0) matches the same behavior

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

Summary by CodeRabbit

  • New Features

    • Added a boolean system/session variable to optionally start greedy join reordering by estimated cost, which can change chosen join order and access methods; toggleable per-session or globally.
  • Behavior Changes

    • Reseeding of greedy join selection now prefers cost-based seeds within connected components and respects cartesian/penalty rules, affecting final join trees.
  • Bug Fixes

    • Prevented planner-side mutations of input plans during speculative join checks.
  • Tests

    • Added unit and integration tests covering the new behavior, hints, and regression scenarios.

@ti-chi-bot ti-chi-bot Bot added release-note-none Denotes a PR that doesn't merit a release note. do-not-merge/needs-triage-completed labels Apr 15, 2026
@pantheon-ai
Copy link
Copy Markdown

pantheon-ai Bot commented Apr 15, 2026

@Reminiscent I've received your pull request and will start the review. I'll conduct a thorough review covering code quality, potential issues, and implementation details.

⏳ This process typically takes 10-30 minutes depending on the complexity of the changes.

ℹ️ Learn more details on Pantheon AI.

@ti-chi-bot ti-chi-bot Bot added sig/planner SIG: Planner size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Apr 15, 2026
@ti-chi-bot
Copy link
Copy Markdown

ti-chi-bot Bot commented Apr 15, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign terry1purcell for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@tiprow
Copy link
Copy Markdown

tiprow Bot commented Apr 15, 2026

Hi @Reminiscent. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 15, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds an opt-in cost-based reseeding path to the greedy join reorder: a new session/sysvar gates selecting the cheapest leaf–leaf join within a connected component as the initial seed. Also prevents speculative connectivity checks from mutating leaf plans and adds tests and plumbing for the new behavior.

Changes

Cohort / File(s) Summary
Greedy join-order algorithm
pkg/planner/core/joinorder/join_order.go
Add startByCost and lockedSeedNode parameters to greedyConnectJoinNodes; conditional reseeding for single-leaf starts; new seedGreedyJoinComponentByCost and collectGreedySeedComponentIndices.
Greedy join reordering solver
pkg/planner/core/rule_join_reorder_greedy.go, pkg/planner/core/rule_join_reorder.go
Introduce jrNode.leafCnt; wire cost-based seeding via seedConnectedJoinTreeByCost; add collectConnectedSeedComponentIndices, hasNonCartesianSeedConnection, and extract betterLegacyGreedyCandidate; update leaf count propagation.
Conflict detection / join construction
pkg/planner/core/joinorder/conflict_detector.go
Add (*CheckConnectionResult).HasJoinCondition(); use InjectExprAvoidingMutation and avoid writing aligned plans back into original node*.p to prevent side effects during speculative join construction.
Projection helper
pkg/planner/core/operator/logicalop/logical_projection.go
Add InjectExprAvoidingMutation to inject an expression into a LogicalProjection without mutating the original projection instance.
Session variable plumbing
pkg/sessionctx/vardef/tidb_vars.go, pkg/sessionctx/variable/session.go, pkg/sessionctx/variable/sysvar.go, pkg/sessionctx/variable/setvar_affect.go
Introduce tidb_opt_greedy_join_reorder_start_by_cost (default OFF); add field to SessionVars, register sysvar and make it hint-updatable.
Tests — unit & integration
pkg/sessionctx/variable/varsutil_test.go, pkg/planner/core/join_reorder_side_effect_test.go, pkg/planner/core/joinorder/join_order_side_effect_test.go, tests/integrationtest/t/planner/core/join_reorder2.test, tests/integrationtest/r/planner/core/join_reorder2.result
Add sysvar unit test; add side‑effect/regression tests ensuring speculative connectivity and reseeding do not mutate leaf plans; extend integration tests covering start-by-cost, LEADING interactions, component boundaries, cartesian-penalty cases and expected plan outputs.
Build/test targets
pkg/planner/core/BUILD.bazel, pkg/planner/core/joinorder/BUILD.bazel
Add new test sources and update test dependencies for added tests.
Minor solver change
pkg/planner/core/rule_join_reorder.go
Initialize jrNode.leafCnt = 1 for join-group leaves in generateJoinOrderNode.

Sequence Diagram(s)

sequenceDiagram
    participant Session as SessionVars
    participant Reorder as GreedyReorder
    participant Conflict as ConflictDetector
    participant Seeder as SeedSelector
    participant JoinTree as JoinTreeBuilder

    Session->>Reorder: read TiDBOptGreedyJoinReorderStartByCost
    Reorder->>Conflict: collect connectivity / component indices
    Conflict-->>Reorder: CheckConnectionResult (Connected, HasJoinCondition)
    Reorder->>Seeder: request candidate leaf-leaf pairs (component)
    Seeder->>Reorder: best pair by cumulative join cost
    Reorder->>JoinTree: initialize/join seed pair (or fallback to legacy seed)
    JoinTree-->>Reorder: jrNode with updated leafCnt
    Reorder->>Reorder: continue greedy expansion (betterLegacyGreedyCandidate)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested labels

ok-to-test, approved, lgtm

Suggested reviewers

  • guo-shaoge
  • qw4990
  • AilinKid

Poem

🐰 I nudged the seeds where pennies lie,
I peeked at costs and gave them a try.
I cloned a proj so the leaf stayed whole,
I hopped the joins to play my role.
Hooray — a rabbit’s tweak for planner soul.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 23.53% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Linked Issues check ✅ Passed The PR addresses the issue #67774 objective: it implements the opt-in system variable tidb_opt_greedy_join_reorder_start_by_cost to enable cost-based first-join selection in both advanced and legacy greedy paths, avoiding suboptimal seeding for outer-join groups.
Out of Scope Changes check ✅ Passed All code changes are directly related to implementing the greedy join reorder start-by-cost feature: system variable definition, session handling, connectivity logic enhancements, and comprehensive test coverage; no unrelated changes are present.
Title check ✅ Passed The PR title accurately summarizes the main change: introducing an opt-in system variable to enable cost-based seed selection in greedy join reordering, replacing the default smallest-base-relation heuristic.
Description check ✅ Passed The PR description follows the required template structure with Issue Number, Problem Summary, What Changed and How It Works sections. All critical sections are present and completed.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (3)
pkg/sessionctx/variable/session.go (1)

1267-1270: Initialize this new session flag explicitly in NewSessionVars.

Line 1267 adds a new optimizer flag, but its default currently depends on Go’s zero-value. Please wire it to vardef.DefTiDBOptGreedyJoinReorderStartByCost in the constructor for consistency with adjacent optimizer vars and future-proofing.

♻️ Suggested change
@@ func NewSessionVars(hctx HookContext) *SessionVars {
 		TiDBOptJoinReorderThreshold:      vardef.DefTiDBOptJoinReorderThreshold,
 		TiDBOptEnableAdvancedJoinReorder: vardef.DefTiDBOptEnableAdvancedJoinReorder,
+		TiDBOptGreedyJoinReorderStartByCost: vardef.DefTiDBOptGreedyJoinReorderStartByCost,
 		TiDBOptJoinReorderThroughSel:     vardef.DefTiDBOptJoinReorderThroughSel,

As per coding guidelines, “Code SHOULD remain maintainable for future readers” and “Follow existing package-local conventions first”.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/sessionctx/variable/session.go` around lines 1267 - 1270, The new session
flag TiDBOptGreedyJoinReorderStartByCost is currently left at Go's zero value;
update NewSessionVars to explicitly initialize this field using
vardef.DefTiDBOptGreedyJoinReorderStartByCost so it follows the pattern of
adjacent optimizer vars (e.g., set TiDBOptGreedyJoinReorderStartByCost:
vardef.DefTiDBOptGreedyJoinReorderStartByCost in the session vars constructor)
ensuring consistent defaults and future-proofing.
pkg/sessionctx/vardef/tidb_vars.go (1)

671-675: Clarify the previous heuristic wording in the comment.

"smallest base-node cumulative cost" reads a bit ambiguous; consider phrasing it as seeding from the “smallest base relation” to match the planner terminology used in this area.

✏️ Suggested wording tweak
- // instead of the smallest base-node cumulative cost. It is kept opt-in so
+ // instead of seeding from the smallest base relation. It is kept opt-in so

As per coding guidelines, comments SHOULD explain non-obvious intent clearly for maintainers with basic TiDB familiarity.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/sessionctx/vardef/tidb_vars.go` around lines 671 - 675, The comment for
TiDBOptGreedyJoinReorderStartByCost is ambiguous — replace the phrase "smallest
base-node cumulative cost" with clearer planner terminology such as "seeding
from the smallest base relation (i.e., the base-node with the smallest
cumulative cost)" so it reads that the greedy join reorder picks the first join
in a connected component by the cheapest join cost instead of seeding from the
smallest base relation; update the comment above the
TiDBOptGreedyJoinReorderStartByCost constant accordingly to use this phrasing.
tests/integrationtest/t/planner/core/join_reorder2.test (1)

100-114: Consider resetting all touched session vars at the end for better isolation.

You already restore tidb_opt_enable_advanced_join_reorder and tidb_opt_greedy_join_reorder_start_by_cost; restoring tidb_enable_outer_join_reorder too would make this block safer if future cases are appended below.

🧹 Optional cleanup tail
 set @@tidb_opt_enable_advanced_join_reorder = 1;
 set @@tidb_opt_greedy_join_reorder_start_by_cost = 0;
+set @@tidb_enable_outer_join_reorder = default;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integrationtest/t/planner/core/join_reorder2.test` around lines 100 -
114, Add a final session reset for tidb_enable_outer_join_reorder to restore
test isolation: after the existing resets for
tidb_opt_enable_advanced_join_reorder and
tidb_opt_greedy_join_reorder_start_by_cost, add a statement to set
@@tidb_enable_outer_join_reorder back to its default (e.g., 0) so that
tidb_enable_outer_join_reorder, tidb_opt_enable_advanced_join_reorder, and
tidb_opt_greedy_join_reorder_start_by_cost are all restored at the end of the
test block.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/planner/core/joinorder/join_order.go`:
- Around line 735-783: Limit the reseed scan to the connected component that
starts at nodes[start] by first computing the set of indices reachable from
start (using the same connectivity logic/detector used elsewhere) and then only
iterating leftIdx/rightIdx over that reachableIndices set instead of
nodes[start:]; update the nested loops that call checkConnectionAndMakeJoin to
use reachableIndices, so bestNode/bestLeftIdx/bestRightIdx only consider pairs
inside the current component, and defer considering cartesian joins
(applyCartesianFactor/cartesianFactor and allowNoEQ handling) until the global
final-stitch phase rather than during this component-local reseed.
- Around line 754-760: seedGreedyJoinComponentByCost already calls
applyCartesianFactor and embeds cartesianFactor into newNode.cumCost, so
betterGreedyCandidate must not multiply the penalty again; update the calls to
betterGreedyCandidate in seedGreedyJoinComponentByCost (and the other occurrence
around the 785-794 block) to pass a neutral factor (1.0) or otherwise indicate
"no extra cartesian penalty" when newNode.cumCost has already been adjusted,
leaving betterGreedyCandidate's logic unchanged; reference the symbols
newNode.cumCost, cartesianFactor, seedGreedyJoinComponentByCost, and
betterGreedyCandidate to locate and fix the double-application.

In `@pkg/planner/core/rule_join_reorder_greedy.go`:
- Around line 171-232: seedConnectedJoinTreeByCost currently scans all nodes in
s.curJoinGroup and can pick a seed from a different disconnected component;
restrict the seed search to only the nodes in the component being built (the
"current frontier") so later components cannot jump ahead. Concretely: before
the double loop in seedConnectedJoinTreeByCost, compute the set/slice of indices
that belong to the connected component rooted at the current frontier
(reuse/derive logic from constructConnectedJoinTree or use
checkConnectionAndMakeJoin connectivity checks) and then iterate
leftIdx/rightIdx only over that component subset of s.curJoinGroup; keep the
rest of the function (bestLeftIdx/bestRightIdx selection and collapsing into
remaining) but ensure you only collapse nodes from that same component and leave
other components untouched.

---

Nitpick comments:
In `@pkg/sessionctx/vardef/tidb_vars.go`:
- Around line 671-675: The comment for TiDBOptGreedyJoinReorderStartByCost is
ambiguous — replace the phrase "smallest base-node cumulative cost" with clearer
planner terminology such as "seeding from the smallest base relation (i.e., the
base-node with the smallest cumulative cost)" so it reads that the greedy join
reorder picks the first join in a connected component by the cheapest join cost
instead of seeding from the smallest base relation; update the comment above the
TiDBOptGreedyJoinReorderStartByCost constant accordingly to use this phrasing.

In `@pkg/sessionctx/variable/session.go`:
- Around line 1267-1270: The new session flag
TiDBOptGreedyJoinReorderStartByCost is currently left at Go's zero value; update
NewSessionVars to explicitly initialize this field using
vardef.DefTiDBOptGreedyJoinReorderStartByCost so it follows the pattern of
adjacent optimizer vars (e.g., set TiDBOptGreedyJoinReorderStartByCost:
vardef.DefTiDBOptGreedyJoinReorderStartByCost in the session vars constructor)
ensuring consistent defaults and future-proofing.

In `@tests/integrationtest/t/planner/core/join_reorder2.test`:
- Around line 100-114: Add a final session reset for
tidb_enable_outer_join_reorder to restore test isolation: after the existing
resets for tidb_opt_enable_advanced_join_reorder and
tidb_opt_greedy_join_reorder_start_by_cost, add a statement to set
@@tidb_enable_outer_join_reorder back to its default (e.g., 0) so that
tidb_enable_outer_join_reorder, tidb_opt_enable_advanced_join_reorder, and
tidb_opt_greedy_join_reorder_start_by_cost are all restored at the end of the
test block.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 07c7ec83-15a7-4889-9e42-b1fc536af02b

📥 Commits

Reviewing files that changed from the base of the PR and between 212ded5 and 7942bbd.

📒 Files selected for processing (10)
  • pkg/planner/core/joinorder/join_order.go
  • pkg/planner/core/rule_join_reorder.go
  • pkg/planner/core/rule_join_reorder_greedy.go
  • pkg/sessionctx/vardef/tidb_vars.go
  • pkg/sessionctx/variable/session.go
  • pkg/sessionctx/variable/setvar_affect.go
  • pkg/sessionctx/variable/sysvar.go
  • pkg/sessionctx/variable/varsutil_test.go
  • tests/integrationtest/r/planner/core/join_reorder2.result
  • tests/integrationtest/t/planner/core/join_reorder2.test

Comment thread pkg/planner/core/joinorder/join_order.go Outdated
Comment thread pkg/planner/core/joinorder/join_order.go
Comment thread pkg/planner/core/rule_join_reorder_greedy.go
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 15, 2026

Codecov Report

❌ Patch coverage is 69.49686% with 97 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.9612%. Comparing base (5733f22) to head (456ee35).
⚠️ Report is 102 commits behind head on master.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #67784        +/-   ##
================================================
- Coverage   77.6020%   76.9612%   -0.6409%     
================================================
  Files          1981       1968        -13     
  Lines        548804     557716      +8912     
================================================
+ Hits         425883     429225      +3342     
- Misses       122111     127757      +5646     
+ Partials        810        734        -76     
Flag Coverage Δ
integration 40.9430% <69.4968%> (+6.6032%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling 61.5065% <ø> (ø)
parser ∅ <ø> (∅)
br 48.8039% <ø> (-11.6033%) ⬇️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@ti-chi-bot ti-chi-bot Bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Apr 15, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
pkg/planner/core/joinorder/join_order.go (1)

746-764: ⚠️ Potential issue | 🟠 Major

Skip pure cartesian candidates during component reseeding.

When allowNoEQ is true, this loop can still seed the component with a TryCreateCartesianCheckResult() pair that only shares the component through some third table. That makes the new switch choose an early cartesian seed, which is broader than the intended “best first real join inside this component” behavior.

Suggested fix
 			checkResult, newNode, err := checkConnectionAndMakeJoin(detector, nodes[leftIdx], nodes[rightIdx], vertexHints, allowNoEQ)
 			if err != nil {
 				return nil, false, err
 			}
 			if newNode == nil {
 				continue
 			}
+			if !checkResult.HasJoinCondition() {
+				continue
+			}
 			if checkResult.NoEQEdge() {
 				if !allowNoEQ {
 					continue
 				}
 				newNode.cumCost, err = applyCartesianFactor(newNode.cumCost, cartesianFactor)

Based on learnings, cartesian edges should only be considered after the entire join reorder is complete and there are still disjoint join groups that need to be connected, or when a leading hint explicitly requires a cartesian join.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/planner/core/joinorder/join_order.go` around lines 746 - 764, The loop
currently accepts joins produced by TryCreateCartesianCheckResult() even when
allowNoEQ is true, which seeds components with pure-cartesian candidates; modify
the loop in join order seeding (around checkConnectionAndMakeJoin, checkResult,
newNode, allowNoEQ) to skip candidates that are pure-cartesian by detecting the
TryCreateCartesianCheckResult-origin (e.g., via a new predicate like
checkResult.IsPureCartesian() or an existing marker on checkResult) and continue
early when allowNoEQ is true unless a leading hint explicitly requires a
cartesian join or you are in the final reconnection phase where disjoint groups
must be connected; keep the existing cartesian cost application
(applyCartesianFactor) for cases that proceed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/planner/core/rule_join_reorder_greedy.go`:
- Around line 171-203: In seedConnectedJoinTreeByCost, the loop over
componentIdxs can pick a cartesian pair when cartesianThreshold > 0; change the
loop so that after calling checkConnectionAndMakeJoin(...) you immediately skip
any isCartesian candidate when cartesianThreshold > 0 (i.e., if isCartesian &&
cartesianThreshold > 0 { continue }) so cartesian edges are not used to seed the
legacy greedy search; place this check before recording the opt var and before
cost/stat derivation to avoid counting or logging cartesian pairs prematurely
(refs: seedConnectedJoinTreeByCost, collectConnectedSeedComponentIndices,
checkConnectionAndMakeJoin, betterLegacyGreedyCandidate).

---

Duplicate comments:
In `@pkg/planner/core/joinorder/join_order.go`:
- Around line 746-764: The loop currently accepts joins produced by
TryCreateCartesianCheckResult() even when allowNoEQ is true, which seeds
components with pure-cartesian candidates; modify the loop in join order seeding
(around checkConnectionAndMakeJoin, checkResult, newNode, allowNoEQ) to skip
candidates that are pure-cartesian by detecting the
TryCreateCartesianCheckResult-origin (e.g., via a new predicate like
checkResult.IsPureCartesian() or an existing marker on checkResult) and continue
early when allowNoEQ is true unless a leading hint explicitly requires a
cartesian join or you are in the final reconnection phase where disjoint groups
must be connected; keep the existing cartesian cost application
(applyCartesianFactor) for cases that proceed.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 7871d12d-a172-4732-b278-6750898c58fd

📥 Commits

Reviewing files that changed from the base of the PR and between 7942bbd and f14fbe2.

📒 Files selected for processing (5)
  • pkg/planner/core/joinorder/conflict_detector.go
  • pkg/planner/core/joinorder/join_order.go
  • pkg/planner/core/rule_join_reorder_greedy.go
  • tests/integrationtest/r/planner/core/join_reorder2.result
  • tests/integrationtest/t/planner/core/join_reorder2.test
✅ Files skipped from review due to trivial changes (1)
  • tests/integrationtest/t/planner/core/join_reorder2.test
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/integrationtest/r/planner/core/join_reorder2.result

Comment thread pkg/planner/core/rule_join_reorder_greedy.go
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
tests/integrationtest/t/planner/core/join_reorder2.test (1)

193-210: Good cartesian exclusion test, but cleanup may be incomplete.

The test at lines 193-207 properly verifies that reseeding won't start from a pure cartesian pair.

However, the cleanup at lines 208-210 resets tidb_opt_join_reorder_threshold, tidb_opt_cartesian_join_order_threshold, and tidb_opt_greedy_join_reorder_start_by_cost, but leaves tidb_enable_outer_join_reorder (set to 1 at line 100) and tidb_opt_enable_advanced_join_reorder (set to 1 at line 206) at non-default values. If subsequent tests in this file or session depend on default values, this could cause unexpected behavior.

♻️ Suggested cleanup additions
 set @@tidb_opt_join_reorder_threshold = 0;
 set @@tidb_opt_cartesian_join_order_threshold = 0;
 set @@tidb_opt_greedy_join_reorder_start_by_cost = 0;
+set @@tidb_enable_outer_join_reorder = default;
+set @@tidb_opt_enable_advanced_join_reorder = default;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integrationtest/t/planner/core/join_reorder2.test` around lines 193 -
210, The test resets three session variables but leaves
tidb_enable_outer_join_reorder and tidb_opt_enable_advanced_join_reorder
altered; update the cleanup at the end of this test (after the existing set ...
= 0 lines) to also restore tidb_enable_outer_join_reorder and
tidb_opt_enable_advanced_join_reorder to their defaults (rollback the change
made earlier where tidb_enable_outer_join_reorder was set to 1 and where
tidb_opt_enable_advanced_join_reorder was set to 1) so subsequent tests see
expected default settings.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@tests/integrationtest/t/planner/core/join_reorder2.test`:
- Around line 193-210: The test resets three session variables but leaves
tidb_enable_outer_join_reorder and tidb_opt_enable_advanced_join_reorder
altered; update the cleanup at the end of this test (after the existing set ...
= 0 lines) to also restore tidb_enable_outer_join_reorder and
tidb_opt_enable_advanced_join_reorder to their defaults (rollback the change
made earlier where tidb_enable_outer_join_reorder was set to 1 and where
tidb_opt_enable_advanced_join_reorder was set to 1) so subsequent tests see
expected default settings.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 588fd666-58f5-4e3f-a552-5567dae555b3

📥 Commits

Reviewing files that changed from the base of the PR and between f14fbe2 and a266663.

📒 Files selected for processing (4)
  • pkg/planner/core/joinorder/join_order.go
  • pkg/planner/core/rule_join_reorder_greedy.go
  • tests/integrationtest/r/planner/core/join_reorder2.result
  • tests/integrationtest/t/planner/core/join_reorder2.test
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/integrationtest/r/planner/core/join_reorder2.result

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
pkg/planner/core/join_reorder_side_effect_test.go (1)

88-94: Keep this regression focused on immutability, not exact wrapper placement.

Lines 91-92 require both returned plans to differ from the original inputs. That is stronger than the bug contract and makes the test brittle against harmless cast-normalization changes that only need to wrap one side. The regression is already protected by asserting that projectionLeaf itself stays unchanged.

Suggested simplification
 	newLeft, newRight, usedEdges, joinType := solver.checkConnection(intLeaf, projectionLeaf)
 	require.Len(t, usedEdges, 1)
 	require.Equal(t, base.InnerJoin, joinType.JoinType)
-	require.NotSame(t, intLeaf, newLeft)
-	require.NotSame(t, projectionLeaf, newRight)
+	require.NotNil(t, newLeft)
+	require.NotNil(t, newRight)
 	require.Len(t, projectionLeaf.Exprs, origExprCnt)
 	require.Equal(t, origSchemaLen, projectionLeaf.Schema().Len())

As per coding guidelines, code SHOULD remain maintainable for future readers with basic TiDB familiarity, including readers who are not experts in the specific subsystem/feature.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/planner/core/join_reorder_side_effect_test.go` around lines 88 - 94, The
test currently asserts both returned plans differ from the inputs
(require.NotSame on newLeft and newRight) which is too strict; update the test
in join_reorder_side_effect_test.go to only require the input side that must be
wrapped (keep require.NotSame for newLeft vs intLeaf if desired) and remove the
require.NotSame assertion for projectionLeaf/newRight so the regression focuses
on immutability; retain the existing assertions that projectionLeaf.Exprs and
projectionLeaf.Schema() remain unchanged after calling
solver.checkConnection(intLeaf, projectionLeaf).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/planner/core/operator/logicalop/logical_projection.go`:
- Around line 695-703: InjectExprAvoidingMutation shallow-copies a
LogicalProjection via proj.LogicalProjectionShallowRef() but leaves any cached
StatsInfo() intact, so after proj.AppendExpr(expr) downstream DeriveStats() can
reuse stale stats and skip NDV recomputation; fix by invalidating the cloned
projection's stats cache (clear or nil-out its StatsInfo) immediately after
proj.LogicalProjectionShallowRef() and before proj.AppendExpr(expr) (i.e., in
InjectExprAvoidingMutation, after calling proj.LogicalProjectionShallowRef() and
before proj.SetSchema/AppendExpr) so NDV will be recomputed correctly during
DeriveStats().

---

Nitpick comments:
In `@pkg/planner/core/join_reorder_side_effect_test.go`:
- Around line 88-94: The test currently asserts both returned plans differ from
the inputs (require.NotSame on newLeft and newRight) which is too strict; update
the test in join_reorder_side_effect_test.go to only require the input side that
must be wrapped (keep require.NotSame for newLeft vs intLeaf if desired) and
remove the require.NotSame assertion for projectionLeaf/newRight so the
regression focuses on immutability; retain the existing assertions that
projectionLeaf.Exprs and projectionLeaf.Schema() remain unchanged after calling
solver.checkConnection(intLeaf, projectionLeaf).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: b99a1893-1cd0-440d-a47c-30effb7dccae

📥 Commits

Reviewing files that changed from the base of the PR and between a266663 and 5bb5590.

📒 Files selected for processing (9)
  • pkg/planner/core/BUILD.bazel
  • pkg/planner/core/join_reorder_side_effect_test.go
  • pkg/planner/core/joinorder/BUILD.bazel
  • pkg/planner/core/joinorder/conflict_detector.go
  • pkg/planner/core/joinorder/join_order_side_effect_test.go
  • pkg/planner/core/operator/logicalop/logical_projection.go
  • pkg/planner/core/rule_join_reorder.go
  • tests/integrationtest/r/planner/core/join_reorder2.result
  • tests/integrationtest/t/planner/core/join_reorder2.test
✅ Files skipped from review due to trivial changes (2)
  • pkg/planner/core/BUILD.bazel
  • tests/integrationtest/t/planner/core/join_reorder2.test
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/planner/core/joinorder/conflict_detector.go

Comment thread pkg/planner/core/operator/logicalop/logical_projection.go
@Reminiscent Reminiscent changed the title planner: add opt-in greedy join reorder start-by-cost switch planner: add opt-in greedy join reorder seed-by-cost switch Apr 23, 2026
@ti-chi-bot
Copy link
Copy Markdown

ti-chi-bot Bot commented Apr 23, 2026

@Reminiscent: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
idc-jenkins-ci-tidb/mysql-test 456ee35 link true /test mysql-test

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release-note-none Denotes a PR that doesn't merit a release note. sig/planner SIG: Planner size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

planner: greedy join reorder may seed outer-join groups from the smallest table instead of the cheapest first join

1 participant