planner: support using nested `IN` to build IndexMerge path (#68962) by ti-chi-bot · Pull Request #69003 · pingcap/tidb

ti-chi-bot · 2026-06-06T12:52:08Z

This is an automated cherry-pick of #68962

What problem does this PR solve?

Issue Number: ref #65822

Problem Summary:

For queries like `SELECT * FROM t1 WHERE e = 1 AND (a IN (1,2,3) OR b IN (2,3,4) OR c IN (3,4,5))`, TiDB previously could not build an IndexMerge path when the nested OR list contained IN expressions. Such a query fell back to a plain IndexLookUp with a residual Selection, which is much less efficient.

This is the first optimization described in the issue.

What changed and how does it work?

planner
- checkAccessFilter4IdxCol() (pkg/planner/core/indexmerge_path.go): Add support for ast.In expressions in the non-virtual column branch. Previously only ast.EQ was recognized, so IN expressions like a IN (1,2,3) could not be collected as partial access filters in the "gradual collection" path (case 3 in initUnfinishedPathsFromExpr()). Now they are collected and later combined with top-level AND conditions (e.g., e = 1) by handleTopLevelANDList() to build valid ranges for composite indexes.
- Rename eqOnNonMVColTp to eqOrInOnNonMVColTp to reflect that it now covers both EQ and IN expressions.

After this fix, the plan becomes:

IndexMerge
├─IndexRangeScan  index:iea(e, a)  range:[1 1,1 1], [1 2,1 2], [1 3,1 3]
├─IndexRangeScan  index:ieb(e, b)  range:[1 2,1 2], [1 3,1 3], [1 4,1 4]
├─IndexRangeScan  index:iec(e, c)  range:[1 3,1 3], [1 4,1 4], [1 5,1 5]
└─TableRowIDScan(Probe)
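
The ranges in this plan follow from pairing the top-level `e = 1` with each value of the nested `IN` list. A minimal sketch of that arithmetic, assuming a two-column index and integer values; `pointRanges` is a hypothetical helper for illustration and does not exist in TiDB:

```go
package main

import (
	"fmt"
	"strings"
)

// pointRanges is a hypothetical helper: for an index (prefixCol, inCol),
// it pairs the single value from `prefixCol = prefix` with each IN value
// and renders one point range per pair, in the style of EXPLAIN output.
func pointRanges(prefix int, inValues []int) string {
	parts := make([]string, 0, len(inValues))
	for _, v := range inValues {
		parts = append(parts, fmt.Sprintf("[%d %d,%d %d]", prefix, v, prefix, v))
	}
	return strings.Join(parts, ", ")
}

func main() {
	// e = 1 AND a IN (1,2,3) on index iea(e, a)
	fmt.Println("iea(e, a):", pointRanges(1, []int{1, 2, 3}))
	// e = 1 AND b IN (2,3,4) on index ieb(e, b)
	fmt.Println("ieb(e, b):", pointRanges(1, []int{2, 3, 4}))
	// e = 1 AND c IN (3,4,5) on index iec(e, c)
	fmt.Println("iec(e, c):", pointRanges(1, []int{3, 4, 5}))
}
```

Each partial path therefore scans three point ranges rather than the whole `e = 1` prefix.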

Check List

Tests

Unit test
Integration test
Manual test (add detailed scripts or steps below)
No need to test
- I checked and no code files have been changed.

Side effects

Performance regression: Consumes more CPU
Performance regression: Consumes more Memory
Breaking backward compatibility

Documentation

Release note

None

Summary by CodeRabbit

Performance Improvements
- Enhanced query optimization for statements using IN predicates with index merging, enabling more efficient index selection for complex query patterns.
Tests
- Added comprehensive test coverage for advanced index merge scenarios with IN expressions.

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>

ti-chi-bot · 2026-06-06T12:52:12Z

This cherry pick PR is for a release branch and has not yet been approved by triage owners.
Adding the do-not-merge/cherry-pick-not-approved label.

To merge this cherry pick:

It must first be LGTMed and approved by the reviewers.
For pull requests to TiDB-x branches, it must have no failed tests.
AFTER it has lgtm and approved labels, please wait for the cherry-pick merging approval from triage owners.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

ti-chi-bot · 2026-06-06T12:52:12Z

@time-and-fate This PR has conflicts, so I have put it on hold.
Please resolve them or ask others to resolve them, then comment /unhold to remove the hold label.

ti-chi-bot · 2026-06-06T12:52:14Z

@ti-chi-bot: If you want to know how to resolve it, please read the guide in the TiDB Dev Guide.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

ti-chi-bot · 2026-06-06T12:52:21Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign qw4990 for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS
pkg/planner/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

coderabbitai · 2026-06-06T12:52:34Z

📝 Walkthrough

Walkthrough

This PR extends TiDB's IndexMerge planner to support IN predicates with constant values as access filters on non-MV columns, treating them equivalently to equality predicates. The change introduces a new combined access-filter type, updates predicate validation logic, modifies filter collection in unfinished-path construction, and adds comprehensive test coverage.

Changes

IndexMerge IN Predicate Support

Layer / File(s)	Summary
Combined eqOrIn access-filter type definition and validation `pkg/planner/core/indexmerge_path.go`	New `eqOrInOnNonMVColTp` constant represents access predicates that are either `=` or `IN (...)` on non-MV columns. `checkAccessFilter4IdxCol` now validates `IN` expressions where the left argument is the target column and all right arguments are constants, returning the combined type. Both `EQ` and `IN` cases return `eqOrInOnNonMVColTp`. `CollectFilters4MVIndex` propagates this combined type when overwriting `accessTp`.
Filter collection in unfinished-path builder `pkg/planner/core/indexmerge_unfinished_path.go`	`initUnfinishedPathsFromExpr` updated to accept `eqOrInOnNonMVColTp` predicates and route qualifying CNF items into `usableFilters` with `idxColHasUsableFilter` tracking instead of the prior `accessFilters`/`idxColHasAccessFilter` behavior.
Test cases and expected outputs `tests/integrationtest/t/planner/core/indexmerge_path.test`, `tests/integrationtest/r/planner/core/indexmerge_path.result`, `tests/integrationtest/r/planner/core/casetest/physicalplantest/physical_plan.result`	New test cases cover Issue `#58361` (AND+OR filter combinations), MV-index OR with ORDER BY/LIMIT, sort-item hint interactions, and nested `IN` expressions within OR lists (Issue `#68962`). Expected outputs for `c in (1, 2, 3)` queries with `ORDER BY` and `LIMIT` updated to plan-tree format, showing detailed operator structure including `IndexMerge`, `IndexRangeScan`, `Selection`, and `TopN` nodes.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

pingcap/tidb#68753: Directly aligned with this PR's IN predicate handling for IndexMerge merge-sort and order-by partial paths.

Suggested labels

size/M, cherry-pick-approved, ok-to-test

Suggested reviewers

terry1purcell
qw4990

Poem

🐰 Nested IN predicates now bloom,
IndexMerge plans light up the room,
Constants validate with care,
Filter types beyond compare,
Access paths take to the air!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name	Status	Explanation	Resolution
Description check	❓ Inconclusive	The PR description is mostly complete, covering the problem statement, solution details, and test coverage. However, it has merge-conflict markers present in the test result files mentioned in the changeset.	Resolve the merge-conflict markers found in test files (<<<<<<< HEAD, >>>>>>> markers in physical_plan.result and indexmerge_path.result/test) before merging to ensure clean test outputs.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The PR title clearly and specifically describes the main change: adding support for nested IN expressions in IndexMerge path planning, directly addressing issue `#68962`.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.


Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.12.2)

Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions
The command is terminated due to an error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions


tiprow · 2026-06-06T12:55:24Z

@ti-chi-bot: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
fast_test_tiprow_for_release	`05b393b`	link	true	`/test fast_test_tiprow_for_release`

Full PR test history. Your PR dashboard.


coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@tests/integrationtest/r/planner/core/casetest/physicalplantest/physical_plan.result`:
- Around line 3664-3684: The test result file physical_plan.result contains
unresolved Git conflict markers mixing 'brief' and 'plan_tree' outputs; fix by
removing the conflict markers around the two explain blocks and ensure the
inputs in physical_plan.test match the intended "explain format = 'plan_tree'
select ... order by c limit 2" and "explain format = 'plan_tree' select ...
order by b limit 2" statements, then regenerate the golden output so the
plan_tree outputs in physical_plan.result correspond exactly to those two
explain queries (verify the IndexMerge/IndexRangeScan/TopN blocks for each query
are present and consistent).


ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: fabe6aa3-8ef1-49e0-a199-3213811217ed

📥 Commits

Reviewing files that changed from the base of the PR and between de0a5bc and 05b393b.

📒 Files selected for processing (5)

pkg/planner/core/indexmerge_path.go
pkg/planner/core/indexmerge_unfinished_path.go
tests/integrationtest/r/planner/core/casetest/physicalplantest/physical_plan.result
tests/integrationtest/r/planner/core/indexmerge_path.result
tests/integrationtest/t/planner/core/indexmerge_path.test

coderabbitai · 2026-06-06T12:57:26Z

+explain format = 'plan_tree' select * from t where (a = 1 or b = 2) and c in (1, 2, 3) order by c limit 2;
+id	task	access object	operator info
+Limit	root		offset:0, count:2
+└─Projection	root		planner__core__casetest__physicalplantest__physical_plan.t.a, planner__core__casetest__physicalplantest__physical_plan.t.b, planner__core__casetest__physicalplantest__physical_plan.t.c
+  └─IndexMerge	root		type: union
+    ├─IndexRangeScan(Build)	cop[tikv]	table:t, index:idx(a, c)	range:[1 1,1 1], [1 2,1 2], [1 3,1 3], keep order:true, stats:pseudo
+    ├─IndexRangeScan(Build)	cop[tikv]	table:t, index:idx2(b, c)	range:[2 1,2 1], [2 2,2 2], [2 3,2 3], keep order:true, stats:pseudo
+    └─Selection(Probe)	cop[tikv]		in(planner__core__casetest__physicalplantest__physical_plan.t.c, 1, 2, 3)
+      └─TableRowIDScan	cop[tikv]	table:t	keep order:false, stats:pseudo
+show warnings;
+Level	Code	Message
+explain format = 'plan_tree' select * from t where (a = 1 or b = 2) and c in (1, 2, 3) order by b limit 2;
+id	task	access object	operator info
+TopN	root		planner__core__casetest__physicalplantest__physical_plan.t.b, offset:0, count:2
+└─IndexMerge	root		type: union
+  ├─IndexRangeScan(Build)	cop[tikv]	table:t, index:idx(a, c)	range:[1 1,1 1], [1 2,1 2], [1 3,1 3], keep order:false, stats:pseudo
+  ├─IndexRangeScan(Build)	cop[tikv]	table:t, index:idx2(b, c)	range:[2 1,2 1], [2 2,2 2], [2 3,2 3], keep order:false, stats:pseudo
+  └─TopN(Probe)	cop[tikv]		planner__core__casetest__physicalplantest__physical_plan.t.b, offset:0, count:2
+    └─Selection	cop[tikv]		in(planner__core__casetest__physicalplantest__physical_plan.t.c, 1, 2, 3)
+      └─TableRowIDScan	cop[tikv]	table:t	keep order:false, stats:pseudo
+>>>>>>> d568a8528e9 (planner: support using nested `IN` to build IndexMerge path (#68962))


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) Locate paired integration test files
fd -i 'physical_plan.test' tests/integrationtest/t
fd -i 'physical_plan.result' tests/integrationtest/r

# 2) Ensure these two queries are now plan_tree in test input
rg -nF "explain format = 'plan_tree' select * from t where (a = 1 or b = 2) and c in (1, 2, 3) order by c limit 2;" tests/integrationtest/t
rg -nF "explain format = 'plan_tree' select * from t where (a = 1 or b = 2) and c in (1, 2, 3) order by b limit 2;" tests/integrationtest/t

# 3) Sanity check old brief-format variants are not still present for the same two queries
if rg -nF "explain format = 'brief' select * from t where (a = 1 or b = 2) and c in (1, 2, 3) order by c limit 2;" tests/integrationtest/t; then
  echo "Found old brief-format c-order query in .test; please confirm whether dual coverage is intentional."
fi
if rg -nF "explain format = 'brief' select * from t where (a = 1 or b = 2) and c in (1, 2, 3) order by b limit 2;" tests/integrationtest/t; then
  echo "Found old brief-format b-order query in .test; please confirm whether dual coverage is intentional."
fi

Repository: pingcap/tidb

Length of output: 336

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) Show the exact input/output context around the reported result lines
echo "=== physicalplantest/physical_plan.result: lines 3630-3720 ==="
sed -n '3630,3720p' tests/integrationtest/r/planner/core/casetest/physicalplantest/physical_plan.result | nl -ba | sed -n '1,200p'

# 2) Search the corresponding .test file for the unique WHERE clause and order-by/limit substrings
echo "=== Search in tests/integrationtest/t/planner/core/casetest/physicalplantest/physical_plan.test ==="
rg -n --fixed-string "(a = 1 or b = 2) and c in (1, 2, 3)" tests/integrationtest/t/planner/core/casetest/physicalplantest/physical_plan.test || true
rg -n --fixed-string "order by c limit 2" tests/integrationtest/t/planner/core/casetest/physicalplantest/physical_plan.test || true
rg -n --fixed-string "order by b limit 2" tests/integrationtest/t/planner/core/casetest/physicalplantest/physical_plan.test || true

# 3) Check whether brief-format variants exist near those lines
echo "=== Search brief format occurrences in that .test file ==="
rg -n "explain format = 'brief'" tests/integrationtest/t/planner/core/casetest/physicalplantest/physical_plan.test | head -n 50 || true

# Narrow: search for brief format that also contains the WHERE clause
echo "=== brief-format occurrences with the WHERE clause ==="
rg -n "explain format = 'brief'.{0,200}\(a = 1 or b = 2\).*c in \(1, 2, 3\)" tests/integrationtest/t/planner/core/casetest/physicalplantest/physical_plan.test || true

Repository: pingcap/tidb

Length of output: 180

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Show context around the reported result lines with line numbers via awk
echo "=== physicalplantest/physical_plan.result: lines 3630-3720 ==="
sed -n '3630,3720p' tests/integrationtest/r/planner/core/casetest/physicalplantest/physical_plan.result \
  | awk '{printf "%5d\t%s\n", (3630+NR-1), $0}'

echo "=== Search in tests/integrationtest/t/planner/core/casetest/physicalplantest/physical_plan.test ==="
rg -n --fixed-string "(a = 1 or b = 2) and c in (1, 2, 3)" tests/integrationtest/t/planner/core/casetest/physicalplantest/physical_plan.test || true
rg -n --fixed-string "order by c limit 2" tests/integrationtest/t/planner/core/casetest/physicalplantest/physical_plan.test || true
rg -n --fixed-string "order by b limit 2" tests/integrationtest/t/planner/core/casetest/physicalplantest/physical_plan.test || true

echo "=== brief format occurrences in that .test file ==="
rg -n "explain format = 'brief'" tests/integrationtest/t/planner/core/casetest/physicalplantest/physical_plan.test | head -n 200 || true

echo "=== brief occurrences with the WHERE clause (narrow) ==="
rg -n "explain format = 'brief'.{0,300}\(a = 1 or b = 2\).*c in \(1, 2, 3\)" tests/integrationtest/t/planner/core/casetest/physicalplantest/physical_plan.test || true

echo "=== plan_tree occurrences with the WHERE clause (narrow) ==="
rg -n "explain format = 'plan_tree'.{0,300}\(a = 1 or b = 2\).*c in \(1, 2, 3\)" tests/integrationtest/t/planner/core/casetest/physicalplantest/physical_plan.test || true

Repository: pingcap/tidb

Length of output: 36093

Fix unresolved conflict markers and align plan_tree inputs/golden outputs

tests/integrationtest/r/planner/core/casetest/physicalplantest/physical_plan.result still contains merge conflict markers (<<<<<<< HEAD / ======= / >>>>>>> ...) around this block, mixing brief and plan_tree outputs (e.g., plan_tree queries start at explain format = 'plan_tree' ...).

The paired input tests/integrationtest/t/planner/core/casetest/physicalplantest/physical_plan.test still contains only the explain format = 'brief' ... order by c/b limit 2 statements for the same queries; no plan_tree counterparts were found.

Remove the conflict markers, update the .test statements to the intended plan_tree form (for both queries), and regenerate the corresponding .result.
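
Before regenerating the golden output, it can help to confirm where the leftover markers are. The following is a minimal sketch of such a check, not part of TiDB's tooling: `findConflictMarkers` is a hypothetical helper, and it assumes the standard Git marker shapes (`<<<<<<< `, `=======`, `>>>>>>> `) at the start of a line.

```go
package main

import (
	"fmt"
	"strings"
)

// findConflictMarkers is a hypothetical helper that returns the 1-based
// line numbers of lines that look like Git conflict markers.
func findConflictMarkers(content string) []int {
	var hits []int
	for i, line := range strings.Split(content, "\n") {
		if strings.HasPrefix(line, "<<<<<<< ") ||
			line == "=======" ||
			strings.HasPrefix(line, ">>>>>>> ") {
			hits = append(hits, i+1)
		}
	}
	return hits
}

func main() {
	// A made-up fragment shaped like a conflicted .result file.
	sample := strings.Join([]string{
		"show warnings;",
		"<<<<<<< HEAD",
		"explain format = 'brief' select 1;",
		"=======",
		"explain format = 'plan_tree' select 1;",
		">>>>>>> d568a8528e9 (planner: ...)",
		"Level\tCode\tMessage",
	}, "\n")
	fmt.Println(findConflictMarkers(sample))
}
```

In practice the file content would be read from disk first; the in-memory sample keeps the sketch self-contained.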


ti-chi-bot · 2026-06-06T12:59:37Z

@ti-chi-bot: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
idc-jenkins-ci-tidb/check_dev_2	`05b393b`	link	true	`/test check-dev2`
idc-jenkins-ci-tidb/check_dev	`05b393b`	link	true	`/test check-dev`
idc-jenkins-ci-tidb/mysql-test	`05b393b`	link	true	`/test mysql-test`
idc-jenkins-ci-tidb/unit-test	`05b393b`	link	true	`/test unit-test`
idc-jenkins-ci-tidb/build	`05b393b`	link	true	`/test build`


This is an automated cherry-pick of pingcap#68962

05b393b

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>

ti-chi-bot mentioned this pull request Jun 6, 2026

planner: support using nested IN to build IndexMerge path #68962

Merged

13 tasks

ti-chi-bot Bot added the do-not-merge/cherry-pick-not-approved label Jun 6, 2026

ti-chi-bot assigned time-and-fate Jun 6, 2026

ti-chi-bot Bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jun 6, 2026

coderabbitai Bot reviewed Jun 6, 2026



2 participants
