planner: rewrite FTS predicates to LIKE for evaluation of non-TiCI query plan (#65626) #68743
ti-chi-bot wants to merge 1 commit into
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
@terry1purcell This PR has conflicts, so I have put it on hold.
@ti-chi-bot: If you want to know how to resolve it, please read the guide in the TiDB Dev Guide.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED. It needs approval from an approver for each affected file.
Walkthrough
This PR implements full-text search (FTS) query support in TiDB with automatic fallback to ILIKE when native FTS infrastructure is unavailable. It adds MATCH...AGAINST builtin functions, expression rewriting logic to convert FTS to ILIKE predicates under a strict subset, and a multi-round optimizer that can rebuild logical plans to try FTS fallback when native execution isn't viable.
Changes: FTS Query Rewriting and Alternative Logical Planning
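To make the FTS-to-ILIKE idea in the walkthrough concrete, here is a minimal sketch of how a single search term could be turned into a pattern for a substring match. `likePatternForTerm` is a hypothetical helper, not a function from this PR, and it assumes `\` is the LIKE escape character. The PR's actual rewrite and its "strict subset" rules may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// likePatternForTerm is a hypothetical helper (not from the PR).
// It escapes the LIKE metacharacters % and _ plus the assumed escape
// character \, then wraps the term in % so the pattern matches the
// term anywhere inside the column value.
func likePatternForTerm(term string) string {
	r := strings.NewReplacer(`\`, `\\`, `%`, `\%`, `_`, `\_`)
	return "%" + r.Replace(term) + "%"
}

func main() {
	fmt.Println(likePatternForTerm("tidb"))
	fmt.Println(likePatternForTerm("100%_ok"))
}
```

A substring pattern is not the same as word-level full-text matching, which is presumably why the fallback is restricted to terms that pass a validation gate first.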
Estimated code review effort: 4 (Complex), ~60 minutes
@ti-chi-bot: The following tests failed, say
Actionable comments posted: 16
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.gitignore:
- Around line 40-67: The .gitignore contains leftover merge conflict markers
(<<<<<<< HEAD, =======, >>>>>>>) and both sets of entries; remove the conflict
markers and merge the two sides into the intended final ignore list by choosing
or combining the desired lines (e.g., keep the consolidated ignores such as
*.log.json, genkeyword, test_coverage, coverage.dat plus the integration tests,
local dev artifacts, personal config files, and .claude entries) so the file is
clean and contains only valid ignore patterns with no conflict markers.
In `@pkg/expression/BUILD.bazel`:
- Around line 63-67: The BUILD.bazel srcs block contains unresolved cherry-pick
conflict markers (<<<<<<< HEAD / ======= / >>>>>>>) and is missing
builtin_fts.go which will break compilation; remove the conflict markers from
the go_library.srcs list, ensure the entries include "fts_helper.go",
"fts_to_like.go", and add "builtin_fts.go" to the srcs for the pkg/expression
go_library so Bazel can parse the BUILD file and compile the package.
In `@pkg/expression/builtin.go`:
- Around line 972-978: There are leftover git conflict markers in the funcs
registry; remove the markers (<<<<<<< HEAD and >>>>>>> ...) and ensure the two
entries for ast.FTSMatchWord -> ftsMatchWordFunctionClass and
ast.FTSMysqlMatchAgainst -> ftsMysqlMatchAgainstFunctionClass remain correctly
placed in the funcs map with proper commas and no conflict text so
pkg/expression builds cleanly.
In `@pkg/expression/distsql_builtin.go`:
- Around line 1152-1163: Remove the leftover git conflict markers (<<<<<<<,
=======, >>>>>>>) in the DistSQL scalar function switch and restore the two FTS
cases so the switch compiles: include the case tipb.ScalarFuncSig_FTSMatchWord
with f = &builtinFtsMatchWordSig{base} and the case
tipb.ScalarFuncSig_FTSMatchExpression with f =
&builtinFtsMysqlMatchAgainstSig{baseBuiltinFunc: base} (or match the surrounding
cases' struct field style), ensuring there are no conflict markers left and the
switch entries follow the existing switch syntax.
In `@pkg/expression/fts_to_like.go`:
- Around line 67-73: isFTSWordByte wrongly treats any byte >127 as a word
character; change the logic to perform rune-level validation using unicode
properties (e.g., unicode.IsLetter/unicode.IsDigit) or the repo's tokenizer
utility instead of byte checks. Replace isFTSWordByte with a rune-based checker
(or add isFTSRune) and update callers such as
ValidateFTSSearchStringForLikeFallback to iterate over runes (not bytes) and
call the new rune validator so non-ASCII punctuation/symbols are not treated as
word characters.
In `@pkg/expression/function_traits_test.go`:
- Around line 28-312: Remove the leftover git conflict markers (<<<<<<<,
=======, >>>>>>>) that wrap the TestIllegalFunctions4GeneratedColumns block so
the file is valid Go; ensure only one coherent copy of the
TestIllegalFunctions4GeneratedColumns function remains (keeping the intended
knownGood list and the loop that computes legal using GetBuiltinList() and
IllegalFunctions4GeneratedColumns), delete the conflict separators and
duplicated code, and run go test to confirm the package compiles.
In `@pkg/expression/infer_pushdown.go`:
- Around line 416-432: Remove the leftover merge conflict markers (<<<<<<<,
=======, >>>>>>>) and merge in the new FTS cases so the switch returns the
intended values: add the case ast.FTSMatchWord to return true and the case
ast.FTSMysqlMatchAgainst to extract
function.Function.(*builtinFtsMysqlMatchAgainstSig) and return ok &&
!sig.modifier.IsBooleanMode() && !sig.modifier.WithQueryExpansion(); ensure the
code compiles and keep the comment about TiFlash/modifier behavior and the
reference to matchAgainstToBuiltin intact.
In `@pkg/expression/integration_test/integration_test.go`:
- Around line 65-267: There is an unresolved Git conflict block (<<<<<<< HEAD /
======= / >>>>>>>) in the integration_test.go file starting near the
TestFTSParser/TestFTSSyntax/TestFTSIndexSyntax tests; remove the conflict
markers and merge the two sides into a single valid Go source so the package
builds (ensure the TestFTSParser, TestFTSSyntax and TestFTSIndexSyntax functions
remain intact and any duplicated or commented-out sections are resolved into the
intended final test code).
In `@pkg/parser/ast/functions.go`:
- Around line 369-375: The snippet contains unresolved Git conflict markers
(<<<<<<<, =======, >>>>>>>) around the FTS constants which prevents compilation;
remove the conflict markers and merge the branches so the FTS constants are
defined cleanly (e.g., ensure FTSMatchWord and FTSMysqlMatchAgainst are present
in the same const block within the functions.go file and no conflict markers
remain); verify the const block compiles and run a build to confirm the parser
package compiles.
In `@pkg/planner/core/BUILD.bazel`:
- Around line 18-23: Resolve the Git conflict markers in BUILD.bazel by removing
the <<<<<<<, =======, and >>>>>>> lines and producing a correct srcs list that
contains all intended source files (ensure entries like "foreign_key.go",
"fragment.go", "fragment_test.go" and "fulltext_to_like.go" are present if they
belong to this target); update the same resolution for the duplicate conflict at
the other hunk (around the referenced 231-235 area). Ensure the final target's
srcs is a valid comma-separated list of string literals with no conflict markers
so Bazel can parse the file.
In `@pkg/planner/core/expression_rewriter.go`:
- Around line 680-769: Remove the leftover conflict markers and restore the
three functions (canTreatInSubqueryAsExistsForFilter,
inDirectMatchBooleanContext, matchHasLikeFallbackRescue) exactly as intended,
then wire up AST ancestry plumbing so er.astNodeStack exists and is maintained:
add a stack field to expressionRewriter (e.g. astNodeStack []ast.Node) or reuse
an existing ancestor stack on planCtx, and update expressionRewriter.Enter and
Leave to push the visited ast.Node onto astNodeStack on Enter and pop on Leave.
Ensure the functions reference the correct clause via planCtx.builder.curClause
(or planCtx.curClause if that is the intended field) and guard nil
planCtx/builder as in the diff so the file compiles.
In `@pkg/planner/core/planbuilder.go`:
- Around line 322-371: The file contains unresolved Git conflict markers in the
PlanBuilder type area; remove the conflict markers (<<<<<<<, =======, >>>>>>>)
and reconcile the two competing blocks by merging the SavedViews field with the
new FTS-related fields (nonViableFTSMatch, predicateMatchSeen) into a single
PlanBuilder struct, and keep the accompanying accessor/mutator methods
(HasNonViableFTSMatch, MarkNonViableFTSMatch, HasPredicateMatch,
MarkPredicateMatch) only once; ensure PlanBuilder now declares SavedViews
[]*ast.TableName plus the two boolean fields and that the added methods
reference that single struct definition so the file compiles.
In `@pkg/planner/optimize.go`:
- Around line 469-703: The file contains unresolved git conflict markers
(<<<<<<<, =======, >>>>>>>) leaving a half-merged optimizer refactor; remove the
conflict markers and produce a single coherent version by keeping the new
logicalPlanBuildCtx, saveLogicalPlanBuildCtx, restoreLogicalPlanBuildCtx and
buildAndOptimizeLogicalPlanRound implementations (and only one declaration of
optimizeCnt), deleting the duplicate old fragment, and ensure references to
stmtctx, rule and any new symbols are correctly imported/used; verify
alternativeRounds and related helpers (shouldTryNonDecorrelationRound,
shouldTryOrderAwareReorderRound, shouldTryCorrelateRound,
savedEnableCorrelateSubquery, savedFTSLikeFallback) are present and consistent
with the rest of the file so the file compiles.
- Around line 639-699: The two package-level flags savedEnableCorrelateSubquery
and savedFTSLikeFallback are unsafe because they are shared across sessions;
make the saved state local to each optimize invocation by removing those globals
and storing the saved values per-round instead (either add fields like
savedEnableCorrelateSubquery/savedFTSLikeFallback to the alternativeRound struct
or rebuild alternativeRounds inside optimize() so each round’s setup/cleanup
closures capture local variables). Update the correlate round’s setup/cleanup
and the fts-like-fallback round’s setup/cleanup to read/write the saved value
from the per-round storage (or captured local) and ensure optimize() uses the
per-invocation alternativeRounds so concurrent Optimize calls don’t overwrite
each other.
In `@pkg/sessionctx/stmtctx/stmtctx.go`:
- Around line 449-486: The file contains unresolved git conflict markers
(<<<<<<<, =======, >>>>>>>) around the StatementContext additions, leaving the
struct half-merged; remove the conflict markers and ensure the full set of new
fields (AlternativeLogicalPlanDecorrelatedApply,
AlternativeLogicalPlanSameOrderIndexJoin,
AlternativeLogicalPlanOrderAwareJoinReorder,
AlternativeLogicalPlanPreferCorrelate, AlternativeLogicalPlanFTSLikeFallback,
AlternativeLogicalPlanHasPredicateContextMatch) are present exactly once in the
StatementContext definition (and any related setup/cleanup code blocks), delete
any duplicate or partial blocks from the other branch, and run go build/go vet
to verify the file parses cleanly.
In `@tests/integrationtest/r/executor/show.result`:
- Around line 759-763: The snapshot contains unresolved merge markers between
the builtin names "master_pos_wait" and "match_against"; open the expected
result file (tests/integrationtest/r/executor/show.result), remove the conflict
markers and choose the correct builtin entry (either keep "master_pos_wait" or
"match_against" as appropriate for the current codebase), then regenerate and
re-run the integration test to verify the SHOW BUILTINS golden output is updated
and committed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 899a881a-980a-4b99-8fdd-a1e3422bd6f1
📒 Files selected for processing (24)
.gitignore
pkg/expression/BUILD.bazel
pkg/expression/builtin.go
pkg/expression/builtin_fts.go
pkg/expression/builtin_threadunsafe_generated.go
pkg/expression/distsql_builtin.go
pkg/expression/fts_to_like.go
pkg/expression/fts_to_like_test.go
pkg/expression/function_traits_test.go
pkg/expression/infer_pushdown.go
pkg/expression/integration_test/integration_test.go
pkg/parser/ast/functions.go
pkg/planner/cardinality/selectivity.go
pkg/planner/core/BUILD.bazel
pkg/planner/core/expression_rewriter.go
pkg/planner/core/fulltext_to_like.go
pkg/planner/core/fulltext_to_like_test.go
pkg/planner/core/planbuilder.go
pkg/planner/optimize.go
pkg/planner/util/null_misc_test.go
pkg/sessionctx/stmtctx/stmtctx.go
tests/integrationtest/r/executor/show.result
tests/integrationtest/r/planner/core/fulltext_search.result
tests/integrationtest/t/planner/core/fulltext_search.test
<<<<<<< HEAD
*.log.json
genkeyword
test_coverage
coverage.dat
=======

# Integration tests
tests/integrationtest/integration-test.out
tests/integrationtest/integrationtest_tidb-server
tests/integrationtest/s/
tests/integrationtest/replayer/

# Local dev artifacts
bench_daily.json
compose-dev.yaml
fix.sql
export-20*/
var

# Personal config files
/*config.toml
.cache

# Claude Code runtime state (per-user, not part of repo)
.claude/scheduled_tasks.lock
.claude/settings.local.json
>>>>>>> f96cd1c2fd5 (planner: rewrite FTS predicates to LIKE for evaluation of non-TiCI query plan (#65626))
Resolve the leftover cherry-pick conflict in .gitignore.
This hunk still contains Git conflict markers and both sides of the merge. Please collapse it to the intended final ignore list before merging; otherwise the file stays corrupted and the added ignore rules are not reviewable.
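Since the same kind of leftover marker shows up in most files of this cherry-pick, a small scan before pushing could catch them early. The sketch below is a hypothetical helper, not part of this PR or of TiDB tooling. It reports the 1-based line numbers that look like conflict markers, assuming the default marker style (`<<<<<<< `, `=======`, `>>>>>>> `).

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// conflictMarkerLines returns the 1-based numbers of lines in src that
// look like Git conflict markers. It is a hypothetical helper for
// illustration. bufio.Scanner's default buffer limits line length,
// which is fine for a sketch but not for arbitrary files.
func conflictMarkerLines(src string) []int {
	var hits []int
	sc := bufio.NewScanner(strings.NewReader(src))
	for n := 1; sc.Scan(); n++ {
		line := sc.Text()
		if strings.HasPrefix(line, "<<<<<<< ") ||
			line == "=======" ||
			strings.HasPrefix(line, ">>>>>>> ") {
			hits = append(hits, n)
		}
	}
	return hits
}

func main() {
	src := "coverage.dat\n<<<<<<< HEAD\n*.log.json\n=======\n.cache\n>>>>>>> f96cd1c2fd5\nvar\n"
	fmt.Println(conflictMarkerLines(src))
}
```

A bare `=======` line can also appear legitimately in some text files, so hits from a scan like this are candidates to inspect rather than proof of a conflict.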
<<<<<<< HEAD
=======
        "fts_helper.go",
        "fts_to_like.go",
>>>>>>> f96cd1c2fd5 (planner: rewrite FTS predicates to LIKE for evaluation of non-TiCI query plan (#65626))
Fix this Bazel srcs block before merge.
This hunk still has cherry-pick markers, and builtin_fts.go is also missing from go_library.srcs. As written, Bazel will either fail to parse the BUILD file or fail to compile pkg/expression once the FTS registrations reference the new types.
<<<<<<< HEAD
=======
	// fts functions
	ast.FTSMatchWord:         &ftsMatchWordFunctionClass{baseFunctionClass{ast.FTSMatchWord, 2, 2}},
	ast.FTSMysqlMatchAgainst: &ftsMysqlMatchAgainstFunctionClass{baseFunctionClass{ast.FTSMysqlMatchAgainst, 2, -1}},

>>>>>>> f96cd1c2fd5 (planner: rewrite FTS predicates to LIKE for evaluation of non-TiCI query plan (#65626))
Remove the unresolved conflict markers in the builtin registry.
The funcs map still includes cherry-pick markers here, which makes pkg/expression fail to build.
<<<<<<< HEAD
=======
	case tipb.ScalarFuncSig_FTSMatchWord:
		f = &builtinFtsMatchWordSig{base}
	case tipb.ScalarFuncSig_FTSMatchExpression:
		// NOTE: builtinFtsMysqlMatchAgainstSig.modifier is not serialized in the
		// protobuf encoding because the tipb schema has no FTS metadata message.
		// The reconstructed sig therefore uses the zero modifier value
		// (FulltextSearchModifierNaturalLanguageMode). TiFlash must derive the
		// search mode from other context when executing this expression.
		f = &builtinFtsMysqlMatchAgainstSig{baseBuiltinFunc: base}
>>>>>>> f96cd1c2fd5 (planner: rewrite FTS predicates to LIKE for evaluation of non-TiCI query plan (#65626))
Clean up the cherry-pick markers in the DistSQL signature switch.
The unresolved <<<<<<< / ======= / >>>>>>> text leaves this switch syntactically invalid and blocks compilation.
// isFTSWordByte returns true for alphanumeric ASCII and non-ASCII bytes.
// Punctuation including underscore is NOT a word character, consistent with
// MySQL's built-in FTS tokenizer which treats _ as a word separator. Used by
// ValidateFTSSearchStringForLikeFallback to gate the LIKE rewrite.
func isFTSWordByte(c byte) bool {
	return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9') || c > 127
}
Tighten the non-ASCII token validation.
isFTSWordByte currently accepts any byte > 127, so terms containing non-ASCII punctuation/symbols will pass the “strict subset” gate and get rewritten to ILIKE. For example, a token with full-width punctuation is not alphanumeric, but this validator will still accept it byte-by-byte. That widens the fallback beyond the PR contract and can produce incorrect rewrites instead of leaving the query on the native/error path.
Use rune-level validation for letters/digits (or the repo’s tokenizer equivalent) instead of treating every non-ASCII byte as a word character.
Also applies to: 119-141
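A rune-level check along the lines suggested here might look like the following. This is a minimal sketch, not the PR's code: `isFTSWordRune` and `isPlainFTSTerm` are hypothetical names, and using `unicode.IsLetter` / `unicode.IsDigit` as the definition of a word character is an assumption that may not match the tokenizer the native FTS path uses.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// isFTSWordRune is a hypothetical rune-based replacement for
// isFTSWordByte: only Unicode letters and decimal digits count as
// word characters, so non-ASCII punctuation is rejected.
func isFTSWordRune(r rune) bool {
	return unicode.IsLetter(r) || unicode.IsDigit(r)
}

// isPlainFTSTerm reports whether term is non-empty, valid UTF-8, and
// made up only of word runes. Iterating with range decodes runes
// rather than bytes.
func isPlainFTSTerm(term string) bool {
	if term == "" || !utf8.ValidString(term) {
		return false
	}
	for _, r := range term {
		if !isFTSWordRune(r) {
			return false
		}
	}
	return true
}

func main() {
	for _, term := range []string{"abc123", "数据库", "foo，bar", "foo_bar"} {
		fmt.Printf("%q plain=%v\n", term, isPlainFTSTerm(term))
	}
}
```

One consequence of this choice is that combining marks are neither letters nor digits, so terms written with decomposed accents would be rejected and stay on the native path. That errs on the conservative side of the gate.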
<<<<<<< HEAD
	// SavedViews is a stack that saves all views when traversing the AST. We depend on it to:
	// 1. know whether the AST node is under a view
	// 2. report precise error in appendColNamesToVisitInfo.
	SavedViews []*ast.TableName
=======
	// nonViableFTSMatch is set during build when the expression rewriter
	// encounters a predicate-context MATCH...AGAINST whose native form
	// (FTSMysqlMatchAgainst) cannot be executed — the matched columns lack a
	// public FULLTEXT index on a TiFlash-backed table, or the modifier is not
	// supported by pushdown. The flag is read by the alternative-rounds driver
	// after the round to invalidate the round's plan and trigger the
	// fts-like-fallback round (see optimize.go).
	nonViableFTSMatch bool

	// predicateMatchSeen is set during build when the expression rewriter
	// encounters a direct-boolean-context MATCH...AGAINST (one whose 0/1 boolean
	// result is consumed directly as a predicate). The alternative-rounds driver
	// uses this to enable the fts-like-fallback round even when round 1's
	// native plan is executable, so the LIKE-based plan can compete on cost.
	predicateMatchSeen bool
}

// HasNonViableFTSMatch reports whether the most recent build round saw a
// predicate-context MATCH...AGAINST that could not be served by the native
// FTSMysqlMatchAgainst builtin. The caller (optimize.go) uses this to
// invalidate the round's plan and trigger the fts-like-fallback round.
func (b *PlanBuilder) HasNonViableFTSMatch() bool {
	return b.nonViableFTSMatch
}

// MarkNonViableFTSMatch records that a predicate-context MATCH...AGAINST in
// the current build cannot be served natively. See HasNonViableFTSMatch.
func (b *PlanBuilder) MarkNonViableFTSMatch() {
	b.nonViableFTSMatch = true
}

// HasPredicateMatch reports whether the most recent build round saw a
// direct-boolean-context MATCH...AGAINST. The caller (optimize.go) uses this
// to decide whether to run the fts-like-fallback round for cost competition,
// independent of whether round 1's native plan is executable.
func (b *PlanBuilder) HasPredicateMatch() bool {
	return b.predicateMatchSeen
}

// MarkPredicateMatch records that the current build encountered a
// direct-boolean-context MATCH...AGAINST. See HasPredicateMatch.
func (b *PlanBuilder) MarkPredicateMatch() {
	b.predicateMatchSeen = true
>>>>>>> f96cd1c2fd5 (planner: rewrite FTS predicates to LIKE for evaluation of non-TiCI query plan (#65626))
Resolve the cherry-pick conflict before merge.
This hunk still contains Git conflict markers and two competing PlanBuilder definitions, so planbuilder.go is syntactically invalid and won't compile until the conflict is resolved.
<<<<<<< HEAD
// optimizeCnt is a global variable only used for test.
var optimizeCnt int

=======
type logicalPlanBuildCtx struct {
	stmtCtxState             stmtctx.LogicalPlanBuildState
	plannerSelectBlockAsName *[]ast.HintTable
	mapScalarSubQ            []any
	mapHashCode2UniqueID     map[string]int
	rewritePhaseInfo         variable.RewritePhaseInfo
}

func saveLogicalPlanBuildCtx(sessVars *variable.SessionVars) logicalPlanBuildCtx {
	return logicalPlanBuildCtx{
		stmtCtxState:             sessVars.StmtCtx.SaveLogicalPlanBuildState(),
		plannerSelectBlockAsName: sessVars.PlannerSelectBlockAsName.Load(),
		mapScalarSubQ:            sessVars.MapScalarSubQ,
		mapHashCode2UniqueID:     sessVars.MapHashCode2UniqueID4ExtendedCol,
		rewritePhaseInfo:         sessVars.RewritePhaseInfo,
	}
}

func restoreLogicalPlanBuildCtx(sessVars *variable.SessionVars, logicalPlanCtx logicalPlanBuildCtx) {
	sessVars.StmtCtx.RestoreLogicalPlanBuildState(logicalPlanCtx.stmtCtxState)
	sessVars.PlannerSelectBlockAsName.Store(logicalPlanCtx.plannerSelectBlockAsName)
	sessVars.MapScalarSubQ = logicalPlanCtx.mapScalarSubQ
	sessVars.MapHashCode2UniqueID4ExtendedCol = logicalPlanCtx.mapHashCode2UniqueID
	sessVars.RewritePhaseInfo = logicalPlanCtx.rewritePhaseInfo
}

func buildAndOptimizeLogicalPlanRound(
	ctx context.Context,
	sctx planctx.PlanContext,
	node *resolve.NodeW,
	is infoschema.InfoSchema,
	hintProcessor *hint.QBHintHandler,
	checked *bool,
	optimizeStarted *bool,
	beginOpt *time.Time,
	needRestoreLogicalPlanCtx bool,
	bestPlan *base.PhysicalPlan,
	bestNames *types.NameSlice,
	bestCost *float64,
	bestLogicalPlanCtx *logicalPlanBuildCtx,
	optFlagAdjust func(uint64) uint64,
) (base.Plan, types.NameSlice, bool, error) {
	builder := planBuilderPool.Get().(*core.PlanBuilder)
	defer planBuilderPool.Put(builder.ResetForReuse())
	// TODO: when buildRound > 1, only emit unused view-hint warnings for the winner build.
	defer builder.HandleUnusedViewHints()

	builder.Init(sctx, is, hintProcessor)

	// todo: you can customize each round's special builder (like semi join rewrite or not by signal)
	p, err := buildLogicalPlan(ctx, sctx, node, builder)
	if err != nil {
		return nil, nil, false, err
	}
	names := p.OutputNames()

	if !*checked {
		// Keep privilege and lock checks fail-fast. These depend on visitInfo
		// produced by the logical build, but not on the later cost winner.
		if pm := privilege.GetPrivilegeManager(sctx); pm != nil {
			visitInfo := core.VisitInfo4PrivCheck(ctx, is, node.Node, builder.GetVisitInfo())
			if err := core.CheckPrivilege(sctx.GetSessionVars().ActiveRoles, pm, visitInfo); err != nil {
				return nil, nil, false, err
			}
		}

		if err := core.CheckTableLock(sctx, is, builder.GetVisitInfo()); err != nil {
			return nil, nil, false, err
		}

		if err := core.CheckTableMode(node); err != nil {
			return nil, nil, false, err
		}
		*checked = true
	}

	// Handle the non-logical plan statement.
	logic, isLogicalPlan := p.(base.LogicalPlan)
	if !isLogicalPlan {
		return p, names, true, nil
	}

	core.RecheckCTE(logic)

	// todo: also you can customize each round's special logical opt flag here (like decorrelate rule or not)
	if !*optimizeStarted {
		*optimizeStarted = true
		*beginOpt = time.Now()
	}
	optFlag := builder.GetOptFlag()
	if sctx.GetSessionVars().EnableAlternativeLogicalPlans &&
		optFlag&rule.FlagPushDownTopN > 0 &&
		optFlag&rule.FlagJoinReOrder > 0 {
		sctx.GetSessionVars().StmtCtx.MarkAlternativeLogicalPlanOrderAwareJoinReorder()
	}
	if optFlagAdjust != nil {
		optFlag = optFlagAdjust(optFlag)
	}
	finalPlan, cost, err := core.DoOptimize(ctx, sctx, optFlag, logic)
	if err != nil {
		return nil, nil, false, err
	}

	// Record predicate-context MATCH for cost competition. The fts-like-fallback
	// alternative round reads this signal to decide whether to build a competing
	// ILIKE-based plan alongside round 1's native plan, so the cheaper of the
	// two wins via the normal alt-rounds cost comparison.
	if builder.HasPredicateMatch() {
		sctx.GetSessionVars().StmtCtx.AlternativeLogicalPlanHasPredicateContextMatch = true
	}

	// If this round saw a predicate-context MATCH that cannot be served by the
	// native FTSMysqlMatchAgainst builtin, the produced plan would fail at
	// execution. Discard it and arm AlternativeLogicalPlanFTSLikeFallback so
	// any intervening rounds (correlate, etc.) re-rewrite with ILIKE too. The
	// fts-like-fallback round below also forces this flag during setup; this
	// outer assignment covers the non-viable case where the flag must stay
	// true across all subsequent rounds, not just inside the LIKE round.
	if builder.HasNonViableFTSMatch() {
		sctx.GetSessionVars().StmtCtx.AlternativeLogicalPlanFTSLikeFallback = true
		return p, names, false, nil
	}

	if *bestPlan == nil || cost < *bestCost {
		*bestCost = cost
		*bestPlan = finalPlan
		*bestNames = names
		if needRestoreLogicalPlanCtx {
			*bestLogicalPlanCtx = saveLogicalPlanBuildCtx(sctx.GetSessionVars())
		}
	}
	return p, names, false, nil
}

// optimizeCnt is a global variable only used for test.
var optimizeCnt int

func shouldTryNonDecorrelationRound(sessVars *variable.SessionVars) bool {
	return sessVars.EnableAlternativeLogicalPlans &&
		sessVars.StmtCtx.AlternativeLogicalPlanDecorrelatedApply &&
		!sessVars.StmtCtx.AlternativeLogicalPlanSameOrderIndexJoin
}

func shouldTryOrderAwareReorderRound(sessVars *variable.SessionVars) bool {
	return sessVars.EnableAlternativeLogicalPlans &&
		sessVars.StmtCtx.AlternativeLogicalPlanOrderAwareJoinReorder
}

func shouldTryCorrelateRound(sessVars *variable.SessionVars) bool {
	return sessVars.EnableAlternativeLogicalPlans &&
		sessVars.StmtCtx.AlternativeLogicalPlanPreferCorrelate
}

// alternativeRound describes one alternative logical-plan round.
// adjustFlag adjusts the optimization flags for the round.
// enabled returns true when the round should be attempted.
// setup/cleanup optionally modify session state before/after plan building.
type alternativeRound struct {
	name       string
	adjustFlag func(uint64) uint64
	enabled    func(*variable.SessionVars) bool
	setup      func(*variable.SessionVars)
	cleanup    func(*variable.SessionVars)
}

// savedEnableCorrelateSubquery holds the pre-round value of
// EnableCorrelateSubquery so setup/cleanup can share it without a closure
// wrapper. Safe because optimize is single-threaded per session.
var savedEnableCorrelateSubquery bool

// savedFTSLikeFallback holds the pre-round value of
// AlternativeLogicalPlanFTSLikeFallback so the fts-like-fallback round's
// setup/cleanup can restore it after running with the flag forced on. Safe
// because optimize is single-threaded per session.
var savedFTSLikeFallback bool

var alternativeRounds = [...]alternativeRound{
	{
		name:       "non-decorrelate",
		adjustFlag: func(flag uint64) uint64 { return flag &^ rule.FlagDecorrelate },
		enabled:    shouldTryNonDecorrelationRound,
	},
	{
		name:       "order-aware-reorder",
		adjustFlag: func(flag uint64) uint64 { return flag | rule.FlagOrderAwareJoinReorder },
		enabled:    shouldTryOrderAwareReorderRound,
	},
	{
		name:       "correlate",
		adjustFlag: func(flag uint64) uint64 { return flag | rule.FlagCorrelate },
| enabled: shouldTryCorrelateRound, | ||
| setup: func(sv *variable.SessionVars) { | ||
| savedEnableCorrelateSubquery = sv.EnableCorrelateSubquery | ||
| sv.EnableCorrelateSubquery = true | ||
| }, | ||
| cleanup: func(sv *variable.SessionVars) { | ||
| sv.EnableCorrelateSubquery = savedEnableCorrelateSubquery | ||
| }, | ||
| }, | ||
| { | ||
| // fts-like-fallback: rebuild the plan rewriting predicate-context | ||
| // MATCH...AGAINST to ILIKE so it can compete with round 1's native plan | ||
| // on cost (and serve as the only valid plan when native is non-viable). | ||
| // Round 1 always uses the native builtin (same as Alt-disabled). This | ||
| // round fires whenever round 1 saw a direct-boolean-context MATCH | ||
| // (HasPredicateContextMatch) — both plans then compete via the strict-`<` | ||
| // cost comparison in buildAndOptimizeLogicalPlanRound — or whenever | ||
| // round 1 saw a MATCH whose native form cannot execute | ||
| // (FTSLikeFallback, set by the round driver after discarding round 1). | ||
| // In the discard case, round 1's plan is unavailable and this round's | ||
| // plan wins by default. | ||
| name: "fts-like-fallback", | ||
| enabled: func(sv *variable.SessionVars) bool { | ||
| if !sv.EnableAlternativeLogicalPlans { | ||
| return false | ||
| } | ||
| return sv.StmtCtx.AlternativeLogicalPlanFTSLikeFallback || | ||
| sv.StmtCtx.AlternativeLogicalPlanHasPredicateContextMatch | ||
| }, | ||
| setup: func(sv *variable.SessionVars) { | ||
| savedFTSLikeFallback = sv.StmtCtx.AlternativeLogicalPlanFTSLikeFallback | ||
| sv.StmtCtx.AlternativeLogicalPlanFTSLikeFallback = true | ||
| }, | ||
| cleanup: func(sv *variable.SessionVars) { | ||
| sv.StmtCtx.AlternativeLogicalPlanFTSLikeFallback = savedFTSLikeFallback | ||
| }, | ||
| }, | ||
| } | ||
|
|
||
| >>>>>>> f96cd1c2fd5 (planner: rewrite FTS predicates to LIKE for evaluation of non-TiCI query plan (#65626)) |
Finish resolving the cherry-pick in this file.
This file still contains merge markers, so it will not compile as-is. It also leaves the new optimizer path half-merged, including references like stmtctx / rule that are not integrated cleanly yet.
Also applies to: 719-899
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/planner/optimize.go` around lines 469 - 703, The file contains unresolved
git conflict markers (<<<<<<<, =======, >>>>>>>) leaving a half-merged optimizer
refactor; remove the conflict markers and produce a single coherent version by
keeping the new logicalPlanBuildCtx, saveLogicalPlanBuildCtx,
restoreLogicalPlanBuildCtx and buildAndOptimizeLogicalPlanRound implementations
(and only one declaration of optimizeCnt), deleting the duplicate old fragment,
and ensure references to stmtctx, rule and any new symbols are correctly
imported/used; verify alternativeRounds and related helpers
(shouldTryNonDecorrelationRound, shouldTryOrderAwareReorderRound,
shouldTryCorrelateRound, savedEnableCorrelateSubquery, savedFTSLikeFallback) are
present and consistent with the rest of the file so the file compiles.
| // savedEnableCorrelateSubquery holds the pre-round value of | ||
| // EnableCorrelateSubquery so setup/cleanup can share it without a closure | ||
| // wrapper. Safe because optimize is single-threaded per session. | ||
| var savedEnableCorrelateSubquery bool | ||
|
|
||
| // savedFTSLikeFallback holds the pre-round value of | ||
| // AlternativeLogicalPlanFTSLikeFallback so the fts-like-fallback round's | ||
| // setup/cleanup can restore it after running with the flag forced on. Safe | ||
| // because optimize is single-threaded per session. | ||
| var savedFTSLikeFallback bool | ||
|
|
||
| var alternativeRounds = [...]alternativeRound{ | ||
| { | ||
| name: "non-decorrelate", | ||
| adjustFlag: func(flag uint64) uint64 { return flag &^ rule.FlagDecorrelate }, | ||
| enabled: shouldTryNonDecorrelationRound, | ||
| }, | ||
| { | ||
| name: "order-aware-reorder", | ||
| adjustFlag: func(flag uint64) uint64 { return flag | rule.FlagOrderAwareJoinReorder }, | ||
| enabled: shouldTryOrderAwareReorderRound, | ||
| }, | ||
| { | ||
| name: "correlate", | ||
| adjustFlag: func(flag uint64) uint64 { return flag | rule.FlagCorrelate }, | ||
| enabled: shouldTryCorrelateRound, | ||
| setup: func(sv *variable.SessionVars) { | ||
| savedEnableCorrelateSubquery = sv.EnableCorrelateSubquery | ||
| sv.EnableCorrelateSubquery = true | ||
| }, | ||
| cleanup: func(sv *variable.SessionVars) { | ||
| sv.EnableCorrelateSubquery = savedEnableCorrelateSubquery | ||
| }, | ||
| }, | ||
| { | ||
| // fts-like-fallback: rebuild the plan rewriting predicate-context | ||
| // MATCH...AGAINST to ILIKE so it can compete with round 1's native plan | ||
| // on cost (and serve as the only valid plan when native is non-viable). | ||
| // Round 1 always uses the native builtin (same as Alt-disabled). This | ||
| // round fires whenever round 1 saw a direct-boolean-context MATCH | ||
| // (HasPredicateContextMatch) — both plans then compete via the strict-`<` | ||
| // cost comparison in buildAndOptimizeLogicalPlanRound — or whenever | ||
| // round 1 saw a MATCH whose native form cannot execute | ||
| // (FTSLikeFallback, set by the round driver after discarding round 1). | ||
| // In the discard case, round 1's plan is unavailable and this round's | ||
| // plan wins by default. | ||
| name: "fts-like-fallback", | ||
| enabled: func(sv *variable.SessionVars) bool { | ||
| if !sv.EnableAlternativeLogicalPlans { | ||
| return false | ||
| } | ||
| return sv.StmtCtx.AlternativeLogicalPlanFTSLikeFallback || | ||
| sv.StmtCtx.AlternativeLogicalPlanHasPredicateContextMatch | ||
| }, | ||
| setup: func(sv *variable.SessionVars) { | ||
| savedFTSLikeFallback = sv.StmtCtx.AlternativeLogicalPlanFTSLikeFallback | ||
| sv.StmtCtx.AlternativeLogicalPlanFTSLikeFallback = true | ||
| }, | ||
| cleanup: func(sv *variable.SessionVars) { | ||
| sv.StmtCtx.AlternativeLogicalPlanFTSLikeFallback = savedFTSLikeFallback | ||
| }, |
Keep per-round saved state out of package globals.
savedEnableCorrelateSubquery and savedFTSLikeFallback are shared across all sessions. Two concurrent Optimize calls can overwrite each other's saved values and restore the wrong session state during cleanup, which is both a race and a correctness bug. Make this saved state per-optimize invocation instead.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/planner/optimize.go` around lines 639 - 699, The two package-level flags
savedEnableCorrelateSubquery and savedFTSLikeFallback are unsafe because they
are shared across sessions; make the saved state local to each optimize
invocation by removing those globals and storing the saved values per-round
instead (either add fields like
savedEnableCorrelateSubquery/savedFTSLikeFallback to the alternativeRound struct
or rebuild alternativeRounds inside optimize() so each round’s setup/cleanup
closures capture local variables). Update the correlate round’s setup/cleanup
and the fts-like-fallback round’s setup/cleanup to read/write the saved value
from the per-round storage (or captured local) and ensure optimize() uses the
per-invocation alternativeRounds so concurrent Optimize calls don’t overwrite
each other.
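One way to read the suggestion above is to rebuild the round list on every optimize call, so each setup/cleanup pair captures its own locals. The sketch below is only an illustration: `sessionVars`, `round`, and `newRounds` are hypothetical stand-ins, not the actual TiDB types or the fix the PR will adopt.

```go
package main

import "fmt"

// sessionVars is a hypothetical stand-in for variable.SessionVars.
type sessionVars struct {
	EnableCorrelateSubquery bool
	FTSLikeFallback         bool
}

// round mirrors only the setup/cleanup part of alternativeRound.
type round struct {
	name    string
	setup   func(*sessionVars)
	cleanup func(*sessionVars)
}

// newRounds builds a fresh round list for one optimize invocation. The saved
// values live in closure-captured locals, so two concurrent invocations
// cannot overwrite each other's saved state.
func newRounds() []round {
	var savedCorrelate, savedFallback bool
	return []round{
		{
			name: "correlate",
			setup: func(sv *sessionVars) {
				savedCorrelate = sv.EnableCorrelateSubquery
				sv.EnableCorrelateSubquery = true
			},
			cleanup: func(sv *sessionVars) { sv.EnableCorrelateSubquery = savedCorrelate },
		},
		{
			name: "fts-like-fallback",
			setup: func(sv *sessionVars) {
				savedFallback = sv.FTSLikeFallback
				sv.FTSLikeFallback = true
			},
			cleanup: func(sv *sessionVars) { sv.FTSLikeFallback = savedFallback },
		},
	}
}

func main() {
	a := &sessionVars{EnableCorrelateSubquery: false}
	b := &sessionVars{EnableCorrelateSubquery: true}
	ra, rb := newRounds(), newRounds()

	// Interleave two sessions the way concurrent Optimize calls might.
	ra[0].setup(a)
	rb[0].setup(b)
	ra[0].cleanup(a)
	rb[0].cleanup(b)

	fmt.Println(a.EnableCorrelateSubquery, b.EnableCorrelateSubquery)
}
```

With a single package-level variable, the interleaving in `main` would restore session `a` to the value saved for session `b`; with per-invocation closures each session gets its own value back.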
| <<<<<<< HEAD | ||
| ======= | ||
| // AlternativeLogicalPlanDecorrelatedApply indicates whether the current logical | ||
| // optimization round decorrelated at least one Apply into Join. | ||
| AlternativeLogicalPlanDecorrelatedApply bool | ||
| // AlternativeLogicalPlanSameOrderIndexJoin indicates whether the current first | ||
| // round already produced a same-order index join candidate for a decorrelated Apply. | ||
| AlternativeLogicalPlanSameOrderIndexJoin bool | ||
| // AlternativeLogicalPlanOrderAwareJoinReorder indicates whether at least one | ||
| // logical build round produced an order-aware join reorder candidate that is | ||
| // worth exploring in a dedicated alternative round. | ||
| AlternativeLogicalPlanOrderAwareJoinReorder bool | ||
| // AlternativeLogicalPlanPreferCorrelate indicates whether the current logical | ||
| // build round encountered a non-correlated IN subquery eligible for the | ||
| // correlate-to-Apply alternative. | ||
| AlternativeLogicalPlanPreferCorrelate bool | ||
| // AlternativeLogicalPlanFTSLikeFallback is a mode flag controlling how the | ||
| // expression rewriter handles MATCH...AGAINST in predicate contexts. When | ||
| // false (the default, matching Alt-disabled behavior) the rewriter emits | ||
| // the native FTSMysqlMatchAgainst builtin. When true, the rewriter emits | ||
| // ILIKE-based predicates instead. | ||
| // | ||
| // Round 1 always runs with this flag false. The "fts-like-fallback" | ||
| // alternative round flips it to true (via its setup/cleanup) while it | ||
| // builds a competing ILIKE-based plan; the cost-cheapest plan wins via the | ||
| // normal alt-rounds cost comparison. If round 1's build records a | ||
| // predicate-context MATCH that cannot be served natively (no FTS index on a | ||
| // matched column / no TiFlash replica / modifier not pushdown-supported), | ||
| // optimize.go additionally invalidates round 1's plan and forces this flag | ||
| // true outside the round so any intervening rounds (correlate, etc.) also | ||
| // produce executable LIKE-based plans. | ||
| AlternativeLogicalPlanFTSLikeFallback bool | ||
| // AlternativeLogicalPlanHasPredicateContextMatch indicates that round 1 | ||
| // encountered a direct-boolean-context MATCH...AGAINST. The round driver | ||
| // uses this to enable the fts-like-fallback round for cost competition even | ||
| // when round 1's native plan is executable. | ||
| AlternativeLogicalPlanHasPredicateContextMatch bool | ||
| >>>>>>> f96cd1c2fd5 (planner: rewrite FTS predicates to LIKE for evaluation of non-TiCI query plan (#65626)) |
Resolve the leftover cherry-pick conflict markers.
Lines 449 and 613 still contain <<<<<<< / ======= / >>>>>>>, so this file will not parse and the new StatementContext fields/methods are only half-merged.
Also applies to: 613-689
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/sessionctx/stmtctx/stmtctx.go` around lines 449 - 486, The file contains
unresolved git conflict markers (<<<<<<<, =======, >>>>>>>) around the
StatementContext additions, leaving the struct half-merged; remove the conflict
markers and ensure the full set of new fields
(AlternativeLogicalPlanDecorrelatedApply,
AlternativeLogicalPlanSameOrderIndexJoin,
AlternativeLogicalPlanOrderAwareJoinReorder,
AlternativeLogicalPlanPreferCorrelate, AlternativeLogicalPlanFTSLikeFallback,
AlternativeLogicalPlanHasPredicateContextMatch) are present exactly once in the
StatementContext definition (and any related setup/cleanup code blocks), delete
any duplicate or partial blocks from the other branch, and run go build/go vet
to verify the file parses cleanly.
| <<<<<<< HEAD | ||
| master_pos_wait | ||
| ======= | ||
| match_against | ||
| >>>>>>> f96cd1c2fd5 (planner: rewrite FTS predicates to LIKE for evaluation of non-TiCI query plan (#65626)) |
Fix the unresolved merge conflict in the expected result snapshot.
These conflict markers make the SHOW BUILTINS golden output invalid, so the integration test can no longer verify a real result set. Resolve the hunk to the correct builtin list and regenerate/verify the snapshot.
As per coding guidelines, "Integration test files (tests/integrationtest/t/**) changed: record and verify regenerated result correctness".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/integrationtest/r/executor/show.result` around lines 759 - 763, The
snapshot contains unresolved merge markers between the builtin names
"master_pos_wait" and "match_against"; open the expected result file
(tests/integrationtest/r/executor/show.result), remove the conflict markers and
choose the correct builtin entry (either keep "master_pos_wait" or
"match_against" as appropriate for the current codebase), then regenerate and
re-run the integration test to verify the SHOW BUILTINS golden output is updated
and committed.
This is an automated cherry-pick of #65626
What problem does this PR solve?
Issue Number: close #68153
Problem Summary:
What changed and how does it work?
Summary
When tidb_opt_enable_alternative_logical_plans=ON, adds a fallback that rewrites MATCH ... AGAINST to case-insensitive ILIKE predicates so full-text-search queries can execute without an FTS index. The rewrite is intentionally conservative: it only fires for a strict subset of search strings in direct-boolean predicate positions, and reaches the plan only when the round-1 native path is not viable. Anything outside that envelope either keeps the native builtin (errors at execution without an FTS index) or is rejected at plan time, never silently producing wrong rows.
Architecture
Round 1 (default, matches Alt-disabled behavior) — emits the native FTSMysqlMatchAgainst builtin. The expression rewriter records nonViableFTSMatch on the PlanBuilder when a direct-boolean-context MATCH cannot be served natively (no FTS index on a TiFlash replica, or a non-pushdown-safe modifier).
Round 2: fts-like-fallback — fires only when round 1 reported a non-viable MATCH. The driver discards round 1's plan and re-runs the build with AlternativeLogicalPlanFTSLikeFallback=true, which switches the rewriter to ILIKE in direct-boolean positions. If this round also errors (e.g. unsupported search string with no FTS-index rescue), lastAltRoundErr propagates the message instead of the generic "failed to build logical plan" sentinel.
A single flag (AlternativeLogicalPlanFTSLikeFallback) drives the dispatch; viability state stays local to the build on PlanBuilder.nonViableFTSMatch.
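The round selection described here, combined with the cost-competition comments in the optimize.go hunk, can be sketched roughly as follows. This is a simplified model under stated assumptions: `roundResult` and `choosePlan` are hypothetical names, plans are plain strings, and keeping the native plan when the fallback round fails in the competitive case is an assumption rather than something the PR states.

```go
package main

import (
	"errors"
	"fmt"
)

// roundResult is a hypothetical summary of one build-and-optimize round.
type roundResult struct {
	plan      string
	cost      float64
	hasMatch  bool // round saw a direct-boolean-context MATCH
	nonViable bool // native MATCH cannot execute, so the plan must be discarded
	err       error
}

// choosePlan runs the native round first and the ILIKE fallback round only
// when round 1 saw a predicate-context MATCH or reported a non-viable one.
func choosePlan(native, fallback func() roundResult) (string, error) {
	r1 := native()
	if r1.err != nil {
		return "", r1.err
	}
	if !r1.nonViable && !r1.hasMatch {
		return r1.plan, nil // no FTS predicate: nothing to compete with
	}
	r2 := fallback()
	if r1.nonViable {
		// Round 1's plan was discarded, so the fallback is the only candidate
		// and its specific error is the one worth reporting.
		if r2.err != nil {
			return "", r2.err
		}
		return r2.plan, nil
	}
	// Both plans are executable: strict "<" keeps the native plan on ties.
	if r2.err == nil && r2.cost < r1.cost {
		return r2.plan, nil
	}
	return r1.plan, nil
}

func main() {
	nonViable := func() roundResult { return roundResult{plan: "native", nonViable: true} }
	ilike := func() roundResult { return roundResult{plan: "ilike", cost: 10} }
	rejected := func() roundResult {
		return roundResult{err: errors.New("search string not supported by fallback")}
	}

	p, err := choosePlan(nonViable, ilike)
	fmt.Println(p, err)
	p, err = choosePlan(nonViable, rejected)
	fmt.Println(p, err)
}
```

Returning the fallback round's own error in the discard case corresponds to the lastAltRoundErr behavior mentioned above, where the specific message is surfaced instead of a generic sentinel.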
Where the rewrite applies
inDirectMatchBooleanContext (modeled on the existing canTreatInSubqueryAsExistsForFilter) walks the AST ancestor stack and accepts only:
Everything else — MATCH ... > 0.5, MATCH ... = 0, MATCH ... IS NULL, MATCH inside CASE WHEN, arithmetic, scoring (SELECT field list, ORDER BY), etc. — keeps the native builtin, which preserves the float relevance score and errors at execution if no FTS index exists. Substituting a 0/1 integer in those positions would silently corrupt the comparison or sort.
Strict search-string subset
ValidateFTSSearchStringForLikeFallback rejects anything that would tokenize differently in MySQL FTS than a substring ILIKE match. Accepted by mode:
Rejected at plan time with error 1235 (ErrNotSupportedYet): phrases "...", prefix wildcard term*, relevance modifiers > < ~, grouping (...), mid-word punctuation like xx-yy, and any token containing %, _, , ,, ., :, etc. WITH QUERY EXPANSION is likewise rejected (no ILIKE approximation exists).
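To make the rejection rules concrete, here is a deliberately over-strict sketch of a natural-language-mode check. It is not the PR's ValidateFTSSearchStringForLikeFallback: the function name, the error value, and the "letters and digits only" rule are assumptions chosen so that every construct listed above is rejected. Boolean-mode operators and empty-input handling are not modeled.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"unicode"
)

// errNotSupported is a hypothetical stand-in for error 1235 (ErrNotSupportedYet).
var errNotSupported = errors.New("search string not supported by ILIKE fallback")

// validateSearchString accepts whitespace-separated tokens made only of
// letters and digits. Phrases, trailing '*', the '>', '<', '~' modifiers,
// grouping, and mid-word punctuation all contain some other rune, so they
// are rejected by the same check.
func validateSearchString(s string) error {
	for _, tok := range strings.Fields(s) {
		for _, r := range tok {
			if !unicode.IsLetter(r) && !unicode.IsDigit(r) {
				return fmt.Errorf("%w: %q in token %q", errNotSupported, r, tok)
			}
		}
	}
	return nil
}

func main() {
	for _, s := range []string{"hello world", `"exact phrase"`, "term*", "xx-yy"} {
		err := validateSearchString(s)
		fmt.Printf("%-16q rejected=%v\n", s, errors.Is(err, errNotSupported))
	}
}
```

Rejecting at plan time rather than approximating is the design point: a token that FTS would split or weight differently is refused instead of being turned into a substring match that could return different rows.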
Modifier handling
The tipb pushdown protocol does not serialize the FTS modifier (see distsql_builtin.go), so a Boolean-mode or WITH QUERY EXPANSION MATCH pushed down to TiFlash would silently execute as natural-language mode. To prevent this, matchAgainstToBuiltin rejects non-default modifiers at plan time unless matchHasLikeFallbackRescue is true (alt enabled + direct-boolean context, where the alt-rounds driver will discard the native plan and rebuild via fts-like-fallback). In practice:
NULL search handling
MATCH(c) AGAINST(NULL) matches nothing in MySQL FTS semantics, but three-valued logic matters under NOT: native evalReal returns NULL, so NOT (NULL) evaluates to NULL and the row is filtered out. The rewrite emits Constant(NULL) rather than Constant(0) so the same semantics hold under NOT, IS NULL, etc. The plan-cache skip is set before the NULL fast-path, so a prepared statement that is first bound to NULL and then to a non-NULL value re-plans and returns correct rows instead of reusing a cached constant-false plan.
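The difference between the two constants shows up only under negation. A minimal model of SQL three-valued logic, with hypothetical names (`tri`, `not`, `keeps`) that are not TiDB code:

```go
package main

import "fmt"

// tri models a SQL three-valued boolean.
type tri int

const (
	False tri = iota
	True
	Null
)

// not implements SQL NOT: NULL stays NULL.
func not(v tri) tri {
	switch v {
	case True:
		return False
	case False:
		return True
	default:
		return Null
	}
}

// keeps reports whether a WHERE clause keeps the row: only TRUE passes.
func keeps(v tri) bool { return v == True }

func main() {
	// Rewrite to Constant(NULL): NOT NULL is NULL, so the row is filtered,
	// matching the native builtin.
	fmt.Println(keeps(not(Null))) // false

	// Rewrite to Constant(0): NOT 0 is TRUE, so every row would be kept.
	fmt.Println(keeps(not(False))) // true
}
```

Without NOT both constants filter the row, which is why the bug would be easy to miss in a test that only covers the positive predicate.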
Boolean-mode operator support (post-strict-subset)
Plan cache
Marks the plan non-cacheable when the AGAINST argument is mutable across executions (? parameter marker, user variable, deferred expression). Literal AGAINST values keep the plan cacheable. The skip runs before the NULL fast-path so a NULL first bind can't bake a constant plan that gets reused later.
Selectivity
BuildFTSToILikeExpressionFromBuiltin substitutes the equivalent ILIKE form for the opaque FTSMysqlMatchAgainst builtin so the round 1 native plan's cost reflects column histogram/TopN rather than the flat SelectivityFactor (0.8). Restricted to single-column MATCH because GetSelectivityByFilter declines multi-column expressions — a multi-column substitute would fall through to the same str-match default anyway.
Known semantic differences
These apply to ILIKE round queries only; the native path preserves full MySQL semantics:
Files changed
Test plan
parenthesized MATCH, scalar positions (IS NULL, > 0.5, = 0, CASE), non-default modifiers in scoring/scalar/alt-disabled contexts, prepared statements (literal cacheable, parameter-marker non-cacheable, NULL-first-bind re-plans).
🤖 Generated with https://claude.com/claude-code
Check List
Tests
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.
Summary by CodeRabbit
Release Notes
New Features
MATCH...AGAINST queries and FTS_MATCH_WORD operations.
Improvements
Tests