planner: rewrite FTS predicates to LIKE for evaluation of non-TiCI query plan #65626
Merged
46 commits
a5a6400
planner: rewrite FTS predicates to LIKE if no FTS index
terry1purcell 4ded0d5
build errors
terry1purcell 5051049
build errors2
terry1purcell a11e436
testcase1
terry1purcell 3379ed5
testcase2
terry1purcell ba0f3d7
review1
terry1purcell 82416d8
review2
terry1purcell f7b1fa5
review3
terry1purcell c63c6bb
review4
terry1purcell 0c74944
review5
terry1purcell dbdbce1
review6
terry1purcell dc10c79
review7
terry1purcell 48373d1
review8
terry1purcell 2121e81
review9
terry1purcell 0b4ba84
review10
terry1purcell 69b1497
Merge branch 'pingcap:master' into fts
terry1purcell 04b00aa
Merge branch 'master' into fts
terry1purcell ab11dda
refactor
terry1purcell c9e0409
Merge branch 'master' into fts
terry1purcell 8ba9a8a
expression: revert fts_match_word arity/impl changes not needed for L…
terry1purcell 19e9dc3
planner, expression: fix four review findings in MATCH...AGAINST LIKE…
terry1purcell 569f3aa
rebase after months of change
terry1purcell 19aba2c
planner: handle optional+excluded boolean FTS terms in LIKE fallback
terry1purcell eb1d412
planner: fix four correctness issues in MATCH...AGAINST LIKE fallback
terry1purcell 4200060
planner/util: update null-reject builtin registry snapshot for match_…
terry1purcell a566282
expression: regenerate builtin thread-safety files for builtinFtsMysq…
terry1purcell afcf285
tests: add match_against to SHOW BUILTINS expected output
terry1purcell d5cfdcb
planner, expression: fix review findings in MATCH...AGAINST LIKE fall…
terry1purcell 4080f42
planner, expression: address review findings in MATCH...AGAINST LIKE …
terry1purcell c22f54c
planner: restrict MATCH...AGAINST LIKE rewrite to predicate contexts
terry1purcell b7733c7
planner: use ILIKE for case-insensitive MATCH...AGAINST LIKE fallback
terry1purcell e603043
planner: fix gofmt comment formatting in fulltext_to_like.go
terry1purcell 1ddbcca
planner: add fts-native alternative round for TiFlash FTS cost compet…
terry1purcell dc2cccb
review updates
terry1purcell 98e97fd
Merge branch 'pingcap:master' into fts
terry1purcell 32b7c04
planner, expression: address review feedback on MATCH...AGAINST LIKE …
terry1purcell bac401f
bazel update
terry1purcell 4aa6f93
*: stop tracking Claude Code runtime state
terry1purcell 49db4da
planner: fix NULL search handling in MATCH...AGAINST LIKE fallback
terry1purcell 3647abe
expression: gate FTSMysqlMatchAgainst Flash pushdown on default modifier
terry1purcell 8803614
cardinality: route constant FTS substitutes to the constants bucket
terry1purcell b0b04c4
expression: emit Constant(NULL) for AGAINST(NULL) in selectivity subs…
terry1purcell aba6e18
planner: move column-type check above NULL fast-path in LIKE fallback
terry1purcell 57f98c4
cardinality: refresh stale Constant(0) comment after Constant(NULL) c…
terry1purcell a18872c
expression: add defensive bounds check before indexing FTS validator …
terry1purcell ed3e7a3
planner: restore FTS cost competition between native and ILIKE plans
terry1purcell
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,295 @@ | ||
| // Copyright 2025 PingCAP, Inc. | ||
| // | ||
| // Licensed under the Apache License, Version 2.0 (the "License"); | ||
| // you may not use this file except in compliance with the License. | ||
| // You may obtain a copy of the License at | ||
| // | ||
| // http://www.apache.org/licenses/LICENSE-2.0 | ||
| // | ||
| // Unless required by applicable law or agreed to in writing, software | ||
| // distributed under the License is distributed on an "AS IS" BASIS, | ||
| // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| // See the License for the specific language governing permissions and | ||
| // limitations under the License. | ||
|
|
||
| package core | ||
|
|
||
| import ( | ||
| "strings" | ||
|
|
||
| "github.com/pingcap/tidb/pkg/expression" | ||
| "github.com/pingcap/tidb/pkg/parser/ast" | ||
| "github.com/pingcap/tidb/pkg/parser/mysql" | ||
| "github.com/pingcap/tidb/pkg/types" | ||
| ) | ||
|
|
||
| // searchTerm represents a single term in a Boolean fulltext search query | ||
| type searchTerm struct { | ||
| word string | ||
| isRequired bool // Has '+' prefix | ||
| isExcluded bool // Has '-' prefix | ||
| isPrefixMatch bool // Has '*' suffix | ||
| isPhrase bool // Wrapped in quotes | ||
| } | ||
|
|
||
| // parseBooleanSearchString parses a Boolean mode search string into individual terms | ||
| func parseBooleanSearchString(text string) []searchTerm { | ||
| var terms []searchTerm | ||
| var current strings.Builder | ||
| inQuote := false | ||
| i := 0 | ||
|
|
||
| for i < len(text) { | ||
| ch := text[i] | ||
|
|
||
| switch ch { | ||
| case '"': | ||
| if inQuote { | ||
| // End of phrase | ||
| phrase := current.String() | ||
| if phrase != "" { | ||
| terms = append(terms, searchTerm{ | ||
| word: phrase, | ||
| isPhrase: true, | ||
| }) | ||
| } | ||
| current.Reset() | ||
| inQuote = false | ||
| } else { | ||
| // Start of phrase | ||
| inQuote = true | ||
| } | ||
| i++ | ||
| case ' ', '\t', '\n', '\r': | ||
| if inQuote { | ||
| current.WriteByte(ch) | ||
| } else if current.Len() > 0 { | ||
| // End of word | ||
| word := current.String() | ||
| terms = append(terms, parseSearchTerm(word)) | ||
| current.Reset() | ||
| } | ||
| i++ | ||
| default: | ||
| current.WriteByte(ch) | ||
| i++ | ||
| } | ||
| } | ||
|
|
||
| // Handle remaining content | ||
| if current.Len() > 0 { | ||
| if inQuote { | ||
| // Unclosed quote, treat as phrase | ||
| terms = append(terms, searchTerm{ | ||
| word: current.String(), | ||
| isPhrase: true, | ||
| }) | ||
| } else { | ||
| word := current.String() | ||
| terms = append(terms, parseSearchTerm(word)) | ||
| } | ||
| } | ||
|
|
||
| return terms | ||
| } | ||
|
|
||
| // parseSearchTerm parses a single search term (not in quotes) and extracts operators | ||
| func parseSearchTerm(word string) searchTerm { | ||
| if word == "" { | ||
| return searchTerm{} | ||
| } | ||
|
|
||
| term := searchTerm{word: word} | ||
|
|
||
| // Check for leading operators | ||
| if word[0] == '+' { | ||
| term.isRequired = true | ||
| word = word[1:] | ||
| } else if word[0] == '-' { | ||
| term.isExcluded = true | ||
| word = word[1:] | ||
| } | ||
|
|
||
| // Check for trailing wildcard | ||
| if len(word) > 0 && word[len(word)-1] == '*' { | ||
| term.isPrefixMatch = true | ||
| word = word[:len(word)-1] | ||
| } | ||
|
|
||
| term.word = word | ||
| return term | ||
| } | ||
|
|
||
| // convertMatchAgainstToLike converts a MATCH...AGAINST expression to LIKE predicates | ||
| func (er *expressionRewriter) convertMatchAgainstToLike( | ||
| columns []expression.Expression, | ||
| searchText string, | ||
| modifier ast.FulltextSearchModifier, | ||
| ) (expression.Expression, error) { | ||
| if len(columns) == 0 { | ||
| return nil, expression.ErrNotSupportedYet.GenWithStackByArgs("MATCH...AGAINST with no columns") | ||
| } | ||
|
|
||
| if searchText == "" { | ||
| // Empty search string matches nothing | ||
| return &expression.Constant{ | ||
| Value: types.NewIntDatum(0), | ||
| RetType: types.NewFieldType(mysql.TypeTiny), | ||
| }, nil | ||
| } | ||
|
|
||
| var columnPredicates []expression.Expression | ||
|
|
||
| if modifier.IsBooleanMode() { | ||
| // Parse Boolean mode search string | ||
| terms := parseBooleanSearchString(searchText) | ||
| if len(terms) == 0 { | ||
| return &expression.Constant{ | ||
| Value: types.NewIntDatum(0), | ||
| RetType: types.NewFieldType(mysql.TypeTiny), | ||
| }, nil | ||
| } | ||
|
|
||
| // Group terms by type | ||
| var required, excluded, optional []searchTerm | ||
| for _, term := range terms { | ||
| if term.word == "" { | ||
| continue | ||
| } | ||
| if term.isRequired { | ||
| required = append(required, term) | ||
| } else if term.isExcluded { | ||
| excluded = append(excluded, term) | ||
| } else { | ||
| optional = append(optional, term) | ||
| } | ||
| } | ||
|
|
||
| // Build predicates for each column | ||
| for _, column := range columns { | ||
| var predicates []expression.Expression | ||
|
|
||
| // AND all required terms | ||
| for _, term := range required { | ||
| pred, err := er.buildLikePredicate(column, term.word, false, term.isPrefixMatch, term.isPhrase) | ||
| if err != nil { | ||
| return nil, err | ||
| } | ||
| predicates = append(predicates, pred) | ||
| } | ||
|
|
||
| // AND NOT all excluded terms | ||
| for _, term := range excluded { | ||
| pred, err := er.buildLikePredicate(column, term.word, true, term.isPrefixMatch, term.isPhrase) | ||
| if err != nil { | ||
| return nil, err | ||
| } | ||
| predicates = append(predicates, pred) | ||
| } | ||
|
|
||
| // OR all optional terms (if any) | ||
| if len(optional) > 0 { | ||
| var optionalPreds []expression.Expression | ||
| for _, term := range optional { | ||
| pred, err := er.buildLikePredicate(column, term.word, false, term.isPrefixMatch, term.isPhrase) | ||
| if err != nil { | ||
| return nil, err | ||
| } | ||
| optionalPreds = append(optionalPreds, pred) | ||
| } | ||
| if len(optionalPreds) > 0 { | ||
| predicates = append(predicates, expression.ComposeDNFCondition(er.sctx, optionalPreds...)) | ||
| } | ||
| } | ||
|
|
||
| // If we have any predicates for this column, combine them with AND | ||
| if len(predicates) > 0 { | ||
| columnPredicates = append(columnPredicates, expression.ComposeCNFCondition(er.sctx, predicates...)) | ||
| } | ||
| } | ||
| } else { | ||
| // Natural Language Mode: split into words and OR them together | ||
| words := strings.Fields(searchText) | ||
| if len(words) == 0 { | ||
| return &expression.Constant{ | ||
| Value: types.NewIntDatum(0), | ||
| RetType: types.NewFieldType(mysql.TypeTiny), | ||
| }, nil | ||
| } | ||
|
|
||
| for _, column := range columns { | ||
| var wordPredicates []expression.Expression | ||
| for _, word := range words { | ||
| pred, err := er.buildLikePredicate(column, word, false, false, false) | ||
| if err != nil { | ||
| return nil, err | ||
| } | ||
| wordPredicates = append(wordPredicates, pred) | ||
| } | ||
| if len(wordPredicates) > 0 { | ||
| columnPredicates = append(columnPredicates, expression.ComposeDNFCondition(er.sctx, wordPredicates...)) | ||
| } | ||
| } | ||
| } | ||
|
|
||
| // OR across all columns | ||
| if len(columnPredicates) == 0 { | ||
| return &expression.Constant{ | ||
| Value: types.NewIntDatum(0), | ||
| RetType: types.NewFieldType(mysql.TypeTiny), | ||
| }, nil | ||
| } | ||
|
|
||
| return expression.ComposeDNFCondition(er.sctx, columnPredicates...), nil | ||
| } | ||
|
|
||
| // buildLikePredicate builds a single LIKE predicate for a column and search term | ||
| func (er *expressionRewriter) buildLikePredicate( | ||
| column expression.Expression, | ||
| term string, | ||
| isNegated bool, | ||
| isPrefixMatch bool, | ||
| isPhrase bool, | ||
| ) (expression.Expression, error) { | ||
| // Build the pattern | ||
| var pattern string | ||
| if isPhrase { | ||
| // Exact phrase: %term% | ||
| pattern = "%" + term + "%" | ||
| } else if isPrefixMatch { | ||
| // Prefix match: term% | ||
| pattern = term + "%" | ||
| } else { | ||
| // General match: %term% | ||
| pattern = "%" + term + "%" | ||
| } | ||
|
|
||
| // Create constant for pattern | ||
| patternConst := &expression.Constant{ | ||
| Value: types.NewStringDatum(pattern), | ||
| RetType: types.NewFieldType(mysql.TypeVarchar), | ||
| } | ||
|
|
||
| // Create escape constant (backslash = 92) | ||
| escapeConst := &expression.Constant{ | ||
| Value: types.NewIntDatum(92), | ||
| RetType: types.NewFieldType(mysql.TypeTiny), | ||
| } | ||
|
|
||
| // Build LIKE function | ||
| likeFunc, err := er.newFunction(ast.Like, types.NewFieldType(mysql.TypeTiny), column, patternConst, escapeConst) | ||
| if err != nil { | ||
| return nil, err | ||
| } | ||
|
|
||
| // Apply NOT if needed | ||
| if isNegated { | ||
| notFunc, err := er.newFunction(ast.UnaryNot, types.NewFieldType(mysql.TypeTiny), likeFunc) | ||
| if err != nil { | ||
| return nil, err | ||
| } | ||
| return notFunc, nil | ||
| } | ||
|
|
||
| return likeFunc, nil | ||
| } | ||
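In the diff above, `buildLikePredicate` concatenates the raw search term into the pattern, so a term containing `%`, `_`, or `\` would be read as LIKE metacharacters rather than literal text. A minimal sketch of one way to guard against that follows. `escapeLikeTerm` and `buildPattern` are hypothetical names introduced only for illustration and are not taken from the PR; the sketch assumes the backslash escape character that the diff passes as `escapeConst` (92).

```go
package main

import "strings"

// escapeLikeTerm is a hypothetical helper (not part of the diff shown above).
// It prefixes the LIKE metacharacters '%', '_' and the escape character '\'
// with a backslash so the term is matched literally.
// Iterating by byte is safe for UTF-8 input because the three bytes being
// escaped are ASCII and never occur inside a multi-byte sequence.
func escapeLikeTerm(term string) string {
	var b strings.Builder
	b.Grow(len(term))
	for i := 0; i < len(term); i++ {
		c := term[i]
		if c == '\\' || c == '%' || c == '_' {
			b.WriteByte('\\')
		}
		b.WriteByte(c)
	}
	return b.String()
}

// buildPattern is a hypothetical stand-in for the pattern-building branches
// of buildLikePredicate: phrases and plain words become %term%, and a
// non-phrase prefix match becomes term%.
func buildPattern(term string, isPrefixMatch, isPhrase bool) string {
	escaped := escapeLikeTerm(term)
	if isPrefixMatch && !isPhrase {
		return escaped + "%"
	}
	return "%" + escaped + "%"
}
```

With this sketch, `buildPattern("50%", false, false)` yields `%50\%%`, so only the outer wildcards stay active. Whether the merged PR escapes terms this way is not visible from the diff shown here.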
The nil check chain at lines 2225-2226 could potentially panic if any of the intermediate values are nil. While the fallback to 'like' at line 2228 provides a default, the chained dereferencing (er.planCtx.builder.ctx.GetSessionVars()) could panic if 'builder' or 'ctx' is nil, before reaching the else block. Consider checking each level individually or restructuring the condition to be safer.
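The level-by-level check suggested in the comment above could look roughly like the sketch below. The code at lines 2225-2228 is not part of the diff shown here, so every type in the sketch is a stand-in that only mimics the shape of the `er.planCtx.builder.ctx.GetSessionVars()` chain. The `useILike` field and the `likeFuncName` method are assumptions made for illustration, not TiDB identifiers.

```go
package main

// Stand-in types: they imitate the shape of the chain named in the review
// comment and are NOT the real TiDB definitions.
type sessionVars struct {
	useILike bool // hypothetical switch; the real condition is not shown in the diff
}

type sessionCtx struct{ vars *sessionVars }

// GetSessionVars mirrors the accessor name used in the review comment.
func (c *sessionCtx) GetSessionVars() *sessionVars { return c.vars }

type planBuilder struct{ ctx *sessionCtx }
type planCtx struct{ builder *planBuilder }
type rewriter struct{ planCtx *planCtx }

// likeFuncName checks each level of the chain before dereferencing it and
// falls back to "like" whenever any link is missing.
func (er *rewriter) likeFuncName() string {
	if er == nil || er.planCtx == nil || er.planCtx.builder == nil || er.planCtx.builder.ctx == nil {
		return "like"
	}
	vars := er.planCtx.builder.ctx.GetSessionVars()
	if vars == nil || !vars.useILike {
		return "like"
	}
	return "ilike"
}
```

Go's `||` short-circuits left to right, so each link is tested before the next one is dereferenced and the default is reached without a panic.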