planner: fix null-reject based FD not-null inference#68050
Conversation
📝 WalkthroughWalkthroughThe pull request fixes FD extraction logic by evaluating null-rejection per column rather than whole-schema basis. This prevents incorrect functional dependency inference when WHERE conditions contain OR clauses. Tests validate the fix with new negative test cases for GROUP BY queries with disjunctive WHERE predicates. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
pkg/planner/util/funcdep_misc.go (1)
44-48: Correctly fixes the over-approximation; minor redundant work per column.The per-column null-rejection check resolves the bug for predicates like
c = 1 OR d = 1— neithercnordis null-rejected individually under a single-column schema, so neither gets marked, which is the intended behavior.Minor inefficiency:
IsNullRejectedcallsexpression.PushDownNoton the predicate on every invocation (seepkg/planner/util/null_misc.go:99), so for a condition referencing K columns, the samePushDownNotis recomputed K times. You could push-down-not once per condition before the inner loop, then callproveNullRejecteddirectly. Not a correctness issue; only worth doing if profiling shows overhead.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/planner/util/funcdep_misc.go` around lines 44 - 48, The loop in funcdep_misc.go repeatedly calls IsNullRejected which internally runs expression.PushDownNot for the same condition multiple times; to avoid redundant work, compute pushed := expression.PushDownNot(condition) once before iterating cols and then call proveNullRejected(p.SCtx(), pushed, expression.NewSchema(col)) (the internal helper used by IsNullRejected) for each col, inserting int(col.UniqueID) when true; keep behavior unchanged otherwise.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@pkg/planner/util/funcdep_misc.go`:
- Around line 44-48: The loop in funcdep_misc.go repeatedly calls IsNullRejected
which internally runs expression.PushDownNot for the same condition multiple
times; to avoid redundant work, compute pushed :=
expression.PushDownNot(condition) once before iterating cols and then call
proveNullRejected(p.SCtx(), pushed, expression.NewSchema(col)) (the internal
helper used by IsNullRejected) for each col, inserting int(col.UniqueID) when
true; keep behavior unchanged otherwise.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 66e0deb0-221c-4698-8b5c-f9786e5f574f
📒 Files selected for processing (4)
pkg/planner/funcdep/extract_fd_test.gopkg/planner/util/funcdep_misc.gotests/integrationtest/r/executor/aggregate.resulttests/integrationtest/t/executor/aggregate.test
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #68050 +/- ##
================================================
- Coverage 77.7722% 77.0215% -0.7507%
================================================
Files 1990 1972 -18
Lines 551424 556914 +5490
================================================
+ Hits 428855 428944 +89
- Misses 121649 127648 +5999
+ Partials 920 322 -598
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
|
/retest |
3 similar comments
|
/retest |
|
/retest |
|
/retest |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: AilinKid, qw4990 The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What problem does this PR solve?
Issue Number: ref #68026
Problem Summary:
ExtractNotNullFromCondswas using a whole-schema null-reject proof for each predicate and thenmarking every referenced column as not null. For predicates like
c = 1 or d = 1, the conditionis null-rejected for the output row, but that does not prove both
canddare individually notnull. That over-strengthened FD derivation and let ONLY_FULL_GROUP_BY accept invalid queries.
What changed and how does it work?
ExtractNotNullFromCondsto check null rejection per referenced column instead of onceagainst the whole schema
pkg/planner/funcdep/extract_fd_test.gotests/integrationtest/t/executor/aggregate.testCheck List
Tests
Side effects
Documentation
Release note
Summary by CodeRabbit
Bug Fixes
GROUP BYclauses underONLY_FULL_GROUP_BYSQL mode to properly reject queries selecting non-aggregated, non-grouped columns that lack functional dependency.Tests
GROUP BYvalidation with complexWHEREpredicates.