expression: fix isBinCollation to recognize utf8mb4_0900_bin #68846
tiancaiamao wants to merge 2 commits into
…Collation

The isBinCollation function did not include utf8mb4_0900_bin (MySQL 8.0 binary collation, ID 309). When inferCollation aggregates a column using utf8mb4_0900_bin with another non-bin utf8mb4 collation, the result incorrectly degrades to CoercibilityNone instead of keeping the _bin collation. This causes subsequent aggregation with a binary/blob column to produce a from_binary() cast that fails on non-UTF-8 bytes (ERROR 3854).

Fix: add charset.CollationUTF8MB40900Bin to isBinCollation and add regression tests covering the two-column and three-column cases.

Closes #68845
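The degradation described in the commit message can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions, not the TiDB source: `aggregate`, `binBeforeFix`, and `binAfterFix` are hypothetical names, and the model covers only the case of two different collations of the same charset at equal coercibility, where a collation wins if it alone is recognized as `_bin`.

```go
package main

import "fmt"

// binBeforeFix is a hypothetical stand-in for the pre-fix predicate:
// it knows utf8mb4_bin but not utf8mb4_0900_bin.
func binBeforeFix(c string) bool {
	return c == "utf8mb4_bin"
}

// binAfterFix additionally treats utf8mb4_0900_bin as a _bin collation.
func binAfterFix(c string) bool {
	return c == "utf8mb4_bin" || c == "utf8mb4_0900_bin"
}

// aggregate is a simplified model of one aggregation step between two
// collations of the same charset and equal coercibility. It returns the
// chosen collation and true, or "" and false when no collation can be
// chosen (the "CoercibilityNone" outcome in the PR text).
func aggregate(a, b string, isBin func(string) bool) (string, bool) {
	binA, binB := isBin(a), isBin(b)
	switch {
	case a == b:
		return a, true
	case binA && !binB:
		return a, true // the _bin collation wins
	case binB && !binA:
		return b, true // the _bin collation wins
	default:
		return "", false // neither side is preferred: degrade
	}
}

func main() {
	a, b := "utf8mb4_0900_bin", "utf8mb4_unicode_ci"

	c, ok := aggregate(a, b, binBeforeFix)
	fmt.Printf("before fix: collation=%q resolved=%v\n", c, ok)

	c, ok = aggregate(a, b, binAfterFix)
	fmt.Printf("after fix:  collation=%q resolved=%v\n", c, ok)
}
```

Passing the predicate as a parameter makes the before/after difference visible without changing the aggregation logic itself, which mirrors the shape of the fix: only the predicate changes.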
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## master #68846 +/- ##
================================================
- Coverage 76.3105% 75.3104% -1.0001%
================================================
Files 2041 2025 -16
Lines 563393 568815 +5422
================================================
- Hits 429928 428377 -1551
- Misses 132549 140347 +7798
+ Partials 916 91 -825
What problem does this PR solve?

Issue Number: close #68845

Problem Summary:

CONCAT_WS over a utf8mb4_0900_bin column + a utf8mb4_unicode_ci column + a mediumblob column fails with ERROR 3854 (HY000): Cannot convert string from binary to utf8mb4. The root cause is that isBinCollation() in pkg/expression/collation.go does not recognize utf8mb4_0900_bin as a binary collation, causing inferCollation to degrade to CoercibilityNone instead of keeping the _bin collation. This leads to an incorrect from_binary() cast on the blob.

What changed and how does it work?

- Add the CollationUTF8MB40900Bin = "utf8mb4_0900_bin" constant to pkg/parser/charset/charset.go.
- Add charset.CollationUTF8MB40900Bin to the isBinCollation() check in pkg/expression/collation.go.
- Add regression tests to pkg/expression/collation_test.go:
  - utf8mb4_0900_bin vs utf8mb4_unicode_ci (both orders) → _bin wins
  - Three-column case (utf8mb4_0900_bin + utf8mb4_unicode_ci + binary) → binary wins without from_binary() cast

Check List

Tests
Side effects
Documentation
Release note
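The shape of the isBinCollation() change described in this PR can be sketched as a stand-alone program. This is an illustration under assumptions rather than the actual contents of pkg/expression/collation.go: only `CollationUTF8MB40900Bin` and its value come from the PR text, while the other collation names in the set and the switch-based layout are assumed for the example.

```go
package main

import "fmt"

// CollationUTF8MB40900Bin is the constant named in the PR description.
const CollationUTF8MB40900Bin = "utf8mb4_0900_bin"

// isBinCollation reports whether a collation name is treated as a _bin
// collation. The other names in this set are assumptions for the sketch;
// the real function may check a different list.
func isBinCollation(collate string) bool {
	switch collate {
	case "ascii_bin", "latin1_bin", "utf8_bin", "utf8mb4_bin", "gbk_bin":
		return true
	case CollationUTF8MB40900Bin: // the entry this PR adds
		return true
	}
	return false
}

func main() {
	for _, c := range []string{
		"utf8mb4_bin",
		"utf8mb4_0900_bin",
		"utf8mb4_unicode_ci",
		"utf8mb4_0900_ai_ci",
	} {
		fmt.Printf("%-20s -> %v\n", c, isBinCollation(c))
	}
}
```

Keeping the new entry in its own `case` arm is purely for readability here; a single comparison chain or a lookup set would behave the same way.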