fix: read cassette test sources as utf-8 #5750

Open
Ghraven wants to merge 1 commit into pydantic:main from Ghraven:fix-check-cassettes-utf8

Ghraven commented Jun 1, 2026

What changed

  • Read Python test files in scripts/check_cassettes.py with explicit encoding='utf-8' before AST parsing.
  • Added a focused test that parses a VCR-marked test file containing non-ASCII source text.

Problem

Path.read_text() falls back to the platform's default (locale) encoding when no encoding is given. On non-UTF-8 locales, the cassette checker can therefore fail with a decode error before AST parsing, or misread the text, whenever a test file contains non-ASCII characters stored as UTF-8.
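
To illustrate the failure mode, here is a minimal sketch that decodes the same UTF-8 bytes with different codecs. It does not touch scripts/check_cassettes.py; the `ascii` and `cp1252` codecs are only stand-ins (assumptions) for whatever a non-UTF-8 locale might select as its default.

```python
# Sample non-ASCII text, encoded the way the repository stores its sources.
text = "Olá, 世界"
raw = text.encode("utf-8")

# Stand-in for a strict non-UTF-8 default: decoding fails outright.
try:
    raw.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# Stand-in for a permissive single-byte default: no error, but the
# result is mojibake rather than the original text.
mojibake = raw.decode("cp1252")

# Decoding explicitly as UTF-8 recovers the original text on any platform.
round_trip = raw.decode("utf-8")
```

Either outcome is a problem for a checker that parses the source: a hard failure stops the run, and a silent mis-decode changes the string literals that end up in the AST.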

Before/after

Before: the cassette checker depended on the process default encoding when reading test source files.
After: it consistently decodes test source as UTF-8, matching the repository's Python source encoding expectations.

Verification

  • Direct helper verification with a temporary UTF-8 test file containing Olá, 世界
  • Attempted python -m pytest tests/test_check_cassettes.py, but the local lightweight environment lacks optional dependencies required by the repo-level pytest config (an opentelemetry import fails while resolving warning filters), so pytest stopped before collecting the new test.
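
The direct verification described above could look roughly like the sketch below. The helper names `parse_test_source` and `vcr_marked_tests` are hypothetical and are not taken from scripts/check_cassettes.py; the sketch only assumes that the checker reads a test file, parses it with `ast`, and looks for `pytest.mark.vcr` decorators. It requires Python 3.9+ for `ast.unparse`.

```python
import ast
import tempfile
from pathlib import Path

# A VCR-marked test whose source contains non-ASCII text.
SOURCE = '''\
import pytest


@pytest.mark.vcr
def test_greeting():
    assert "Olá, 世界"
'''


def parse_test_source(path: Path) -> ast.Module:
    """Hypothetical helper: read a test file as UTF-8 and parse it."""
    return ast.parse(path.read_text(encoding="utf-8"), filename=str(path))


def vcr_marked_tests(tree: ast.Module) -> list[str]:
    """Hypothetical helper: names of functions decorated with pytest.mark.vcr."""
    names = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            for decorator in node.decorator_list:
                if ast.unparse(decorator).startswith("pytest.mark.vcr"):
                    names.append(node.name)
                    break
    return names


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "test_example.py"
    # Write raw UTF-8 bytes so the file content does not depend on the locale.
    path.write_bytes(SOURCE.encode("utf-8"))
    tree = parse_test_source(path)
    found = vcr_marked_tests(tree)
```

Writing the file with `write_bytes` keeps the fixture independent of the platform encoding, so the only decoding step under test is the explicit `encoding="utf-8"` read.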

github-actions bot added the size: S Small PR (≤100 weighted lines) and chore labels Jun 1, 2026
Labels

chore size: S Small PR (≤100 weighted lines)
