Add bytesarray support to the c-api by bschoenmaeckers · Pull Request #8009 · RustPython/RustPython

bschoenmaeckers · 2026-06-01T21:57:46Z

Yet another AI-assisted batch of c-api functions.

Summary by CodeRabbit

New Features
- Added C-API support for bytearray: create from raw data or Python objects, query size, access mutable buffer, and resize dynamically.
Chores
- Exposed bytearray C-API module and made internal adjustments to enable safe resizing and pointer interoperability.

coderabbitai · 2026-06-01T21:57:58Z

📝 Walkthrough

Adds CPython-compatible C-API bytearray bindings: extern FFI functions for construction, conversion, sizing, buffer access, and resize; makes VM resize public; adds FFI pointer conversion and exports the new module.

Changes

C-API bytearray bindings

Layer / File(s)	Summary
Public resize method and FFI pointer trait `crates/vm/src/builtins/bytearray.rs`, `crates/capi/src/util.rs`	`PyByteArray::resize` is made public for C-API use, and `FfiResult<*mut c_char>` is implemented for `*mut u8` to enable mutable pointer returns in FFI.
C-API bytearray bindings and tests `crates/capi/src/bytearrayobject.rs`	Extern "C" functions implement `PyByteArray_FromStringAndSize` (handles NULL and negative len, allocates/copies buffer), `PyByteArray_FromObject` (converts object to bytes), `PyByteArray_Size`, `PyByteArray_AsString` (returns mutable buffer pointer), and `PyByteArray_Resize` (delegates to VM). A disabled test module is included.
Module export `crates/capi/src/lib.rs`	Adds `pub mod bytearrayobject;` to expose the new C-API module.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

youknowone
ShaharNaveh

🐰 I hopped through bytes with glee,
Rust cords stitched to C,
Buffers stretch and pointers flee,
Bytearrays dance for free,
A tiny rabbit's FFI spree.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'Add bytesarray support to the c-api' clearly and directly summarizes the main change: adding C-API functions for bytearray operations.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/capi/src/bytearrayobject.rs`:
- Around line 21-24: The code path in bytearrayobject.rs that handles
bytes.is_null() calls unsafe set_len which exposes uninitialized memory; change
it to allocate and zero-initialize the buffer instead (e.g., create a Vec<u8>
filled with zeros of length len or allocate then resize/fill with 0) so the
produced PyByteArray matches CPython's zero-filled semantics; update the branch
that currently creates "let mut data = Vec::with_capacity(len); unsafe {
data.set_len(len) }" to produce a zeroed Vec and then remove the
#[allow(clippy::uninit_vec)] since uninitialized vec usage will no longer be
present.
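
As a rough illustration of the zero-filled alternative suggested above, the sketch below builds the initial contents from a C pointer and a length. The function name `initial_contents`, its error type, and the choice to reject negative lengths with an error are assumptions made for this example; it is not the code in `crates/capi/src/bytearrayobject.rs`.

```rust
use std::os::raw::c_char;

/// Hypothetical helper (not the RustPython implementation): builds the
/// initial contents of a bytearray from a C pointer and a signed length.
///
/// # Safety
///
/// If `bytes` is non-null, it must point to at least `len` readable bytes.
pub unsafe fn initial_contents(
    bytes: *const c_char,
    len: isize,
) -> Result<Vec<u8>, &'static str> {
    // A negative length has no meaning for a buffer, so reject it up front.
    if len < 0 {
        return Err("negative size");
    }
    let len = len as usize;

    if bytes.is_null() {
        // Zero-filled allocation: no uninitialised memory is ever exposed,
        // so `set_len` and `#[allow(clippy::uninit_vec)]` are not needed.
        return Ok(vec![0u8; len]);
    }

    // SAFETY: the caller guarantees `bytes` points to `len` readable bytes.
    let src = unsafe { std::slice::from_raw_parts(bytes.cast::<u8>(), len) };
    Ok(src.to_vec())
}
```

With this shape, a NULL pointer yields `len` zero bytes, while a non-null pointer is copied as-is.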

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 7a1301fd-f873-44d2-a8d0-ee7b54ab6bb9

📥 Commits

Reviewing files that changed from the base of the PR and between 885cf5c and 05cec26.

📒 Files selected for processing (4)

crates/capi/src/bytearrayobject.rs
crates/capi/src/lib.rs
crates/capi/src/util.rs
crates/vm/src/builtins/bytearray.rs

youknowone · 2026-06-02T06:33:23Z

+            let mut data = Vec::with_capacity(len);
+            unsafe { data.set_len(len) };


Is this safe? To be safe, the buffer must be written before it is read, and this code doesn't seem to ensure that.

This is unsafe indeed, but it is the responsibility of the caller to write the data before reading it.

See, for example, this snippet in PyO3:
https://github.com/PyO3/pyo3/blob/b2163a0916db9fe1baef3bcd3e224531dce508fe/src/types/bytearray.rs#L78-L97

    pub fn new_with<F>(py: Python<'_>, len: usize, init: F) -> PyResult<Bound<'_, PyByteArray>>
    where
        F: FnOnce(&mut [u8]) -> PyResult<()>,
    {
        unsafe {
            // Allocate buffer and check for an error
            let pybytearray: Bound<'_, Self> =
                ffi::PyByteArray_FromStringAndSize(core::ptr::null(), len as ffi::Py_ssize_t)
                    .assume_owned_or_err(py)?
                    .cast_into_unchecked();
            let buffer: *mut u8 = ffi::PyByteArray_AsString(pybytearray.as_ptr()).cast();
            debug_assert!(!buffer.is_null());
            // Zero-initialise the uninitialised bytearray
            core::ptr::write_bytes(buffer, 0u8, len);
            // (Further) Initialise the bytearray in init
            // If init returns an Err, pypybytearray will automatically deallocate the buffer
            init(core::slice::from_raw_parts_mut(buffer, len)).map(|_| pybytearray)
        }
    }

We could also zero-initialise it ourselves, but then the data is written twice, which is wasteful in my eyes.
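
To make the trade-off concrete, here is a plain-Rust model of the `new_with` shape quoted above, with a `Vec<u8>` standing in for the bytearray. It is a sketch under that assumption, not PyO3 or RustPython code, and the generic error type `E` is an illustrative choice. The buffer is zero-filled before the callback runs, which is the "written twice" cost being discussed, but the callback can never observe uninitialised memory even if it writes only part of the slice.

```rust
/// Plain-Rust model of the `new_with` pattern: allocate, zero-fill, then let
/// the caller overwrite the contents through a mutable slice.
pub fn new_with<F, E>(len: usize, init: F) -> Result<Vec<u8>, E>
where
    F: FnOnce(&mut [u8]) -> Result<(), E>,
{
    // First write: every byte is set to zero.
    let mut buf = vec![0u8; len];
    // Second write: the caller fills in whatever it wants. If `init` fails,
    // `buf` is dropped here and its memory is released.
    init(&mut buf)?;
    Ok(buf)
}
```

Whether the extra zero fill matters in practice depends on the allocator and the buffer size, so it is worth measuring before treating it as a real cost.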

Please add a `# Safety` section to this function about this decision. The section needs to include:

- That this is memory unsafe under certain conditions, with a description of those conditions.
- Why we made this decision.
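
For illustration only, a `# Safety` section along the requested lines might read like the one below. The helpers `alloc_uninit` and `assume_init` are hypothetical and use `MaybeUninit<u8>`, a different technique from the `Vec<u8>` plus `set_len` code in the PR, so that the sketch itself stays free of undefined behaviour.

```rust
use std::mem::{ManuallyDrop, MaybeUninit};

/// Allocates `len` bytes without writing to them (hypothetical helper).
pub fn alloc_uninit(len: usize) -> Vec<MaybeUninit<u8>> {
    let mut data: Vec<MaybeUninit<u8>> = Vec::with_capacity(len);
    // SAFETY: `len` does not exceed the capacity just requested, and
    // `MaybeUninit<u8>` is allowed to hold uninitialised bytes.
    unsafe { data.set_len(len) };
    data
}

/// Converts the buffer into initialised bytes (hypothetical helper).
///
/// # Safety
///
/// This is memory unsafe if any element of `data` has not been written:
/// reading an uninitialised `u8` is undefined behaviour. The caller must
/// write all `data.len()` bytes before calling this function.
///
/// The buffer is left uninitialised on purpose, so that callers who fill it
/// completely do not pay for a redundant zero fill.
pub unsafe fn assume_init(data: Vec<MaybeUninit<u8>>) -> Vec<u8> {
    // Prevent the original Vec from freeing the allocation we are reusing.
    let mut data = ManuallyDrop::new(data);
    // SAFETY: `MaybeUninit<u8>` has the same layout as `u8`; the pointer,
    // length and capacity come from a live `Vec`; and the caller promises
    // that every element has been initialised.
    unsafe {
        Vec::from_raw_parts(data.as_mut_ptr().cast::<u8>(), data.len(), data.capacity())
    }
}
```

The doc comment covers both requested points: the condition under which the function is unsafe, and the reason for accepting that risk.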

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/capi/src/bytearrayobject.rs`:
- Around line 25-34: The C-API requires PyByteArray_AsString to return a buffer
with a trailing NUL at buf[len], but current implementation stores exactly len
bytes in PyBytesInner::elements and returns pointers into that Vec, breaking the
invariant (especially for len==0). Fix by ensuring the backing allocation always
has one extra sentinel byte: change PyBytesInner storage/management to allocate
elements with capacity/len+1 (store logical length separately if needed), update
PyByteArray_FromStringAndSize to allocate len+1 and set elements[len]=0, make
PyByteArray_AsString/borrow_buf_mut return pointer into that buffer
(guaranteeing ptr[len]==0), and update PyByteArray_Resize/resize to preserve or
reallocate the extra terminator (when shrinking keep
elements.len()==logical_len+1 and set terminator; when growing reserve len+1 and
set new terminator). Modify references to PyBytesInner::elements accordingly so
Python-visible length remains unchanged while the internal buffer includes the
extra NUL.
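
The sentinel scheme described in this comment can be sketched with a small standalone type. `NulTerminated` below is a hypothetical stand-in, not RustPython's `PyBytesInner`, and it derives the logical length from the storage length rather than storing it separately, which is only one of the options the comment mentions.

```rust
/// Hypothetical byte buffer that always keeps one extra NUL byte after the
/// logical contents.
pub struct NulTerminated {
    // Invariant: `storage` is never empty and its last byte is 0.
    storage: Vec<u8>,
}

impl NulTerminated {
    /// Copies `data` and appends the sentinel.
    pub fn new(data: &[u8]) -> Self {
        let mut storage = Vec::with_capacity(data.len() + 1);
        storage.extend_from_slice(data);
        storage.push(0); // sentinel at index `len`
        Self { storage }
    }

    /// Logical length, excluding the sentinel.
    pub fn len(&self) -> usize {
        self.storage.len() - 1
    }

    /// Logical contents, excluding the sentinel.
    pub fn as_bytes(&self) -> &[u8] {
        &self.storage[..self.len()]
    }

    /// Pointer to `len() + 1` valid bytes; the byte at offset `len()` is 0.
    pub fn as_mut_ptr(&mut self) -> *mut u8 {
        self.storage.as_mut_ptr()
    }

    /// Grows (zero-filling new bytes) or shrinks, then restores the sentinel.
    pub fn resize(&mut self, new_len: usize) {
        self.storage.resize(new_len + 1, 0);
        self.storage[new_len] = 0;
    }
}
```

Because the storage is never empty, the returned pointer stays valid to read at offset `len()` even when the logical length is zero, which is the `len == 0` case the comment singles out.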

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 500801e0-537b-4045-a294-b1d4580378be

📥 Commits

Reviewing files that changed from the base of the PR and between 05cec26 and d95e102.

📒 Files selected for processing (4)

crates/capi/src/bytearrayobject.rs
crates/capi/src/lib.rs
crates/capi/src/util.rs
crates/vm/src/builtins/bytearray.rs

🚧 Files skipped from review as they are similar to previous changes (3)

crates/capi/src/lib.rs
crates/vm/src/builtins/bytearray.rs
crates/capi/src/util.rs

youknowone

Thanks!

bschoenmaeckers force-pushed the c-api-bytesarray branch from c32543e to 05cec26 Compare June 1, 2026 21:58

coderabbitai Bot reviewed Jun 1, 2026

youknowone requested changes Jun 2, 2026

Add bytesarray support to the c-api

d809ce7

bschoenmaeckers force-pushed the c-api-bytesarray branch from 05cec26 to 210880b Compare June 2, 2026 09:27

Add SAFETY comment

d95e102

bschoenmaeckers force-pushed the c-api-bytesarray branch from 210880b to d95e102 Compare June 2, 2026 09:28

coderabbitai Bot reviewed Jun 2, 2026

youknowone approved these changes Jun 3, 2026

youknowone merged commit 37fd1dd into RustPython:main Jun 3, 2026
26 checks passed

bschoenmaeckers deleted the c-api-bytesarray branch June 3, 2026 09:38

youknowone merged 2 commits into RustPython:main from bschoenmaeckers:c-api-bytesarray
