mykaul commented Jan 9, 2026

Optimize buffer management in _ConnectionIOBuffer to avoid unnecessary byte allocations during high-throughput reads.

  1. Buffer Compaction: Replaced io.BytesIO(buffer.read()) with a getbuffer() slice. Previously, read() allocated a new bytes object for the remaining content before the new BytesIO was created. The new approach initializes the new BytesIO from a zero-copy memoryview slice.

  2. Header Peeking: Replaced getvalue() in _read_frame_header with getbuffer(). This allows inspecting the protocol version and frame length without materializing the entire buffer contents as a new bytes object.
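The compaction change in item 1 can be sketched as follows. This is a minimal illustration, not the driver's code: the function names `compact_with_read` and `compact_with_getbuffer` are hypothetical, and the buffer is assumed to be a plain `io.BytesIO` whose current position marks the start of the unread data.

```python
import io


def compact_with_read(buf: io.BytesIO) -> io.BytesIO:
    """Old pattern: read() builds a bytes object holding the unread tail."""
    return io.BytesIO(buf.read())


def compact_with_getbuffer(buf: io.BytesIO) -> io.BytesIO:
    """New pattern: initialize the new BytesIO from a memoryview slice."""
    pos = buf.tell()
    # getbuffer() exposes the existing storage without copying, and slicing
    # a memoryview does not copy either. Both views are released on exit so
    # the original buffer can be resized or closed afterwards.
    with buf.getbuffer() as view, view[pos:] as tail:
        return io.BytesIO(tail)
```

Both helpers return a fresh `BytesIO` positioned at offset 0 that contains only the unread bytes. How much the second form saves depends on the interpreter version and the workload, so it is worth confirming with a benchmark.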

Pre-review checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • I added relevant tests for new features and bug fixes.
  • All commits compile, pass static checks and pass tests.
  • PR description sums up the changes and reasons why they should be introduced.
  • I have provided docstrings for the public items that I want to introduce.
  • I have adjusted the documentation in ./docs/source/.
  • I added appropriate Fixes: annotations to PR description.

Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
mykaul added the enhancement (New feature or request) label Jan 9, 2026
mykaul marked this pull request as draft January 9, 2026 19:22
mykaul requested a review from Copilot January 9, 2026 19:23
Copilot AI left a comment

Pull request overview

This PR optimizes buffer management in the connection receive path to reduce memory allocations during high-throughput reads. The changes leverage Python's memoryview capabilities to avoid unnecessary byte copies in two critical hot-path operations.

Key Changes:

  • Introduced a _reset_buffer static method that uses getbuffer() slicing instead of read() for buffer compaction
  • Replaced getvalue() with getbuffer() in _read_frame_header for zero-copy header inspection
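A rough sketch of zero-copy header inspection follows. It assumes the 9-byte frame header of CQL native protocol v3 and later (version, flags, stream id, opcode, body length, all big-endian) and a 0x7F mask to drop the direction bit from the version byte. The names `HEADER` and `peek_header` are hypothetical and are not taken from the driver.

```python
import io
import struct

# Assumed header layout: version (1), flags (1), stream id (2),
# opcode (1), body length (4), big-endian, 9 bytes in total.
HEADER = struct.Struct(">BBhBi")


def peek_header(buf: io.BytesIO):
    """Return (version, body_length), or None if the header is incomplete."""
    # getbuffer() gives a view of the buffered bytes without copying them,
    # unlike getvalue(), which returns the whole contents as bytes.
    with buf.getbuffer() as view:
        if len(view) < HEADER.size:
            return None
        version, _flags, _stream, _opcode, length = HEADER.unpack_from(view)
    # The view is released here; only two small integers leave the function.
    return version & 0x7F, length
```

Because the function never calls `read()` or `seek()`, the buffer position is left where it was.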

mykaul added 2 commits January 9, 2026 22:33
Replace BytesIO.read() with direct buffer slicing to eliminate one
intermediate bytes allocation per received message frame.

Changes:
- Use getbuffer() to get memoryview of underlying buffer
- Slice directly at [body_offset:end_pos] instead of seek+read
- Convert memoryview slice to bytes in single operation
- Maintain buffer position tracking for proper reset behavior

Benefits:
- Eliminates one full-frame allocation on hot receive path
- Maintains compatibility with existing protocol decoder

The memoryview slice is converted to bytes and released immediately.
A BytesIO cannot be resized while a view from getbuffer() is still
alive, so releasing it keeps later writes safe while retaining the
allocation savings.
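As an illustration of the slicing approach described in this commit message (not the committed code), the body extraction might look like the sketch below. The helper name `read_body` is hypothetical; `body_offset` and `end_pos` follow the names used in the change list above.

```python
import io


def read_body(buf: io.BytesIO, body_offset: int, end_pos: int) -> bytes:
    """Copy buf[body_offset:end_pos] out as bytes via a memoryview slice."""
    # Both the full view and the slice are released when the block exits,
    # so no export is left alive to block a later resize of buf.
    with buf.getbuffer() as view, view[body_offset:end_pos] as chunk:
        body = bytes(chunk)
    # Keep the stream position consistent with what seek+read would leave.
    buf.seek(end_pos)
    return body
```

If a view were still held when more data arrived, a `write()` that needs to grow the buffer would raise `BufferError`, which is why the views are scoped with `with` here.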

Signed-off-by: Yaniv Kaul <ykaul@scylladb.com>

Introduce a lightweight BytesReader class that provides the same
read() interface as io.BytesIO but operates directly on the input
bytes/memoryview without internal buffering overhead.

Changes:
- Add BytesReader class with __slots__ for memory efficiency
- Replace io.BytesIO(body) with BytesReader(body) in decode_message()
- BytesReader.read() returns slices directly, converting memoryview
  to bytes only when necessary for compatibility

Benefits:
- Eliminates BytesIO's internal buffer allocation and management
- Reduces memory overhead for protocol message decoding
- Works seamlessly with both bytes and memoryview inputs
- Maintains full API compatibility with existing read_* functions

The BytesReader is a minimal implementation focused on the read()
method needed by the protocol decoder. It avoids the overhead of
io.BytesIO's full file-like interface.

Signed-off-by: Yaniv Kaul <ykaul@scylladb.com>
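A reader along the lines described in this commit could look roughly like the sketch below. It is a guess at the shape of such a class rather than the committed implementation: the attribute names are hypothetical, the input is assumed to be bytes or a byte-oriented memoryview, and short reads at the end of the data mirror `io.BytesIO` behaviour, which the real class may handle differently.

```python
class BytesReader:
    """Minimal read()-only cursor over bytes or a memoryview (illustrative)."""

    # __slots__ avoids a per-instance __dict__.
    __slots__ = ("_data", "_pos", "_size")

    def __init__(self, data):
        self._data = data
        self._pos = 0
        self._size = len(data)

    def read(self, n=-1):
        # A negative or missing count means "read everything that is left".
        if n is None or n < 0:
            end = self._size
        else:
            end = min(self._pos + n, self._size)
        chunk = self._data[self._pos:end]
        self._pos = end
        # Slicing bytes already yields bytes; a memoryview slice is
        # converted so callers always receive bytes.
        return chunk if isinstance(chunk, bytes) else bytes(chunk)
```

A decoder that only ever calls `read(n)` can then take `BytesReader(body)` wherever it previously took `io.BytesIO(body)`, for example `struct.unpack(">i", reader.read(4))`.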