Proposal: GPU buffer loader for IndexProvider integration #179

@cluster2600

Motivation

GPU backends (FAISS GPU, cuVS, Metal) need contiguous float32 buffers, but zvec stores vectors in IndexProvider segments. Currently there's no bridge to stream vectors from storage into GPU-ready memory.

Proposed approach

A new C++ class GpuBufferLoader (src/ailego/gpu/gpu_buffer_loader.h) that:

  1. Iterates over an IndexProvider via its existing Iterator API
  2. Deserializes ForwardData into float32 vectors
  3. Fills a contiguous host buffer suitable for GPU upload
  4. Reports progress (vector count, bytes loaded)

IndexProvider → Iterator → ForwardData → float32 buffer → GPU
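
To make the four steps concrete, here is a rough sketch of what such a loader could look like. It is not the code in the draft implementation. The real `IndexProvider::Iterator` and `ForwardData` signatures are not shown in this issue, so `VectorIterator`, `InMemoryIterator`, `LoadProgress`, `pack_floats`, and the assumption that each record is a raw float32 blob in host byte order are all hypothetical stand-ins.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for IndexProvider::Iterator; the real zvec
// interface may differ.
class VectorIterator {
 public:
  virtual ~VectorIterator() = default;
  virtual bool is_valid() const = 0;
  virtual void next() = 0;
  // Serialized vector payload (assumed: raw float32 bytes, host byte order).
  virtual const std::string &data() const = 0;
};

// Toy in-memory iterator used only to exercise the loader.
class InMemoryIterator : public VectorIterator {
 public:
  explicit InMemoryIterator(std::vector<std::string> rows)
      : rows_(std::move(rows)) {}
  bool is_valid() const override { return pos_ < rows_.size(); }
  void next() override { ++pos_; }
  const std::string &data() const override { return rows_[pos_]; }

 private:
  std::vector<std::string> rows_;
  std::size_t pos_ = 0;
};

// Step 4: progress counters (hypothetical shape).
struct LoadProgress {
  std::size_t vectors_loaded = 0;
  std::size_t bytes_loaded = 0;
};

class GpuBufferLoader {
 public:
  explicit GpuBufferLoader(std::size_t dimension) : dimension_(dimension) {
    assert(dimension_ > 0);
  }

  // Steps 1-3: drain the iterator into one contiguous, row-major float32
  // host buffer. Returns false if a record has an unexpected size.
  bool load(VectorIterator &it) {
    const std::size_t row_bytes = dimension_ * sizeof(float);
    while (it.is_valid()) {
      const std::string &raw = it.data();
      if (raw.size() != row_bytes) return false;
      const std::size_t offset = buffer_.size();
      buffer_.resize(offset + dimension_);
      std::memcpy(buffer_.data() + offset, raw.data(), row_bytes);
      progress_.vectors_loaded += 1;
      progress_.bytes_loaded += row_bytes;
      it.next();
    }
    return true;
  }

  const std::vector<float> &buffer() const { return buffer_; }
  const LoadProgress &progress() const { return progress_; }

 private:
  std::size_t dimension_;
  std::vector<float> buffer_;
  LoadProgress progress_;
};

// Test helper: serialize floats the way the sketch assumes they are stored.
inline std::string pack_floats(const std::vector<float> &v) {
  return std::string(reinterpret_cast<const char *>(v.data()),
                     v.size() * sizeof(float));
}
```

After `load` returns, `buffer().data()` points at `vectors_loaded * dimension` floats laid out row by row, which is the layout a GPU backend would typically expect for a single host-to-device copy.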

This stays within zvec's existing storage architecture — no standalone databases or new storage engines.

The proposal also includes docs/METAL_CPP.md, which documents the Metal kernel architecture and GPU integration points.

Questions for maintainers

  1. Is IndexProvider::Iterator the right abstraction for bulk vector extraction, or is there a better path?
  2. Should the buffer loader live under src/ailego/gpu/ or somewhere else?
  3. Any concerns about memory usage for large collections? (Current approach loads everything into host RAM before GPU upload — we could add batched streaming.)
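
For question 3, a rough sketch of what batched streaming might look like, so the trade-off is easier to discuss. As an illustrative scale, 10 million 768-dimensional float32 vectors need 10,000,000 × 768 × 4 bytes, about 30.7 GB of host RAM if loaded all at once; a fixed-size staging buffer bounds that to one batch. `NextRow`, `UploadBatch`, `host_bytes`, and `stream_in_batches` are hypothetical names for this sketch only, not part of zvec or the draft.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical callbacks; neither is part of zvec's API.
// NextRow writes one vector of `dimension` floats into `dst` and returns
// false (without writing) when the source is exhausted.
using NextRow = std::function<bool(float *dst)>;
// UploadBatch receives a full or partial batch; a real backend would copy
// it to device memory here.
using UploadBatch = std::function<void(const float *data, std::size_t rows)>;

// Host bytes needed to hold `rows` float32 vectors of size `dimension`.
inline std::size_t host_bytes(std::size_t rows, std::size_t dimension) {
  return rows * dimension * sizeof(float);
}

// Streams all rows through one reusable staging buffer holding at most
// `batch_rows` vectors. Returns the total number of rows uploaded.
inline std::size_t stream_in_batches(std::size_t dimension,
                                     std::size_t batch_rows,
                                     const NextRow &next_row,
                                     const UploadBatch &upload) {
  assert(dimension > 0 && batch_rows > 0);
  std::vector<float> staging(batch_rows * dimension);
  std::size_t total = 0;
  std::size_t filled = 0;
  while (next_row(staging.data() + filled * dimension)) {
    ++filled;
    if (filled == batch_rows) {  // staging buffer is full: flush it
      upload(staging.data(), filled);
      total += filled;
      filled = 0;
    }
  }
  if (filled > 0) {  // flush the final partial batch
    upload(staging.data(), filled);
    total += filled;
  }
  return total;
}
```

With this shape, peak host memory is `host_bytes(batch_rows, dimension)` regardless of collection size, at the cost of one upload call per batch instead of a single bulk copy.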

Draft implementation: #175