Motivation
GPU backends (FAISS GPU, cuVS, Metal) need contiguous float32 buffers, but zvec stores vectors in IndexProvider segments. Currently there's no bridge to stream vectors from storage into GPU-ready memory.
Proposed approach
A new C++ class GpuBufferLoader (src/ailego/gpu/gpu_buffer_loader.h) that:
- Iterates over an IndexProvider via its existing Iterator API
- Deserializes ForwardData into float32 vectors
- Fills a contiguous host buffer suitable for GPU upload
- Reports progress (vector count, bytes loaded)
IndexProvider → Iterator → ForwardData → float32 buffer → GPU
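To make the pipeline above concrete, here is a minimal sketch of what the loader could look like. It is not the draft implementation: `VectorIterator`, the `ForwardData` layout (raw float32 bytes), `LoadProgress`, `InMemoryIterator`, and `make_record` are hypothetical stand-ins invented for illustration, and zvec's real IndexProvider, Iterator, and ForwardData interfaces may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <utility>
#include <vector>

// Hypothetical stand-in for zvec's ForwardData. Assumption: the payload is
// exactly `dimension` float32 values stored as raw bytes in host byte order.
struct ForwardData {
  std::vector<std::uint8_t> bytes;
};

// Hypothetical stand-in for the iterator exposed by IndexProvider.
class VectorIterator {
 public:
  virtual ~VectorIterator() = default;
  virtual bool is_valid() const = 0;
  virtual void next() = 0;
  virtual const ForwardData &data() const = 0;
};

// Progress counters named after the two values listed in the proposal.
struct LoadProgress {
  std::size_t vector_count = 0;
  std::size_t bytes_loaded = 0;
};

class GpuBufferLoader {
 public:
  explicit GpuBufferLoader(std::size_t dimension) : dimension_(dimension) {}

  // Appends every vector from the iterator to one contiguous, row-major
  // float32 host buffer. Returns false if the dimension is zero or a record
  // does not have the expected size.
  bool load(VectorIterator &it) {
    if (dimension_ == 0) return false;
    const std::size_t row_bytes = dimension_ * sizeof(float);
    for (; it.is_valid(); it.next()) {
      const ForwardData &fd = it.data();
      if (fd.bytes.size() != row_bytes) return false;
      const std::size_t offset = buffer_.size();
      buffer_.resize(offset + dimension_);
      std::memcpy(buffer_.data() + offset, fd.bytes.data(), row_bytes);
      progress_.vector_count += 1;
      progress_.bytes_loaded += row_bytes;
    }
    return true;
  }

  const float *data() const { return buffer_.data(); }
  std::size_t size() const { return buffer_.size(); }
  const LoadProgress &progress() const { return progress_; }

 private:
  std::size_t dimension_;
  std::vector<float> buffer_;
  LoadProgress progress_;
};

// In-memory iterator used only to exercise the sketch.
class InMemoryIterator : public VectorIterator {
 public:
  explicit InMemoryIterator(std::vector<ForwardData> records)
      : records_(std::move(records)) {}
  bool is_valid() const override { return pos_ < records_.size(); }
  void next() override { ++pos_; }
  const ForwardData &data() const override { return records_[pos_]; }

 private:
  std::vector<ForwardData> records_;
  std::size_t pos_ = 0;
};

// Packs floats into the assumed ForwardData byte layout.
inline ForwardData make_record(const std::vector<float> &values) {
  ForwardData fd;
  fd.bytes.resize(values.size() * sizeof(float));
  if (!values.empty()) {
    std::memcpy(fd.bytes.data(), values.data(), fd.bytes.size());
  }
  return fd;
}
```

After `load()` returns true, `data()` points at `progress().vector_count * dimension` floats in row-major order, which is the layout a GPU upload call would typically expect. The upload itself (FAISS GPU, cuVS, or Metal) is deliberately left out of the sketch.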
This stays within zvec's existing storage architecture — no standalone databases or new storage engines.
The draft also includes docs/METAL_CPP.md, which documents the Metal kernel architecture and GPU integration points.
Questions for maintainers
- Is IndexProvider::Iterator the right abstraction for bulk vector extraction, or is there a better path?
- Should the buffer loader live under src/ailego/gpu/ or somewhere else?
- Any concerns about memory usage for large collections? (Current approach loads everything into host RAM before GPU upload; we could add batched streaming.)
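For the memory question, one possible shape for batched streaming is sketched below. It is an illustration only: `VectorSource`, `BatchSink`, and `stream_in_batches` are hypothetical names, not part of zvec or the draft. The source callback stands in for IndexProvider iteration plus ForwardData decoding, and the sink stands in for whatever performs the GPU upload.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical source: writes the next vector (`dimension` floats) into `out`
// and returns true, or returns false once the collection is exhausted.
using VectorSource = std::function<bool(float *out)>;

// Hypothetical sink: receives `count` vectors stored contiguously at `data`.
// A real sink would copy this staging region to the device.
using BatchSink = std::function<void(const float *data, std::size_t count)>;

// Streams vectors through a fixed-size staging buffer so host memory stays
// bounded by dimension * batch_vectors * sizeof(float) bytes.
// Returns the total number of vectors handed to the sink.
inline std::size_t stream_in_batches(const VectorSource &source,
                                     const BatchSink &sink,
                                     std::size_t dimension,
                                     std::size_t batch_vectors) {
  if (dimension == 0 || batch_vectors == 0) return 0;
  std::vector<float> staging(dimension * batch_vectors);
  std::size_t total = 0;
  std::size_t filled = 0;
  // `filled` is always below `batch_vectors` here, so the write stays in bounds.
  while (source(staging.data() + filled * dimension)) {
    ++filled;
    if (filled == batch_vectors) {
      sink(staging.data(), filled);  // flush a full batch
      total += filled;
      filled = 0;
    }
  }
  if (filled > 0) {
    sink(staging.data(), filled);  // flush the final partial batch
    total += filled;
  }
  return total;
}
```

As a purely illustrative calculation, 10 million vectors of 768 dimensions occupy 10,000,000 × 768 × 4 bytes, or about 30.7 GB, when loaded all at once. With a batch of 65,536 vectors the staging buffer would be roughly 201 MB. The trade-off is that the sink must finish with the staging buffer (or copy it) before the next batch overwrites it.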
Draft implementation: #175