Feature/hnsw vector search#187

Open
DenisovAV wants to merge 2 commits into main from feature/hnsw-vector-search

@DenisovAV (Owner)

No description provided.

Renamed SentencePiece's bundled protobuf namespace from google::protobuf
to google::protobuf_sp to avoid symbol conflicts with MediaPipe's protobuf.

Root cause: Both SentencePiece and MediaPipe exported google::protobuf::*
symbols. At link time, the linker arbitrarily chose one implementation,
causing memory corruption when mismatched vtables were used.

Changes:
- Renamed namespace in port_def.inc: google::protobuf -> google::protobuf_sp
- Updated all 75 protobuf-lite source files with new namespace
- Removed unused protobuf_namespace.h
- Cleaned up podspec preprocessor definitions

Fixes: #184

Implements the Hierarchical Navigable Small World (HNSW) algorithm for
fast approximate nearest neighbor search in VectorStore.

Architecture:
- SQLite remains source of truth (persistence)
- HNSW serves as in-memory cache (fast search)
- Hybrid search: HNSW candidates -> exact similarity recalculation
- Threshold: HNSW used when document count >= 100
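
The hybrid flow described above can be sketched as follows. This is an illustrative Python sketch rather than the PR's Dart code: `hybrid_search`, `candidate_ids_fn`, and `over_fetch` are hypothetical names, and the only value taken from the description is the threshold of 100 documents.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

HNSW_THRESHOLD = 100  # from the PR description: HNSW is used at >= 100 documents


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Exact cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def brute_force_search(
    query: Sequence[float],
    docs: Dict[str, Sequence[float]],
    top_k: int,
) -> List[Tuple[str, float]]:
    """Score every document exactly and return the top_k best."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in docs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def hybrid_search(
    query: Sequence[float],
    docs: Dict[str, Sequence[float]],
    top_k: int,
    candidate_ids_fn: Callable[[Sequence[float], int], List[str]],
    over_fetch: int = 3,  # hypothetical over-fetch factor, not taken from the PR
) -> List[Tuple[str, float]]:
    """Below the threshold, scan everything; otherwise rerank index candidates."""
    if len(docs) < HNSW_THRESHOLD:
        return brute_force_search(query, docs, top_k)
    # candidate_ids_fn stands in for the approximate (HNSW) index lookup.
    candidate_ids = candidate_ids_fn(query, top_k * over_fetch)
    candidates = {doc_id: docs[doc_id] for doc_id in candidate_ids if doc_id in docs}
    # Recompute exact similarity so final scores do not depend on the index.
    return brute_force_search(query, candidates, top_k)
```

Reranking with the exact metric means the approximate index only decides which documents are considered, not how they are scored.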

Changes:
- Added local_hnsw dependency (pure Dart, cross-platform)
- Created HnswVectorIndex wrapper with add/search/rebuild/clear
- Added Pigeon methods: getAllDocumentsWithEmbeddings, getDocumentsByIds
- Implemented native methods in Android (Kotlin), iOS (Swift), Web (JS)
- Updated MobileVectorStoreRepository and WebVectorStoreRepository
- Added 21 unit tests for HnswVectorIndex
- Added 5 HNSW vs brute-force parity tests
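
One common way to express a parity check between an approximate index and brute-force search is recall@k. The helper below is a hypothetical Python sketch of that metric; it is not taken from `test/vector_store_parity_test.dart`, and the actual tests may assert something different.

```python
from typing import Sequence


def recall_at_k(approx_ids: Sequence[str], exact_ids: Sequence[str], k: int) -> float:
    """Fraction of the exact top-k results that the approximate search also returned."""
    if k <= 0:
        raise ValueError("k must be positive")
    exact_top = set(exact_ids[:k])
    if not exact_top:
        return 1.0  # nothing to find, so nothing was missed
    found = set(approx_ids[:k]) & exact_top
    return len(found) / len(exact_top)
```

A parity test could then require, for example, that recall stays above a chosen bound on a fixed set of random vectors; the bound itself is a project decision.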

Performance:
- Search complexity: roughly O(log n) expected for HNSW vs O(n) for brute-force
- Index rebuilt on initialize() from SQLite data
- Documents synced to both SQLite and HNSW on add
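
The "SQLite as source of truth, index as cache" lifecycle can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `documents` table layout, storing embeddings as JSON text, and the `CachedVectorStore` class are hypothetical, and a plain dict stands in for the in-memory HNSW index.

```python
import json
import sqlite3
from typing import Dict, List, Sequence


class CachedVectorStore:
    """Durable rows live in SQLite; the in-memory index is rebuilt from them."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.index: Dict[str, List[float]] = {}  # stand-in for the HNSW index
        conn.execute(
            "CREATE TABLE IF NOT EXISTS documents ("
            "id TEXT PRIMARY KEY, content TEXT NOT NULL, embedding TEXT NOT NULL)"
        )

    def initialize(self) -> None:
        """Drop the cache and rebuild it from every persisted row."""
        self.index.clear()
        for doc_id, embedding_json in self.conn.execute(
            "SELECT id, embedding FROM documents"
        ):
            self.index[doc_id] = json.loads(embedding_json)

    def add_document(self, doc_id: str, content: str, embedding: Sequence[float]) -> None:
        """Persist first, then update the cache, so the cache never runs ahead."""
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT OR REPLACE INTO documents (id, content, embedding) "
                "VALUES (?, ?, ?)",
                (doc_id, content, json.dumps(list(embedding))),
            )
        self.index[doc_id] = list(embedding)
```

Writing to SQLite before touching the cache means a failed insert leaves the index unchanged, and a lost cache can always be recovered by calling `initialize()` again.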

Copilot AI left a comment

Pull request overview

Adds an in-memory HNSW index on top of the existing SQLite-backed VectorStore to speed up similarity search, plus new cross-platform APIs to bulk-load embeddings (for rebuild) and fetch documents by ID (for candidate hydration).

Changes:

  • Introduce HnswVectorIndex (Dart) and integrate it into web + mobile repositories with rebuild-on-initialize and hybrid search flow.
  • Add new platform APIs: getAllDocumentsWithEmbeddings and getDocumentsByIds across Web (JS worker), Android (Kotlin), and iOS (Swift) + Pigeon plumbing.
  • Adjust iOS sentencepiece/protobuf-lite namespace to avoid symbol conflicts (protobuf → protobuf_sp), and add HNSW-focused tests.
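
Candidate hydration by ID, as mentioned above, might look like the sketch below. It is a hypothetical Python illustration, not the Kotlin, Swift, or JS implementation from this PR: the `documents` table and its columns are assumed, and only the standard `sqlite3` module is used.

```python
import sqlite3
from typing import List, Sequence, Tuple


def get_documents_by_ids(
    conn: sqlite3.Connection, ids: Sequence[str]
) -> List[Tuple[str, str]]:
    """Fetch (id, content) rows for the given IDs, keeping the caller's order."""
    if not ids:
        return []
    # One bound parameter per ID; values are never interpolated into the SQL text.
    placeholders = ",".join("?" for _ in ids)
    rows = conn.execute(
        f"SELECT id, content FROM documents WHERE id IN ({placeholders})",
        list(ids),
    ).fetchall()
    by_id = dict(rows)
    # IN (...) does not guarantee row order, so restore the candidate ranking here.
    return [(doc_id, by_id[doc_id]) for doc_id in ids if doc_id in by_id]
```

SQLite limits the number of bound parameters per statement, so a very large candidate list would need to be fetched in chunks.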

Reviewed changes

Copilot reviewed 97 out of 102 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| web/sqlite_vector_store.js | Web proxy: add worker fetch-via-Blob and expose new worker methods for HNSW rebuild/hydration. |
| web/rag/sqlite_vector_store_worker.js | Worker: implement getAllDocumentsWithEmbeddings + getDocumentsByIds and wire message handler cases. |
| web/rag/sqlite_vector_store.js | Web proxy (rag): expose getAllDocumentsWithEmbeddings + getDocumentsByIds. |
| test/vector_store_parity_test.dart | Add parity tests comparing HNSW results vs brute-force cosine similarity. |
| test/hnsw_index_test.dart | New unit tests for HnswVectorIndex behaviors (add/search/rebuild/remove/threshold). |
| pubspec.yaml | Add local_hnsw dependency. |
| pubspec.lock | Lockfile update for local_hnsw. |
| pigeon.dart | Add Pigeon APIs + DocumentWithEmbedding model for HNSW rebuild. |
| lib/web/vector_store_web.dart | Extend JS interop with new methods + Dart-friendly parsers. |
| lib/pigeon.g.dart | Regenerated Pigeon Dart bindings for new APIs/types. |
| lib/core/infrastructure/web_vector_store_repository.dart | Integrate HNSW cache layer and hybrid search flow on web. |
| lib/core/infrastructure/mobile_vector_store_repository.dart | Integrate HNSW cache layer and hybrid search flow on mobile. |
| lib/core/infrastructure/hnsw_vector_index.dart | New HNSW wrapper around local_hnsw with over-fetch + rerank. |
| ios/flutter_gemma.podspec | Update protobuf conflict handling approach (namespace rename moved into sources). |
| ios/Classes/sentencepiece/third_party/protobuf-lite/zero_copy_stream_impl_lite.cc | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/zero_copy_stream_impl.cc | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/zero_copy_stream.cc | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/wire_format_lite.cc | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/time.cc | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/strutil.cc | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/structurally_valid.cc | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/stringprintf.cc | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/stringpiece.cc | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/statusor.cc | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/status.cc | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/repeated_field.cc | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/parse_context.cc | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/message_lite.cc | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/io_win32.cc | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/int128.cc | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/implicit_weak_message.cc | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/wire_format_lite.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/unknown_field_set.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/stubs/time.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/stubs/strutil.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/stubs/stringprintf.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/stubs/stringpiece.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/stubs/stl_util.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/stubs/statusor.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/stubs/status.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/stubs/port.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/stubs/once.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/stubs/mutex.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/stubs/map_util.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/stubs/macros.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/stubs/logging.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/stubs/int128.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/stubs/hash.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/stubs/common.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/stubs/casts.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/stubs/callback.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/stubs/bytestream.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/repeated_field.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/port_def.inc | Change PROTOBUF_NAMESPACE macros to google::protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/parse_context.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/metadata_lite.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/message_lite.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/map_type_handler.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/map_field_lite.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/map_entry_lite.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/map.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/io/zero_copy_stream_impl_lite.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/io/zero_copy_stream_impl.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/io/zero_copy_stream.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/io/io_win32.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/io/coded_stream.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/implicit_weak_message.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/has_bits.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/generated_message_util.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/generated_message_table_driven_lite.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/generated_message_table_driven.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/generated_enum_util.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/generated_enum_reflection.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/extension_set_inl.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/extension_set.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/descriptor.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/arenastring.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/arena_impl.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/arena.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/google/protobuf/any.h | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/generated_message_util.cc | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/generated_message_table_driven_lite.cc | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/generated_enum_util.cc | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/extension_set.cc | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/common.cc | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/coded_stream.cc | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/bytestream.cc | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/arenastring.cc | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/third_party/protobuf-lite/arena.cc | Rename protobuf namespace to protobuf_sp. |
| ios/Classes/sentencepiece/src/init.cc | Update shutdown call to google::protobuf_sp::ShutdownProtobufLibrary(). |
| ios/Classes/sentencepiece/protobuf_namespace.h | Remove old macro-based namespace renaming header. |
| ios/Classes/VectorStore.swift | Add iOS implementations of new bulk embedding fetch + fetch-by-IDs APIs. |
| ios/Classes/PigeonInterface.g.swift | Regenerated Pigeon Swift bindings for new APIs/types. |
| ios/Classes/FlutterGemmaPlugin.swift | Implement new Pigeon calls for iOS platform service. |
| example/web/sqlite_vector_store.js | Mirror web proxy updates for the example app. |
| example/pubspec.lock | Example lockfile update (package version + local_hnsw transitive). |
| example/ios/Podfile.lock | Example iOS lockfile update for plugin version. |
| android/src/main/kotlin/dev/flutterberlin/flutter_gemma/VectorStore.kt | Add Android implementations of new bulk embedding fetch + fetch-by-IDs APIs. |
| android/src/main/kotlin/dev/flutterberlin/flutter_gemma/PigeonInterface.g.kt | Regenerated Pigeon Kotlin bindings for new APIs/types. |
| android/src/main/kotlin/dev/flutterberlin/flutter_gemma/FlutterGemmaPlugin.kt | Implement new Pigeon calls for Android platform service. |
