Arrow

A lightweight vector database built in Rust with persistent storage.

Features

Approximate nearest-neighbor vector similarity search backed by an HNSW index
Persistent JSON storage
UUID-based document identification
Support for associating vectors with filenames
Simple CLI for common operations

Installation

Clone the repository:

git clone https://github.com/varadanvk/arrow.git
cd arrow

Build the project:

cargo build --release

Run the executable (the examples below assume arrow is on your PATH; otherwise use the full path shown here):

./target/release/arrow

CLI Usage

The CLI provides several commands to interact with the vector database:

Global Options

-d, --database <PATH>: Specify the path to the vector store file (default: vector_store.json)
-h, --help: Print help information
-V, --version: Print version information

Commands

Create a new vector store

arrow create [OPTIONS]

Options:

-m, --max-connections <NUM>: Maximum connections per node (default: 16)

Example:

arrow create --max-connections 32

Add documents to the vector store

arrow add <FILES>...

Example:

arrow add document1.txt document2.txt

This will:

Read the text from each file
Split it into chunks (max 512 characters each)
Generate embeddings using the All-MiniLM-L6-v2 model
Add each chunk with its embedding to the vector store
Save the updated vector store to disk
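
The chunking step above can be sketched as follows. This is a minimal illustration, not Arrow's actual code: the function name chunk_text is hypothetical, and it assumes that "characters" means Unicode characters rather than bytes and that chunks are cut at a fixed length without regard to word boundaries. The README does not specify either detail.

```rust
/// Split `text` into chunks of at most `max_chars` characters each.
///
/// Hypothetical helper for illustration only. It counts Unicode scalar
/// values (`char`s) rather than bytes, so a multi-byte character is
/// never cut in half.
fn chunk_text(text: &str, max_chars: usize) -> Vec<String> {
    assert!(max_chars > 0, "max_chars must be positive");
    // Collect the characters once so we can slice by character count.
    let chars: Vec<char> = text.chars().collect();
    chars
        .chunks(max_chars)
        .map(|piece| piece.iter().collect::<String>())
        .collect()
}
```

Under these assumptions, calling chunk_text(&contents, 512) on a 1100-character file yields three chunks of 512, 512, and 76 characters, and each chunk would then be embedded separately.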

Query the vector store

arrow query [OPTIONS] <TEXT>

Options:

-t, --top-k <NUM>: Number of results to return (default: 5)

Example:

arrow query "What is a monopoly business?" --top-k 3
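
To make the meaning of --top-k concrete, here is a sketch of what a top-k similarity query computes. It is a deliberately simple exhaustive scan, not the HNSW search that Arrow uses, and it assumes cosine similarity as the metric, which the README does not state. The function names cosine_similarity and top_k are hypothetical.

```rust
/// Cosine similarity between two equal-length vectors.
/// Returns 0.0 if either vector has zero magnitude.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Return (index, score) pairs for the `k` stored vectors most similar
/// to `query`, best match first. Assumption: exhaustive scan, used here
/// only as a baseline to show what the query result represents.
fn top_k(query: &[f32], stored: &[Vec<f32>], k: usize) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = stored
        .iter()
        .enumerate()
        .map(|(i, v)| (i, cosine_similarity(query, v)))
        .collect();
    // Sort by descending similarity; total_cmp gives a total order on f32.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}
```

An exhaustive scan like this costs one comparison per stored vector on every query; a graph index such as HNSW trades exactness for visiting only a small part of the store.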

List documents in the vector store

arrow list [OPTIONS]

Options:

-l, --limit <NUM>: Maximum number of documents to list (default: 10)

Example:

arrow list --limit 20

Show vector store information

arrow info

This displays:

The location of the vector store
The number of documents
The source files

Architecture

Arrow consists of two main components:

VectorStore: A vector index based on a Hierarchical Navigable Small World (HNSW) graph, with:
- Multiple layers for efficient navigation
- Configurable maximum connections per node
- UUID-based document identification
Embeddor: A text embedding module that:
- Uses Hugging Face's Rust implementation of All-MiniLM-L6-v2
- Supports chunking of long texts
- Processes embeddings in parallel for better performance
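
The role of "maximum connections per node" in a graph index can be illustrated with a heavily simplified sketch. It models a single graph layer only, picks link candidates by exhaustive scan on insert, and uses squared Euclidean distance; all three are simplifying assumptions, and a real HNSW index adds multiple layers and a bounded candidate list. The type Layer and its methods are hypothetical and are not Arrow's API.

```rust
/// One layer of a proximity graph (hypothetical, for illustration).
/// `vectors[i]` is node i's vector; `neighbors[i]` lists its links.
struct Layer {
    vectors: Vec<Vec<f32>>,
    neighbors: Vec<Vec<usize>>,
    max_connections: usize,
}

fn squared_distance(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

impl Layer {
    fn new(max_connections: usize) -> Self {
        Layer { vectors: Vec::new(), neighbors: Vec::new(), max_connections }
    }

    fn dist(&self, a: usize, b: usize) -> f32 {
        squared_distance(&self.vectors[a], &self.vectors[b])
    }

    /// Keep only the `max_connections` closest links of `node`.
    fn prune(&mut self, node: usize) {
        if self.neighbors[node].len() <= self.max_connections {
            return;
        }
        let mut links = std::mem::take(&mut self.neighbors[node]);
        links.sort_by(|&a, &b| self.dist(node, a).total_cmp(&self.dist(node, b)));
        links.truncate(self.max_connections);
        self.neighbors[node] = links;
    }

    /// Insert a vector and link it to its nearest existing nodes.
    /// Simplification: candidates are found by scanning every node.
    fn insert(&mut self, vector: Vec<f32>) -> usize {
        let id = self.vectors.len();
        self.vectors.push(vector);
        self.neighbors.push(Vec::new());

        let mut nearest: Vec<usize> = (0..id).collect();
        nearest.sort_by(|&a, &b| self.dist(a, id).total_cmp(&self.dist(b, id)));
        nearest.truncate(self.max_connections);

        for &n in &nearest {
            self.neighbors[n].push(id); // add the back-link
            self.prune(n); // enforce the connection cap
        }
        self.neighbors[id] = nearest;
        id
    }

    /// Greedy walk: move to any closer neighbor until none is closer.
    fn greedy_search(&self, query: &[f32], entry: usize) -> usize {
        let mut current = entry;
        let mut best = squared_distance(&self.vectors[current], query);
        loop {
            let mut improved = false;
            for &n in &self.neighbors[current] {
                let d = squared_distance(&self.vectors[n], query);
                if d < best {
                    best = d;
                    current = n;
                    improved = true;
                }
            }
            if !improved {
                return current;
            }
        }
    }
}
```

In this sketch a larger cap gives each node more links, which makes the greedy walk less likely to stop at a poor local minimum but costs more memory and more distance computations per step. That is the general trade-off a setting like --max-connections controls.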

License

MIT

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
