Warning
Proof of Concept & Research Only
This project is strictly for educational and research purposes. It is designed to demonstrate the implementation of distributed consensus, vector indexing, and bi-temporal data management in Rust.
It is not intended for production use, and active development is not guaranteed. There are no warranties regarding data integrity or security.
ChronosDB is a high-performance distributed database built in Rust. It uniquely combines Vector Similarity Search (for AI applications) with Bi-Temporal Data Management (Time Travel) and Distributed Consensus (Raft).
It solves the problem of "lossy" AI memory by ensuring that every vector embedding ever written is preserved in a strictly append-only history, reachable via time-travel queries.
- Distributed Consensus (Raft): Built on
openraft, ensuring linearizable writes and automatic leader election. Supports dynamic membership changes (adding/removing nodes) without downtime. - Vector Search Engine: Custom implementation of HNSW (Hierarchical Navigable Small World) graph for approximate nearest neighbor search. Supports SIMD-optimized Euclidean and Cosine distance metrics.
- Time Travel (History): Every record maintains a
valid_timeandtx_time. Data is never overwritten; it is appended. You can query the full history of any object using theHISTORYcommand. - Disk-Based Architecture: Uses memory-mapped files (
mmap) for managing storage segments. It supports both "Strict Durability" (fsync) and "High Throughput" (async) modes based on hardware detection. - Zero-Copy Serialization: Utilizes
rkyvfor guaranteeing data layout alignment and zero-copy deserialization from disk. - Binary Snapshots: Uses
rkyvfor compact, high-speed binary snapshots, significantly reducing storage size and recovery time compared to JSON. - Probabilistic Filtering: Implements Bloom Filters and SeaHash to minimize disk reads for non-existent keys.
ChronosDB uses an immutable, append-only log structure to ensure crash safety and historical retention.
- Segments: Data is written to 64MB memory-mapped file segments.
- Records: Each entry contains a 128-dimensional vector, a binary payload, and temporal metadata.
- Safety: The system detects concurrency levels and adjusts
fsyncbehavior automatically via system profiling.
Vector indexing is handled by a persistent HNSW graph that updates in real-time.
- Nodes: Graph nodes are stored in a separate optimized index file.
- Search: Uses a priority queue-based search (beam search) to traverse graph layers.
- Recall: Tuned with
M=16andef_construction=100for high recall on random data distributions.
- Replication: Logs are replicated to a quorum of nodes via the Raft protocol.
- Snapshotting: Supports auto-snapshotting based on log length (default: every 20 logs) to compact the Write Ahead Log (WAL). Snapshots are serialized in an optimized binary format.
- API: Exposes an HTTP API (default port 20001) for Raft management (voting, snapshots, membership).
Prerequisites:
- Rust
- Cargo
Build:
# Clone the repository
git clone https://github.com/SSL-ACTX/chronos-db.git
cd chronos-db
cargo build --releaseRun a Single Node:
# Runs on TCP 9000, Raft API 20001
./target/release/chronos --node-id 1
ChronosDB includes a custom SQL-like parser and a dedicated CLI client.
cargo run --bin chronos-cli
1. Insert Data Insert a 128-dim vector and a payload.
INSERT INTO VECTORS VALUES ([0.1, 0.5, ...], "user_session_data")
Note: Vectors are auto-padded if shorter than 128 dimensions.
2. Vector Search Find the nearest vectors using HNSW.
SELECT FROM VECTORS WHERE VECTOR NEAR [0.1, 0.5, ...] LIMIT 5
3. Retrieve by ID Get the latest version of a specific record.
GET 'uuid-string-here'
4. Update Data Update the payload of an existing record (creates a new version).
UPDATE VECTORS SET PAYLOAD="Updated Payload" WHERE ID='uuid-string-here'
5. Time Travel View the modification history of a record.
HISTORY 'uuid-string-here'
Returns a list of start/end timestamps and payloads for every version of the record.
6. Delete Logically delete a record (appends a tombstone).
DELETE FROM VECTORS WHERE ID='uuid-string-here'
ChronosDB uses a custom binary protocol for data transmission and HTTP for consensus coordination.
A helper script test_cluster.py is provided to bootstrap a 3-node cluster locally.
- Start Nodes: Nodes must be started with unique IDs and ports.
- Node 1: TCP 9000, Raft 20001
- Node 2: TCP 9001, Raft 20002
- Node 3: TCP 9002, Raft 20003
- Initialize Leader: Send an init request to Node 1.
curl -X POST http://127.0.0.1:20001/init
- Add Learners & Promote: Add Node 2 and Node 3 to the cluster via the HTTP API.
The cluster automatically creates snapshots when the log grows too large. This allows new nodes to catch up by downloading a compressed state rather than replaying the entire history.
- Trigger: Default policy creates a snapshot every 20 logs.
- Recovery: Nodes automatically restore HNSW graphs and Bloom filters from snapshots upon restart.
The project includes Python integration tests for verifying cluster consistency and snapshot logic.
-
Cluster Test:
python3 test_cluster.py -
Verifies Write -> Kill Leader -> Election -> Write -> Read Consistency.
-
Snapshot Test:
python3 test_snapshot.py -
Inserts data past the log limit, triggers a snapshot, adds a new node, and verifies the new node hydrated correctly from the binary snapshot.
Buggy
- CLI Integration:
python3 test_cli.py - Verifies the full SQL grammar, vector padding, and CRUD lifecycle.
ChronosDB automatically detects environment resources via a System Profile:
| Hardware | Mode | Behavior |
|---|---|---|
| 1 Core | Potato Mode | No fsync (Async durability), High Raft timeout (1s). |
| < 4 Cores | Standard | Strict durability, 500ms heartbeat. |
| Server | Server Mode | Strict durability, 250ms heartbeat, max worker threads. |
Author: Seuriin (SSL-ACTX)