Open-source vector database designed for simplicity and speed, with flexible deployment options.
Features
- Create and manage indexes
- Create and manage vector embeddings in indexes
- Search for embeddings in indexes based on distance:
  - Cosine distance
  - Euclidean distance
  - Manhattan distance
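As a rough illustration of the three distance measures listed above, the following sketch computes each one for a pair of float vectors. The class and method names are hypothetical and not part of VectorDB's API, and cosine distance is taken here as 1 minus cosine similarity, which is an assumption about the convention used.

```java
public class DistanceSketch {

    // Cosine distance, assumed here to be 1 - cosine similarity
    public static double cosine(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        return 1.0 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Euclidean distance: square root of the sum of squared differences
    public static double euclidean(float[] a, float[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = (double) a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Manhattan distance: sum of absolute differences
    public static double manhattan(float[] a, float[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs((double) a[i] - b[i]);
        }
        return sum;
    }

    public static void main(String[] args) {
        float[] a = {1f, 0f, 0f};
        float[] b = {0f, 1f, 0f};
        System.out.println("cosine    = " + cosine(a, b));
        System.out.println("euclidean = " + euclidean(a, b));
        System.out.println("manhattan = " + manhattan(a, b));
    }
}
```

Note that cosine distance only depends on the direction of the vectors, while the Euclidean and Manhattan distances also depend on their magnitudes.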
APIs
- RESTful HTTP API using JSON
- Embedded use (Java)
Deployment
- Standalone Java application
- Docker
- Include as library (Java)
None
- Java 17 or later (tested using 17.0.9-zulu)
- Git client
- Maven
- Docker
Clone this repository:
git clone https://github.com/tutikka/VectorDB.git
Change to cloned folder:
cd VectorDB
Clean, compile and package using Maven:
mvnw package
Change to created target directory:
cd target
Start the application:
java -jar vectordb-0.0.1-SNAPSHOT.jar
Or build a Docker image and run it, for example:
docker build -t vectordb/vectordb .
docker run -p 8080:8080 vectordb/vectordb
The application will look for a configuration file named vectordb.properties in the root directory of the application during startup. If the file is not found, default values (shown in the example below) will be used.
#
# directory for data files (default = 'data')
#
data.directory = data
#
# maximum number of vectors per index (default = 65536)
#
data.max_vectors_per_index = 65536
- Create index with 3 dimensions and similarity based on manhattan distance
- Create entries into the index with random values as embeddings
- Search for the best matching entry based on given embedding
- Clean up and delete index
This example maps the positions of the planets in our solar system on 1 January 2025 to a 3D space, using the sun as the origin, and then finds the planets closest to the sun.
- Create index with 3 dimensions (X, Y and Z coordinates) and similarity based on euclid distance
- Create entries in the index for each planet based on its position on 1 January 2025
- Search for the 3 closest planets to the sun
- Clean up and delete index
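The search step above can be pictured as a brute-force scan: compute the Euclidean distance from the query point (here the origin, standing in for the sun) to every entry and keep the closest few. This is only a sketch of that idea and not VectorDB's actual search code; the class and record names are hypothetical, and the coordinates are placeholders rather than real planetary positions.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class NearestSketch {

    // Hypothetical result type: entry id plus its distance to the query
    public record Match(int id, double distance) {}

    public static double euclidean(float[] a, float[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = (double) a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Scan all entries and return the 'top' closest ones, nearest first.
    // Entry ids are assigned as position + 1 for this illustration.
    public static List<Match> search(List<float[]> entries, float[] query, int top) {
        List<Match> matches = new ArrayList<>();
        for (int i = 0; i < entries.size(); i++) {
            matches.add(new Match(i + 1, euclidean(entries.get(i), query)));
        }
        matches.sort(Comparator.comparingDouble(Match::distance));
        return matches.subList(0, Math.min(top, matches.size()));
    }

    public static void main(String[] args) {
        // Placeholder coordinates, not actual planet positions
        List<float[]> entries = List.of(
                new float[]{4f, 0f, 0f},
                new float[]{0f, 1f, 0f},
                new float[]{0f, 0f, 9f},
                new float[]{2f, 2f, 1f});
        float[] origin = {0f, 0f, 0f};
        for (Match m : search(entries, origin, 3)) {
            System.out.println("id = " + m.id() + ", distance = " + m.distance());
        }
    }
}
```

A full scan like this costs one distance computation per stored vector, which is simple and exact but grows linearly with the number of entries.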
This example is closer to a real-world scenario, where we have documents that we want to index to perform queries based on similarity, and then summarize best results based on a user's question.
- Create index with 1536 dimensions (from OpenAI ada-002 text embedding model) and similarity based on cosine distance
- Create entries into the index by embedding each document using the OpenAI ada-002 text embedding model
- Search for the best matching entry based on the user's question (embedded with the same model)
- Retrieve the original document identifier from the search results
- Use a chat completion model (OpenAI gpt-5) to summarize the retrieved document based on the user's original question
Note! Make sure to add a .env file containing your OpenAI API key in the same directory.
| Method | URI | Description |
|---|---|---|
| POST | /api/indexes | Create new index |
| GET | /api/indexes | List indexes |
| GET | /api/indexes/{id} | Get index |
| POST | /api/indexes/{id}/entries | Create new entry into index |
| GET | /api/indexes/{id}/entries | List entries in index |
| POST | /api/indexes/{id}/search | Submit search for entries in index |
Method
POST
URI
/api/indexes
Query Parameters
None
Request Body
{
  "name": "test",
  "dimensions": 1536,
  "similarity": "cosine",
  "optimization": "none"
}
Response Status
HTTP 200: Ok
HTTP 400: Error creating index due to client input
HTTP 500: Error creating index due to server error
Response Body
{
  "id": 1,
  "name": "test",
  "dimensions": 1536,
  "similarity": "cosine",
  "optimization": "none"
}
Note! The server will populate the id field, which is used to refer to the index in other API methods.
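For Java clients, the request above could be built with the JDK's java.net.http API. The sketch below only constructs the request and prints its method and URI, without sending it. The base URL http://localhost:8080 is an assumption based on the port mapping in the Docker example, and the class and method names are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class CreateIndexRequestSketch {

    // Assumed base URL; adjust to wherever the server is actually running
    static final String BASE_URL = "http://localhost:8080";

    // JSON body with the same fields as the request body shown above.
    // Values are inserted as-is, so they must not contain quotes or backslashes.
    public static String body(String name, int dimensions, String similarity, String optimization) {
        return "{\"name\": \"" + name + "\", "
                + "\"dimensions\": " + dimensions + ", "
                + "\"similarity\": \"" + similarity + "\", "
                + "\"optimization\": \"" + optimization + "\"}";
    }

    // Build (but do not send) a POST request to /api/indexes
    public static HttpRequest createIndexRequest(String json) {
        return HttpRequest.newBuilder()
                .uri(URI.create(BASE_URL + "/api/indexes"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = createIndexRequest(body("test", 1536, "cosine", "none"));
        System.out.println(request.method() + " " + request.uri());
    }
}
```

To actually send the request, it can be passed to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`, after which the id field can be read from the response body.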
Method
GET
URI
/api/indexes
Query Parameters
None
Request Body
None
Response Status
HTTP 200: Ok
Response Body
[
  {
    "id": 1,
    "name": "test",
    "dimensions": 1536,
    "similarity": "cosine",
    "optimization": "none"
  }
]
Method
GET
URI
/api/indexes/{id}
Query Parameters
id: The index identifier (path parameter in the URI)
Request Body
None
Response Status
HTTP 200: Ok
HTTP 404: Index not found
Response Body
{
  "id": 1,
  "name": "test",
  "dimensions": 1536,
  "similarity": "cosine",
  "optimization": "none",
  "extras": {
    "_max_vectors": 65536,
    "_num_vectors": 1,
    "_size_on_disk": 1310728
  }
}
Method
POST
URI
/api/indexes/{id}/entries
Query Parameters
id: The index identifier (path parameter in the URI)
Request Body
{
  "id": 1,
  "embedding": [
    0.1,
    0.2,
    0.3
  ]
}
Response Status
HTTP 200: Ok
HTTP 400: Error creating entry due to client input
HTTP 404: Index not found
HTTP 500: Error creating entry due to server error
Response Body
{
  "id": 1,
  "embedding": [
    0.2672612,
    0.5345224,
    0.8017837
  ]
}
Method
GET
URI
/api/indexes/{id}/entries
Query Parameters
id: The index identifier (path parameter in the URI)
offset: The position in the index at which to start retrieving entries
limit: Maximum number of entries to retrieve
Request Body
None
Response Status
HTTP 200: Ok
HTTP 400: Error listing entries due to client input
HTTP 404: Index not found
HTTP 500: Error listing entries due to server error
Response Body
[
  {
    "id": 1,
    "embedding": [
      0.2672612,
      0.5345224,
      0.8017837
    ]
  },
  {
    "id": 2,
    "embedding": [
      0.37139064,
      0.557086,
      0.7427813
    ]
  },
  {
    "id": 3,
    "embedding": [
      0.4242641,
      0.56568545,
      0.70710677
    ]
  }
]
Method
POST
URI
/api/indexes/{id}/search
Query Parameters
id: The index identifier (path parameter in the URI)
Request Body
{
  "embedding": [
    0.1,
    0.2,
    0.3
  ],
  "top": 3
}
Response Status
HTTP 200: Ok
HTTP 400: Error searching entries due to client input
HTTP 404: Index not found
HTTP 500: Error searching entries due to server error
Response Body
{
  "matches": [
    {
      "id": 1,
      "distance": 5.9604644775390625E-8
    },
    {
      "id": 2,
      "distance": 0.007416725158691406
    },
    {
      "id": 3,
      "distance": 0.017292380332946777
    }
  ],
  "duration": 0,
  "scanned": 3,
  "total": 3,
  "similarity": "cosine"
}
VectorDB can be embedded into another Java application as a library. Follow the instructions above to download a binary release or build the JAR file yourself, and add it as a dependency to your project.
Note! This section is still under development.
Example:
import java.util.Properties;
import java.util.Random;
// imports for VectorDB, Index, Entry, Search, SearchResult and Match
// depend on the library's package names and are omitted here

Random random = new Random();

// initialize properties (same keys as in vectordb.properties)
Properties properties = new Properties();
properties.setProperty("data.directory", "data");
properties.setProperty("data.max_vectors_per_index", "65536");

// new instance
VectorDB db = new VectorDB(properties);

// create index; the returned instance carries the generated id
Index index = new Index();
index.setName("test");
index.setDimensions(3);
index.setSimilarity(Index.SIMILARITY_COSINE_DISTANCE);
index.setOptimization(Index.OPTIMIZATION_NONE);
index = db.createIndex(index);

// create 100 entries with random 3-dimensional embeddings
for (int i = 0; i < 100; i++) {
    Entry entry = new Entry();
    entry.setId(i + 1);
    entry.setEmbedding(new float[]{random.nextFloat(), random.nextFloat(), random.nextFloat()});
    db.createEntry(index.getId(), entry);
}

// search for the single closest entry to a random embedding
Search search = new Search();
search.setTop(1);
search.setEmbedding(new float[]{random.nextFloat(), random.nextFloat(), random.nextFloat()});
SearchResult result = db.searchEntries(index.getId(), search);
Match match = result.getMatches().get(0);
System.out.printf("closest entry: id = %d, distance = %f%n", match.getId(), match.getDistance());

// delete index
db.deleteIndex(index.getId());