Merged
16 changes: 3 additions & 13 deletions .github/workflows/ghpages-mkdocs.yml
@@ -1,9 +1,10 @@
name: Deploy MKDocs site to Pages
name: Build MKDocs Site

on:
push:
pull_request:
branches:
- main
- dev
paths:
- .github/workflows/ghpages-mkdocs.yml
- docs/**
@@ -63,14 +64,3 @@ jobs:
uses: actions/upload-pages-artifact@v3
with:
path: ./docs/_site

deploy:
runs-on: ubuntu-latest
needs: build
environment:
name: github-pages
url: ${{ steps.deployment.outputs.page_url }}
steps:
- name: Deploy to GitHub Pages
id: deployment
uses: actions/deploy-pages@v4
76 changes: 76 additions & 0 deletions .github/workflows/mkdocs-build.yml
@@ -0,0 +1,76 @@
name: Deploy MKDocs Site to Pages

on:
push:
branches:
- main
paths:
- .github/workflows/ghpages-mkdocs.yml
- docs/**
- mkdocs.yml
workflow_dispatch:

# Sets permissions of the GITHUB_TOKEN to allow deployment to GitHub Pages
permissions:
contents: read
pages: write
id-token: write

# Allow only one concurrent deployment, skipping runs queued between the run in-progress and latest queued.
# However, do NOT cancel in-progress runs as we want to allow these production deployments to complete.
concurrency:
group: "pages"
cancel-in-progress: false

jobs:
build:
runs-on: ubuntu-latest

steps:
- name: Checkout code
uses: actions/checkout@v3
with:
fetch-depth: 0 # Fetch all history for all branches and tags, not just the default branch.
# This is needed to ensure that the commit SHA is available for the deployment.
# See
sparse-checkout: |
docs
mkdocs.yml
.github/workflows/ghpages-mkdocs.yml

- name: Setup pages
id: pages
uses: actions/configure-pages@v5


- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: '3.x' # Use the latest version of Python 3

- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -r ./docs/requirements.txt

- name: Build documentation
# Outputs to the './_site' directory by default
run: |
mkdocs build --site-dir ./_site --config-file ./docs/mkdocs.yml

- name: Upload artifact
# Automatically upload an artifact from the './_site' directory by default
uses: actions/upload-pages-artifact@v3
with:
path: ./docs/_site

deploy:
runs-on: ubuntu-latest
needs: build
environment:
name: github-pages
url: ${{ steps.deployment.outputs.page_url }}
steps:
- name: Deploy to GitHub Pages
id: deployment
uses: actions/deploy-pages@v4
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -5,6 +5,12 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## v2.1.1

Add:

- Add optional `filter` parameter of type `Func<TMetadata, bool>` to the `.Search` and `.SearchAsync` methods. It is called for each text item in the database, allowing more advanced filtering before the vector similarity search is performed and results are returned. If undefined or `null`, it is ignored.

## v2.1.0

Add:
2 changes: 1 addition & 1 deletion docs/docs/concepts/index.md
@@ -2,7 +2,7 @@
title: Concepts
description: Understand the core concepts behind SharpVector, from vector similarity to embedding strategies and in-memory architecture.
---
# Concepts
# :octicons-light-bulb-24: Concepts

## What is a Vector Database?

5 changes: 4 additions & 1 deletion docs/docs/embeddings/index.md
@@ -1,4 +1,7 @@
# Embeddings
---
title: Embeddings
---
# :fontawesome-solid-square-binary: Embeddings

`Build5Nines.SharpVector` includes the following support for using AI Models to generate the text embeddings for the vector database instead of generating them locally. The use of an AI Embeddings model can greatly increase the quality of the semantic search.

2 changes: 1 addition & 1 deletion docs/docs/get-started/data-management/index.md
@@ -2,7 +2,7 @@
title: Data Management

---
# Data Management
# :material-database-edit-outline: Data Management

Since `Build5Nines.SharpVector` is a database, it also has data management methods available. These methods enable you to add, remove, and update the text documents that are vectorized and indexed within the semantic database.

2 changes: 1 addition & 1 deletion docs/docs/get-started/index.md
@@ -2,7 +2,7 @@
title: Get Started
description: Get up and running with SharpVector in minutes. Learn how to install, initialize, and begin storing and searching vectorized text data.
---
# Get Started
# :octicons-rocket-24: Get Started

It's really easy to get started with `Build5Nines.SharpVector`. Simply follow the steps below.

2 changes: 1 addition & 1 deletion docs/docs/get-started/metadata/index.md
@@ -1,7 +1,7 @@
---
title: Metadata
---
# Metadata
# :material-database-cog-outline: Metadata

The `Build5Nines.SharpVector` vector database enables semantic search for `Text` that is stored in the database. Being able to semantically search text is an extremely useful way to lookup more information related to the text. For this purpose, `Metadata` is stored alongside the `Text` within the vector database. This way, when `Text` is found when performing a semantic search, then the matching `Metadata` is also retrieved.

106 changes: 106 additions & 0 deletions docs/docs/get-started/search/index.md
@@ -0,0 +1,106 @@
---
title: Semantic Search
---
# :material-file-search: Semantic Search

Once text items and their associated metadata have been added to the vector database, the database can be used for semantic search to find matching text items for a given query.

The `BasicMemoryVectorDatabase` and `MemoryVectorDatabase<>` classes both contain `.Search` and `.SearchAsync` methods that can be used to perform semantic search on the database:

=== "Sync"

```csharp
var query = "some text to search";
var results = vdb.Search(query);
```

=== "Async"

```csharp
var query = "some text to search";
var results = await vdb.SearchAsync(query);
```

## Metadata Filters

The `.Search` and `.SearchAsync` methods also include the ability to pre-filter the search results based on a boolean evaluation of the `Metadata` for the text item. This check is run before the vector similarity search is performed, and can help increase search performance on large datasets.

Here are a couple of examples of using the `filter` parameter to filter on `Metadata` when performing semantic searches:

=== "Sync"

```csharp
var vdb = new BasicMemoryVectorDatabase();

// load text and metadata into database

var query = "some text to search";
var results = vdb.Search(
query,
filter: (metadata) => {
// perform some operation to check metadata
// return true or false
return metadata.Contains("B59");
}
);
```

=== "Async"

```csharp
var vdb = new MemoryVectorDatabase<Person>();

// load text and metadata into database

var query = "some text to search";
var results = await vdb.SearchAsync(
query,
filter: async (metadata) => {
// perform some operation to check metadata
// return true or false
return metadata.LastName == "Pietschmann";
}
);
```

!!! info "OpenAI and Ollama Support"

This functionality works the same with both [:simple-openai: OpenAI and :simple-ollama: Ollama supported vector databases](../../embeddings/index.md) too.
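
As a rough illustration of the ordering described in this section, the following standalone sketch applies a metadata predicate first and computes similarity only for the items that pass it. It does not use or reproduce SharpVector's internals: the tuple list, the two-dimensional vectors, and the `Cosine` helper are hypothetical stand-ins, and only the `Func<string, bool>` shape of the filter mirrors the `filter` parameter shown above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory items (Text, Metadata, Vector); these are not SharpVector types.
var items = new List<(string Text, string Metadata, float[] Vector)>
{
    ("Azure landing zones", "B59 article", new float[] { 1f, 0f }),
    ("Terraform modules",   "B59 article", new float[] { 0.6f, 0.8f }),
    ("Cooking pasta",       "other",       new float[] { 0.9f, 0.1f }),
};

var queryVector = new float[] { 1f, 0f };
var scored = 0; // counts how many items reach the similarity step

// Cosine similarity between two equal-length vectors (illustrative helper).
float Cosine(float[] a, float[] b)
{
    float dot = 0f, magA = 0f, magB = 0f;
    for (var i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        magA += a[i] * a[i];
        magB += b[i] * b[i];
    }
    return dot / (MathF.Sqrt(magA) * MathF.Sqrt(magB));
}

// Same shape as the `filter` parameter when the metadata is a string.
Func<string, bool> filter = metadata => metadata.Contains("B59");

var results = items
    .Where(item => filter(item.Metadata))   // 1. metadata check runs first
    .Select(item =>
    {
        scored++;                             // 2. only surviving items are scored
        return (item.Text, Similarity: Cosine(item.Vector, queryVector));
    })
    .OrderByDescending(r => r.Similarity)     // 3. best match first
    .ToList();

foreach (var r in results)
    Console.WriteLine($"{r.Text}: {r.Similarity:F2}");
```

In this sketch the "Cooking pasta" item never reaches the similarity calculation, which is why a selective filter can reduce the work done on a large dataset.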

## Paging

The `.Search` and `.SearchAsync` methods also support paging of the text items returned from the semantic search. Paging is applied after the `filter` and the similarity search have run, and is controlled using the optional `pageCount` and `pageIndex` parameters.

Here are a couple of examples of using the `pageCount` and `pageIndex` parameters to perform paging with the semantic search results:

=== "Sync"

```csharp
var vdb = new BasicMemoryVectorDatabase();

// load text and metadata into database

var query = "some text to search";
var results = vdb.Search(
query,
pageIndex: 0, // return first page of results (default: 0)
pageCount: 6 // limit length of this page of results (default: unlimited)
);
```

=== "Async"

```csharp
var vdb = new MemoryVectorDatabase<Person>();

// load text and metadata into database

var query = "some text to search";
var results = await vdb.SearchAsync(
query,
pageIndex: 0, // return first page of results (default: 0)
pageCount: 6 // limit length of this page of results (default: unlimited)
);
```

The `pageCount` and `pageIndex` parameters are optional, and can be used individually or together.
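
To make the paging arithmetic concrete, here is a minimal standalone sketch of how a page index and page size can select a slice of an already-ranked result list. The `Page` helper is hypothetical and not part of SharpVector, and treating a missing page size as "return everything and ignore the index" is an assumption of this sketch rather than documented library behavior.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of SharpVector: pages an already-ranked sequence.
// Assumption: when pageCount is null, all results are returned and pageIndex is ignored.
IEnumerable<T> Page<T>(IEnumerable<T> ranked, int pageIndex = 0, int? pageCount = null)
{
    if (pageCount is null) return ranked;
    return ranked.Skip(pageIndex * pageCount.Value).Take(pageCount.Value);
}

// Stand-in for 14 results already sorted by similarity, best match first.
var ranked = Enumerable.Range(1, 14).Select(i => $"result {i}").ToList();

var firstPage = Page(ranked, pageIndex: 0, pageCount: 6).ToList(); // results 1 to 6
var lastPage  = Page(ranked, pageIndex: 2, pageCount: 6).ToList(); // partial final page
var all       = Page(ranked).ToList();                             // no paging applied

Console.WriteLine($"{firstPage.Count} / {lastPage.Count} / {all.Count}"); // prints "6 / 2 / 14"
```

A caller that pages through results this way can stop requesting pages once a page comes back with fewer items than the requested page size.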
4 changes: 3 additions & 1 deletion docs/docs/index.md
@@ -2,7 +2,7 @@
title: Discover
description: The lightweight, in-memory, semantic search, text vector database for .NET that powers intelligent search and recommendation features.
---
# Discover Build5Nines.SharpVector
# :fontawesome-regular-compass: Discover Build5Nines.SharpVector

**Build5Nines.SharpVector** is the lightweight, in-memory, semantic search, text vector database built for .NET applications. It enables fast and flexible vector-based similarity search for text data — ideal for search engines, recommendation systems, semantic analysis, and AI-enhanced features.

@@ -27,6 +27,8 @@ Vector databases are used with Semantic Search and [Generative AI](https://build

While there are lots of large databases that can be used to build Vector Databases (like Azure CosmosDB, PostgreSQL w/ pgvector, Azure AI Search, Elasticsearch, and more), there are not many options for a lightweight vector database that can be embedded into any .NET application. Build5Nines SharpVector is the lightweight in-memory Text Vector Database for use in any .NET application that you're looking for!

<!-- [:material-file-pdf-box: Download documentation](/build5nines-sharpvector.pdf){ .md-button .md-button--secondary } -->

---

> "For the in-memory vector database, we're using Build5Nines.SharpVector, an excellent open-source project by Chris Pietschmann. SharpVector makes it easy to store and retrieve vectorized data, making it an ideal choice for our sample RAG implementation."
2 changes: 1 addition & 1 deletion docs/docs/license/index.md
@@ -4,7 +4,7 @@ description: Review the MIT license terms for using and contributing to the Shar
date: 2025-04-13
---

# License
# :octicons-file-badge-24: License

```text
MIT License
5 changes: 4 additions & 1 deletion docs/docs/persistence/index.md
@@ -1,4 +1,7 @@
# Data Persistence
---
title: Data Persistence
---
# :octicons-file-24: Data Persistence

The `Build5Nines.SharpVector` library provides easy-to-use methods for saving a memory-based vector database to a file or stream and loading it again later. This is particularly useful for caching indexed content between runs, deploying pre-built vector stores, or shipping databases with your application.

2 changes: 1 addition & 1 deletion docs/docs/resources/index.md
@@ -3,7 +3,7 @@ title: Resources
description: Dive deeper with curated resources, links, and tools for working with vector databases, semantic search, and SharpVector.
---

# Resources
# :octicons-link-24: Resources

## Tutorials

2 changes: 1 addition & 1 deletion docs/docs/samples/index.md
@@ -3,7 +3,7 @@ title: Samples
description: Explore real-world code samples to see SharpVector in action. Build search engines, intelligent note apps, and more.
---

# Samples
# :material-run-fast: Samples

## Sample Console App

7 changes: 5 additions & 2 deletions docs/docs/text-chunking/index.md
@@ -2,7 +2,7 @@
title: Text Chunking
description: Learn how to break large documents into smaller chunks to improve vector search relevance and optimize embedding performance.
---
# Text Chunking
# :material-script-text: Text Chunking

**Text chunking** is the process of breaking up large documents into smaller segments ("chunks") before embedding and storing them in a vector database. This allows for more accurate semantic search and improves performance in applications that deal with large bodies of text.

@@ -124,7 +124,10 @@ The `TextDataLoader<TKey, TValue>` class can be used to load documents into the
}
```

> 🧠 Tip: Use chunking method and size that best aligns with your content type and retrieval goals.
!!! info "Optimization Tip"
Use the chunking method and size that best align with your content type and retrieval goals. For larger documents, text chunking will be required to get the best semantic search results from the vector database.

Some experimentation on your data set may be required to find the text chunking strategy that works best for your solution.

---

23 changes: 10 additions & 13 deletions docs/mkdocs.yml
@@ -64,7 +64,7 @@ markdown_extensions:
- md_in_html
- toc:
permalink: true
toc_depth: 3
toc_depth: 2
- pymdownx.critic
- pymdownx.caret
- pymdownx.keys
@@ -98,6 +98,11 @@ plugins:
- git-committers: # mkdocs-git-committers-plugin-2 plugin to show contributors on footer of
repository: Build5Nines/SharpVector
branch: main
# - with-pdf:
# cover_subtitle: "Lightweight, In-memory, Semantic Search, Text Vector Database to embed in any .NET Application"
# author: Build5Nines LLC
# output_path: build5nines-sharpvector.pdf


extra:
# consent:
@@ -137,18 +142,10 @@ nav:
- Prerequisites: get-started/#prerequisites
- Install Nuget Package: get-started/#install-nuget-package
- Basic Example: get-started/#basic-example
- Metadata:
- get-started/metadata/index.md
- Adding Metadata: get-started/metadata/#adding-metadata
- JSON and String Metadata: get-started/metadata/#json-and-string-metadata
- Custom Metadata Type: get-started/metadata/#custom-metadata-type

- Data Management:
- get-started/data-management/index.md
- Get Text Item Id: get-started/data-management/#get-text-item-id
- Get Item By Id: get-started/data-management/#get
- Update Item: get-started/data-management/#update
- Delete Item: get-started/data-management/#delete
- Metadata: get-started/metadata/index.md
- Semantic Search: get-started/search/index.md
- Data Management: get-started/data-management/index.md

- Concepts:
- concepts/index.md
- What is a Vector Database?: concepts/#what-is-a-vector-database
1 change: 1 addition & 0 deletions docs/requirements.txt
@@ -5,3 +5,4 @@ pymdown-extensions
markdown-include
mkdocs-git-committers-plugin-2
mkdocs-git-revision-date-localized-plugin
mkdocs-with-pdf