A local text embedding service using HuggingFace's Text Embeddings Inference server.
- Docker
- curl
- jq (for the CLI tool)
```sh
# Start the service and wait until ready
./setup.sh

# Generate an embedding
echo "Hello world" | ./embed.sh
```

Set environment variables before starting the service:
| Variable | Default | Description |
|---|---|---|
| `EMBEDDING_MODEL_ID` | `unsloth/embeddinggemma-300m` | HuggingFace model to use |
| `EMBEDDING_DIMENSION` | `768` | Embedding vector dimension |
Example with a different model:

```sh
EMBEDDING_MODEL_ID=BAAI/bge-small-en-v1.5 ./setup.sh
```

Embed text via the CLI wrapper:

```sh
# Single string
echo "text to embed" | ./embed.sh

# From a file
cat document.txt | ./embed.sh
```

The service exposes an OpenAI-compatible API on port 8080:
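Once you have vectors, a common next step is comparing them by cosine similarity. Here is a minimal sketch using only jq; the two vectors are illustrative stand-ins for real `./embed.sh` output:

```sh
# Cosine similarity of two example vectors, computed entirely in jq.
# dot(x; y) pairs up components with transpose, multiplies, and sums.
a='[3,4,0]'
b='[4,3,0]'
jq -n --argjson a "$a" --argjson b "$b" '
  def dot(x; y): [x, y] | transpose | map(.[0] * .[1]) | add;
  dot($a; $b) / ((dot($a; $a) | sqrt) * (dot($b; $b) | sqrt))'
```

For these vectors the dot product is 24 and both norms are 5, so the similarity is 24/25.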
```sh
curl http://localhost:8080/v1/embeddings \
  -H 'Content-Type: application/json' \
  -d '{"model":"unsloth/embeddinggemma-300m","input":"text to embed"}'
```

To stop the service:

```sh
docker compose down
```
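For scripting against the API, jq can pull the vector out of the response. The nesting under `.data[0].embedding` is assumed from the standard OpenAI embeddings response shape; the sample below is truncated to three dimensions for illustration:

```sh
# Extract the vector and its dimension from an OpenAI-style embeddings response
response='{"object":"list","data":[{"object":"embedding","index":0,"embedding":[0.1,0.2,0.3]}],"model":"unsloth/embeddinggemma-300m"}'

echo "$response" | jq -c '.data[0].embedding'        # the raw vector
echo "$response" | jq '.data[0].embedding | length'  # its dimension
```

In practice you would pipe the `curl` call above straight into the jq filter instead of using a canned response.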