Releases: TIC-13/SmolRag

v1.1.0

21 Mar 18:47
a98d168

SmolRag

Changelog

  • Disabled chat history to reduce prefill time and avoid exceeding the maximum token limit;
  • Added support for the Granite 30M embedding model in the application.

Two APK files are provided: one uses bge-small-en-v1.5 as the embedding model, the other uses granite-30m-english.
The Granite version performed better.

We recommend using the LLM "Qwen2.5 1.5B Q8 Instructions".

v1.0.0

27 Feb 14:33
a44caee

SmolRag

This project is an evolution of SmolChat that runs Retrieval-Augmented Generation (RAG) locally to improve LLM responses in specific subject areas. It is intended for situations where a specialized model is needed but unavailable and fine-tuning isn't feasible: RAG supplies a generic model with relevant in-context information instead.

SmolChat Changelog

  • Added RAG support for LLM responses;
  • Added reranking support for LLM responses;
  • Disabled customization of the system message.
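
For readers unfamiliar with the RAG and reranking items above, here is a minimal Python sketch of the general retrieve-then-rerank idea. It is not SmolRag's actual code (the app runs on Android). The bag-of-words `embed` function and the term-overlap `rerank` scorer are toy stand-ins for a real embedding model and a real reranker, and every function name and sample chunk below is hypothetical.

```python
import math
import re
from collections import Counter


def tokens(text):
    # Lowercase word tokens; a real pipeline would use the model's tokenizer.
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    # ASSUMPTION: toy stand-in for an embedding model (bag-of-words counts).
    return Counter(tokens(text))


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=3):
    # Stage 1: keep the k chunks whose vectors are closest to the query.
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def rerank(query, candidates, n=2):
    # Stage 2: rescore the short list with a second scorer and keep n.
    # ASSUMPTION: toy scorer (fraction of query terms found in the chunk).
    q_terms = set(tokens(query))
    if not q_terms:
        return candidates[:n]

    def score(chunk):
        return len(q_terms & set(tokens(chunk))) / len(q_terms)

    return sorted(candidates, key=score, reverse=True)[:n]


def build_prompt(query, context_chunks):
    # Place the selected chunks in the prompt ahead of the question.
    context = "\n".join(f"- {c}" for c in context_chunks)
    return f"Use the context to answer.\nContext:\n{context}\nQuestion: {query}"


# Hypothetical knowledge base and query, for illustration only.
chunks = [
    "To reset the router hold the reset button for ten seconds",
    "The router supports both 2.4 GHz and 5 GHz bands",
    "Invoices are sent by email at the start of each month",
    "A factory reset erases all saved router settings",
]
query = "How do I reset the router"

top = rerank(query, retrieve(query, chunks, k=3), n=2)
prompt = build_prompt(query, top)
print(prompt)
```

The two stages are kept separate on purpose: the first pass is cheap and runs over every chunk, while the second scorer only sees the short list, which is the usual reason for adding a reranker on constrained hardware.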