This project is an evolution of SmolChat that runs Retrieval-Augmented Generation (RAG) techniques locally to improve LLM performance on specific subjects. It is ideal for situations where a specialized model is needed but unavailable and fine-tuning is not feasible, as it supplies generic models with relevant in-context information.
- Support for RAG added to LLM responses
- Support for reranking added to LLM responses
- Customization of the system message disabled

## On-Device Inference of SLMs in Android
- Provide a usable user interface to interact with SLMs (small language models) locally, on-device
- Allow users to add/remove SLMs (GGUF models) and modify their system prompts or inference parameters (temperature, min-p)
- Allow users to create specific downstream tasks quickly and use SLMs to generate responses
- Simple, easy-to-understand, extensible codebase
- Clone the repository along with its llama.cpp submodule:

  ```bash
  git clone https://github.com/TIC-13/SmolRag.git
  cd SmolRag
  git submodule update --init --recursive
  ```
- Android Studio starts building the project automatically. If not, select Build > Rebuild Project to start a project build.
- After a successful project build, connect an Android device to your system. Once connected, the device's name should be visible in the top menu bar of Android Studio.
- The application uses llama.cpp to load and execute GGUF models. As llama.cpp is written in pure C/C++, it is easy to compile for Android-based targets using the NDK (see the Gradle sketch after this list).
- The `smollm` module uses a `llm_inference.cpp` class, which interacts with llama.cpp's C-style API to execute the GGUF model, and a JNI binding, `smollm.cpp`. Check the C++ source files here. On the Kotlin side, the `SmolLM` class provides the required methods to interact with the JNI (C++ side) bindings; a Kotlin sketch of this boundary follows the list.
- The `app` module contains the application logic and UI code. Whenever a new chat is opened, the app instantiates the `SmolLM` class and provides it the model file path, which is stored in the `LLMModel` entity in ObjectBox. Next, the app adds messages with the roles `user` and `system` to the chat by retrieving them from the database and using `LLMInference::addChatMessage`.
- For tasks, the messages are not persisted; we inform `LLMInference` by passing `_storeChats=false` to `LLMInference::loadModel`.
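
For context, here is a minimal sketch of how the NDK build could be wired into the module's `build.gradle.kts`. The CMake path and ABI list below are assumptions for illustration, not the project's actual configuration:

```kotlin
// Module-level build.gradle.kts (sketch, not the project's actual file).
android {
    defaultConfig {
        ndk {
            // Assumed ABI list: 64-bit targets where llama.cpp performs well.
            abiFilters += listOf("arm64-v8a", "x86_64")
        }
    }
    externalNativeBuild {
        cmake {
            // Assumed location: a CMakeLists.txt that add_subdirectory()'s the
            // llama.cpp submodule and compiles the JNI binding (smollm.cpp)
            // into a shared library loadable from Kotlin.
            path = file("src/main/cpp/CMakeLists.txt")
        }
    }
}
```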
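
The Kotlin/JNI boundary could then look roughly like the sketch below. `SmolLM`, `loadModel`, `addChatMessage`, and the `storeChats` flag come from the description above; the native method names, signatures, and parameters such as `minP` and `temperature` are assumptions for illustration:

```kotlin
// Sketch of a Kotlin wrapper over the JNI bindings in smollm.cpp.
// Native names and signatures are assumed, not taken from the project.
class SmolLM {
    companion object {
        init {
            // Load the shared library built from smollm.cpp / llm_inference.cpp.
            System.loadLibrary("smollm")
        }
    }

    private var nativeHandle: Long = 0L

    // Mirrors LLMInference::loadModel; storeChats=false is used for tasks,
    // whose messages are never persisted.
    fun loadModel(modelPath: String, minP: Float, temperature: Float, storeChats: Boolean) {
        nativeHandle = loadModelNative(modelPath, minP, temperature, storeChats)
    }

    // Mirrors LLMInference::addChatMessage with roles such as "user" or "system".
    fun addChatMessage(message: String, role: String) =
        addChatMessageNative(nativeHandle, message, role)

    fun getResponse(query: String): String = getResponseNative(nativeHandle, query)

    fun close() = closeNative(nativeHandle)

    private external fun loadModelNative(
        modelPath: String, minP: Float, temperature: Float, storeChats: Boolean
    ): Long
    private external fun addChatMessageNative(handle: Long, message: String, role: String)
    private external fun getResponseNative(handle: Long, query: String): String
    private external fun closeNative(handle: Long)
}
```

Opening a chat then reduces to calling `loadModel(...)` with `storeChats = true` and replaying the persisted messages through `addChatMessage`, while tasks pass `storeChats = false` so nothing is written back to the database.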
- ggerganov/llama.cpp is a pure C/C++ framework for executing machine-learning models on multiple execution backends. It provides a primitive C-style API to interact with LLMs converted to the GGUF format native to ggml/llama.cpp. The app uses JNI bindings to interact with a small class, `smollm.cpp`, which uses llama.cpp to load and execute GGUF models.
- ObjectBox is an on-device, high-performance NoSQL database with bindings available in multiple languages. The app uses ObjectBox to store the model, chat, and message metadata (see the entity sketch after this list).
- noties/Markwon is a Markdown rendering library for Android. The app uses Markwon and Prism4j (for code syntax highlighting) to render Markdown responses from the SLMs; a short usage sketch follows.
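
To make the storage layer concrete, here is a hedged sketch of what an ObjectBox entity such as `LLMModel` could look like in Kotlin. Only the entity name and the stored file path are taken from the text above; every other field is an assumption:

```kotlin
import io.objectbox.annotation.Entity
import io.objectbox.annotation.Id

// Sketch of a model-metadata entity; the project's real LLMModel
// entity may declare different fields.
@Entity
data class LLMModel(
    @Id var id: Long = 0,           // primary key assigned by ObjectBox
    var name: String = "",          // display name of the GGUF model (assumed field)
    var path: String = "",          // file path handed to SmolLM on chat open
    var temperature: Float = 0.8f,  // per-model inference parameters (assumed)
    var minP: Float = 0.05f
)

// Typical ObjectBox usage: obtain a Box from the BoxStore, then put/get.
// val box = store.boxFor(LLMModel::class.java)
// val id = box.put(LLMModel(name = "SmolLM2", path = "/data/.../model.gguf"))
```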
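
Rendering with Markwon takes only a few lines; `renderResponse` below is a hypothetical helper, while `Markwon.create` and `setMarkdown` are the library's standard entry points:

```kotlin
import android.widget.TextView
import io.noties.markwon.Markwon

// Render an SLM's Markdown response into a TextView. Syntax highlighting
// via Prism4j would be added as a plugin through Markwon.builder(...).
fun renderResponse(textView: TextView, markdown: String) {
    val markwon = Markwon.create(textView.context)
    markwon.setMarkdown(textView, markdown)
}
```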







