grupo-avispa/rag_ros

rag_ros

ROS 2 wrapper for Retrieval-Augmented Generation (RAG) systems, providing integration with LangChain and LangGraph for intelligent question-answering and document retrieval capabilities.

Overview

This package provides a ROS 2 service node for RAG (Retrieval-Augmented Generation) operations. It uses a Chroma vector store with HuggingFace embeddings for semantic search and document retrieval. The node exposes services for storing and retrieving documents, and automatically captures ROS 2 log messages from the /rosout topic for storage in the database.

Features:

  • Hybrid Search: Combines semantic search (vector similarity) with BM25 keyword search via EnsembleRetriever
  • Semantic search using Chroma vector store and HuggingFace embeddings
  • ROS 2 service interface for document retrieval and storage
  • Log message storage from /rosout topic with automatic metadata extraction
  • Flexible configuration via ROS 2 parameters
  • Support for metadata-rich document storage and filtering
  • Configurable embedding models and search strategies
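The log-capture feature can be pictured as a mapping from a /rosout message to the Document layout described under Services. The following is a minimal sketch, not the package's actual code: `LogRecord` is a hypothetical stand-in for rcl_interfaces/msg/Log so the example runs without ROS 2, and the choice of which log field feeds which metadata key (including the fixed "/rosout" source) is an assumption.

```python
from dataclasses import dataclass

# Severity values defined by rcl_interfaces/msg/Log.
LEVEL_NAMES = {10: "DEBUG", 20: "INFO", 30: "WARN", 40: "ERROR", 50: "FATAL"}


@dataclass
class LogRecord:
    """Hypothetical stand-in for rcl_interfaces/msg/Log (subset of its fields)."""
    level: int
    name: str      # logger name, usually the node name
    msg: str
    function: str  # function that emitted the log


def log_to_document(log, doc_id):
    """Map a log record to the Document layout used by store_document."""
    return {
        "id": doc_id,
        "content": log.msg,
        "metadata": {
            "source": "/rosout",  # assumption: the topic is recorded as the source
            "node_name": log.name,
            "node_function": log.function,
            "log_level": LEVEL_NAMES.get(log.level, "UNKNOWN"),
        },
    }


record = LogRecord(level=40, name="my_node", msg="Connection lost", function="on_timer")
print(log_to_document(record, doc_id=1))
```

Storing the level as a name rather than a number keeps it aligned with the log_level filter values (DEBUG, INFO, WARN, ERROR, FATAL) accepted by retrieve_documents.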

Keywords: ROS2, RAG, LangChain, Vector Store, Semantic Search

Author: Alberto Tudela

The rag_ros package has been tested under ROS 2 Jazzy on Ubuntu 24.04. This is research code: expect it to change often, and any fitness for a particular purpose is disclaimed.

Installation

Building from Source

Dependencies

Building

To build from source, clone the latest version from the repository into your colcon workspace and compile the package using:

cd colcon_workspace/src
git clone https://github.com/grupo-avispa/rag_ros.git
cd ../
rosdep install -i --from-path src --rosdistro jazzy -y
colcon build --symlink-install

Usage

Run the RAG service node with:

ros2 launch rag_ros default.launch.py

Nodes

rag_node

ROS 2 service node for RAG operations.

Services

  • retrieve_documents (llm_interactions_msgs/srv/RetrieveDocuments)

    Retrieve relevant documents from the vector database based on a query with optional filtering.

    Request:

    • query (string): The input query to retrieve relevant documents
    • k (int32): Number of documents to retrieve (default: 8)
    • filters (string): Optional metadata filters as JSON string. Supported filter keys: source, node_name, node_function, log_level

    Response:

    • status (string): Response status
    • total_results (int32): Total number of documents retrieved
    • results (Document[]): Array of retrieved documents with the following structure:
      • id (int32): Unique identifier for the document
      • content (string): Text content of the document
      • metadata (Metadata): Metadata associated with the document
        • source (string): Source or origin of the document
        • node_name (string): Name of the node that processed the document
        • node_function (string): Function of the node that processed the document
        • log_level (string): Log level of the message (DEBUG, INFO, WARN, ERROR, FATAL)
  • store_document (llm_interactions_msgs/srv/StoreDocument)

    Store a new document in the vector database.

    Request:

    • document (Document): Document to store with the following structure:
      • id (int32): Unique identifier for the document
      • content (string): Text content to store
      • metadata (Metadata): Metadata associated with the document
        • source (string): Source or origin of the document
        • node_name (string): Name of the node processing the document
        • node_function (string): Function of the node processing the document
        • log_level (string): Log level of the message (DEBUG, INFO, WARN, ERROR, FATAL)

    Response:

    • success (bool): Operation success status
    • message (string): Status message
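The services above can also be called from Python. The sketch below shows one way to do it with rclpy, using only the request fields documented above. It is an illustration under assumptions: the node name `rag_client_example` is hypothetical, and sending an empty filters string to mean "no filter" is a guess about the node's behavior. ROS imports are done lazily so that `encode_filters` can be used and tested without a ROS 2 installation.

```python
import json

# Filter keys listed in the retrieve_documents request description.
SUPPORTED_FILTER_KEYS = {"source", "node_name", "node_function", "log_level"}


def encode_filters(**filters):
    """Serialize metadata filters to the JSON string carried by the request."""
    unknown = set(filters) - SUPPORTED_FILTER_KEYS
    if unknown:
        raise ValueError(f"Unsupported filter keys: {sorted(unknown)}")
    return json.dumps(filters)


def retrieve(query, k=8, **filters):
    """Call retrieve_documents once and return the response (requires ROS 2)."""
    # Imported here so the helper above works without ROS 2 installed.
    import rclpy
    from llm_interactions_msgs.srv import RetrieveDocuments

    rclpy.init()
    node = rclpy.create_node("rag_client_example")  # hypothetical node name
    try:
        client = node.create_client(RetrieveDocuments, "retrieve_documents")
        if not client.wait_for_service(timeout_sec=5.0):
            raise RuntimeError("retrieve_documents service is not available")
        request = RetrieveDocuments.Request()
        request.query = query
        request.k = k
        # Assumption: an empty string means "no filters".
        request.filters = encode_filters(**filters) if filters else ""
        future = client.call_async(request)
        rclpy.spin_until_future_complete(node, future)
        return future.result()
    finally:
        node.destroy_node()
        rclpy.shutdown()


print(encode_filters(log_level="ERROR", node_name="my_node"))
```

Building the filters string with `json.dumps` avoids the manual quote escaping needed on the command line, and rejecting unknown keys early surfaces typos before the request is sent.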

Parameters

  • chroma_directory (string, default: "./chroma_db")

    Directory where Chroma vector database persistence data will be stored.

  • embedding_model (string, default: "sentence-transformers/all-MiniLM-L6-v2")

    HuggingFace embedding model to use for semantic search.

  • top_k (int, default: 8)

    Default number of documents to retrieve per query.

  • use_hybrid_search (bool, default: true)

    Enable hybrid search combining semantic search (vector similarity) with BM25 keyword-based search. When enabled, uses EnsembleRetriever with equal weights (50% semantic + 50% BM25) for more comprehensive document retrieval.
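To give an intuition for how two rankings are merged with equal weights, here is a small pure-Python sketch of weighted Reciprocal Rank Fusion, the fusion scheme that LangChain's EnsembleRetriever is understood to apply. It is not the package's code: the document ids and the constant `c=60` are illustrative assumptions.

```python
from collections import defaultdict


def weighted_rrf(rankings, weights, c=60):
    """Fuse ranked lists of document ids with weighted reciprocal rank fusion."""
    if len(rankings) != len(weights):
        raise ValueError("need exactly one weight per ranking")
    scores = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        # Rank 1 is the best hit; lower-ranked hits contribute less.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (rank + c)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)


semantic = ["doc_a", "doc_b", "doc_c"]  # hypothetical vector-similarity ranking
keyword = ["doc_b", "doc_d", "doc_a"]   # hypothetical BM25 ranking
print(weighted_rrf([semantic, keyword], weights=[0.5, 0.5]))
# → ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Documents found by both retrievers (doc_a, doc_b) rise to the top, which is the practical benefit of hybrid search over either method alone.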

Example Usage

Retrieve Documents

# Basic retrieval
ros2 service call /retrieve_documents llm_interactions_msgs/srv/RetrieveDocuments "{query: 'machine learning', k: 5}"

# Retrieval with log level filter
ros2 service call /retrieve_documents llm_interactions_msgs/srv/RetrieveDocuments "{query: 'error', k: 5, filters: '{\"log_level\": \"ERROR\"}'}"

# Retrieval with multiple filters
ros2 service call /retrieve_documents llm_interactions_msgs/srv/RetrieveDocuments "{query: 'database', k: 5, filters: '{\"log_level\": \"ERROR\", \"node_name\": \"my_node\"}'}"

Store Document

ros2 service call /store_document llm_interactions_msgs/srv/StoreDocument "{document: {id: 1, content: 'Machine learning is a subset of artificial intelligence', metadata: {source: 'example.txt', node_name: 'example_node', node_function: 'process', log_level: 'INFO'}}}"

Configuration

You can customize the RAG service behavior by passing parameters to the launch file:

# Basic configuration with custom top_k and Chroma directory
ros2 launch rag_ros default.launch.py chroma_directory:=/path/to/chroma top_k:=10

# With custom embedding model
ros2 launch rag_ros default.launch.py embedding_model:='sentence-transformers/all-mpnet-base-v2'

# Disable hybrid search (enabled by default; pass true to re-enable)
ros2 launch rag_ros default.launch.py use_hybrid_search:=false

# Full configuration example
ros2 launch rag_ros default.launch.py \
  chroma_directory:=/path/to/chroma \
  embedding_model:='sentence-transformers/all-MiniLM-L6-v2' \
  top_k:=5 \
  use_hybrid_search:=true
