MultiRAG is a multi-source retrieval-augmented generation framework for knowledge fusion and reasoning across heterogeneous sources. It achieves efficient and accurate multi-source knowledge fusion and reasoning through three components: Multi-source Line Graph (MLG) construction, Multi-level Confidence Calculation (MCC), and the Multi-source Knowledge Linear Graph Path (MKLGP) algorithm.
```
Multi-RAG/
├── data/                 # Stores raw datasets and preprocessed data
├── src/                  # Core code
│   ├── data_processing/  # Data preprocessing (format conversion, knowledge extraction)
│   ├── mka/              # Multi-source knowledge aggregation (MLG construction, subgraph matching)
│   ├── mcc/              # Multi-level confidence calculation
│   ├── mklgp/            # MKLGP algorithm implementation
│   └── evaluation/       # Evaluation metric calculation
├── experiments/          # Experiment configurations and result files
├── requirements.txt      # Dependency configuration
└── README.md             # Project description
```
- CPU: 8 cores or more (Intel i7/Ryzen 7 or higher)
- Memory: 32GB (64GB+ recommended to avoid OOM when processing large multi-source data)
- Storage: 100GB+ (for storing datasets, model weights, and preprocessed files)
- Python 3.10
- CUDA 11.6
- PyTorch 2.0.1
- Transformers 4.37.2
- Other dependencies listed in requirements.txt
```shell
# Download FusionDatasets
wget https://lunadong.com/fusiondatasets
unzip fusiondatasets -d data/raw/

# Download HotpotQA validation set
wget https://raw.githubusercontent.com/hotpotqa/hotpotqa/master/hotpot_dev_distractor_v1.json
mkdir -p data/raw/hotpotqa
mv hotpot_dev_distractor_v1.json data/raw/hotpotqa/

# Clone 2WikiMultiHopQA repository
git clone https://github.com/Alab-NII/2wikimultihop.git
mkdir -p data/raw/2wikimultihop
cp 2wikimultihop/data/dev.json data/raw/2wikimultihop/
rm -rf 2wikimultihop

# Preprocess data and run MultiRAG
python src/data_processing/format_converter.py
python src/data_processing/knowledge_extractor.py
python experiments/run_multirag.py
```
- MLG Construction: Represents multi-source knowledge as line graphs, where nodes are triples and edges are shared entities
- Subgraph Matching: Matches homologous subgraphs (SVs) and isolated vertices (LVs) based on query entities
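The MLG construction step above can be sketched as follows. This is a minimal illustration, not the project's actual implementation in `src/mka/`; the triple format `(head, relation, tail)` and the shared-entity rule are assumptions based on the description above.

```python
from itertools import combinations

def build_mlg(triples):
    """Build a multi-source line graph: each (head, relation, tail) triple
    becomes a node; two nodes are connected if they share an entity."""
    nodes = list(triples)
    edges = []
    for i, j in combinations(range(len(nodes)), 2):
        h1, _, t1 = nodes[i]
        h2, _, t2 = nodes[j]
        if {h1, t1} & {h2, t2}:  # shared entity -> edge in the line graph
            edges.append((i, j))
    return nodes, edges

triples = [
    ("Paris", "capital_of", "France"),
    ("France", "member_of", "EU"),
    ("Berlin", "capital_of", "Germany"),
]
nodes, edges = build_mlg(triples)
# triples 0 and 1 share the entity "France", so they are linked; triple 2 is isolated
```

In this line-graph view, an isolated vertex (no shared entities with any other triple) corresponds to what the list above calls an LV.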
- Graph-level Confidence: Calculated based on node similarity within subgraphs
- Node-level Confidence: Calculated based on consistency scores and authority scores
- Subgraph Filtering: Filters low-confidence subgraphs and nodes
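A hedged sketch of the two confidence levels and the filtering step described above (the weighting, the mean aggregation, and the 0.6 threshold are all illustrative choices, not the paper's exact formulas):

```python
def node_confidence(consistency, authority, alpha=0.5):
    """Node-level confidence as a weighted mix of consistency and
    authority scores (the weighting scheme here is illustrative)."""
    return alpha * consistency + (1 - alpha) * authority

def graph_confidence(node_scores):
    """Graph-level confidence aggregated from node scores within a subgraph
    (a simple mean; the similarity-based formula in the paper may differ)."""
    return sum(node_scores) / len(node_scores) if node_scores else 0.0

def filter_subgraphs(subgraphs, threshold=0.6):
    """Keep only subgraphs whose aggregate confidence clears the threshold."""
    return [sg for sg in subgraphs if graph_confidence(sg["scores"]) >= threshold]

subgraphs = [
    {"id": "sv1", "scores": [0.9, 0.8, 0.7]},  # mean 0.8 -> kept
    {"id": "sv2", "scores": [0.3, 0.4]},       # mean 0.35 -> filtered out
]
kept = filter_subgraphs(subgraphs)
```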
- Prompt Construction: Builds prompts based on filtered high-confidence subgraphs
- Answer Generation: Generates accurate answers using Llama3-8B-Instruct
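The prompt construction step might look like the sketch below, which serializes the filtered high-confidence subgraphs into a context block for the generator. The template wording and subgraph representation are assumptions; the actual prompt lives in the project code.

```python
def build_prompt(question, subgraphs):
    """Serialize high-confidence subgraphs (lists of triples) into a fact
    block and prepend it to the question (template wording is illustrative)."""
    lines = []
    for sg in subgraphs:
        for head, rel, tail in sg:
            lines.append(f"({head}, {rel}, {tail})")
    context = "\n".join(lines)
    return (
        "Answer the question using only the facts below.\n"
        f"Facts:\n{context}\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_prompt(
    "What country is Paris the capital of?",
    [[("Paris", "capital_of", "France")]],
)
```

The resulting prompt would then be passed to Llama3-8B-Instruct (e.g. via Hugging Face Transformers) for answer generation.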
- F1 Score: Measures the accuracy and completeness of answers
- Recall@K: Measures the proportion of queries for which a correct answer appears among the top K results
- Precision: Measures the proportion of correct parts in generated answers
- Recall: Measures the proportion of correct answers covered by generated answers
- Multi-source query datasets: F1 score ≥10% higher than baseline models
- HotpotQA: Recall@5 ≥62.7%, Precision ≥59.3%
- Efficiency: Query time one order of magnitude lower than baseline models
- Data Preparation: Ensure that datasets in the correct format are placed in the data/raw/ directory
- Model Weights: Llama3-8B-Instruct model weights need to be downloaded
- VRAM Requirements: 16GB+ VRAM GPU is recommended for processing large-scale data
- Path Settings: Ensure all file paths are set correctly, especially data and model paths
If you use this project in your research, please cite the relevant papers.
This project is licensed under the MIT License.
For questions or suggestions, please contact the project maintainers.