We are excited to announce that our paper "Hierarchical Patch Compression for ColPali: Indexing and Retrieval with HPC-ColPali" has been accepted for presentation at the 17th International Conference on Knowledge Discovery and Information Retrieval (KDIR 2025) in Lisbon, Portugal!
Conference Details:
- Event: 17th International Conference on Knowledge Discovery and Information Retrieval (KDIR 2025)
- Location: Lisbon, Portugal
- Date: November 2025
- Proceedings: SCITEPRESS - Science and Technology Publications
- Indexing: SCOPUS, Google Scholar, DBLP, Semantic Scholar, CrossRef, and others
This repository implements HPC-ColPali, a novel hierarchical patch compression approach for document retrieval based on the ColQwen2.5 multilingual model. Our method introduces hierarchical patch-level embeddings and attention mechanisms to improve retrieval accuracy while maintaining computational efficiency.
Key Contributions:
- 🎯 Hierarchical Patch Compression: Novel approach to compress document patches while preserving semantic information
- 🌍 Multilingual Support: Based on ColQwen2.5-3B multilingual model for cross-lingual retrieval
- ⚡ Efficient Indexing: Dual-index strategy with HNSW and Product Quantization (PQ)
- 📊 Comprehensive Evaluation: Benchmark results on BEIR datasets
- Python 3.11+
- CUDA-compatible GPU (recommended)
- 8GB+ RAM
# Clone the repository
git clone https://github.com/your-username/HPC-ColPali.git
cd HPC-ColPali
# Install dependencies
pip install -r requirements.txt- Indexing Documents
# Place your PDF documents in src/data/
python src/ingest.py- Start Retrieval API
cd src
uvicorn query:app --host 0.0.0.0 --port 8000 --reload- Run Benchmarks
python src/benchmark.pyHPC-ColPali/
├── docs/
│ └── HPC_Copali_v2.pdf # Conference paper
├── src/
│ ├── data/ # Input PDF documents
│ ├── indexes/ # Generated indices
│ │ ├── faiss_hnsw.idx # HNSW index
│ │ ├── faiss_pq.idx # Product Quantization index
│ │ └── metadata.db # SQLite metadata
│ ├── ingest.py # Document indexing pipeline
│ ├── query.py # FastAPI retrieval service
│ ├── benchmark.py # BEIR benchmark evaluation
│ └── retriever_adapter.py # Retrieval adapter
├── requirements.txt # Python dependencies
└── README.md # This file
Our approach consists of three main components:
- Document Processing: PDF parsing and chunking with overlap
- Embedding Generation: Hierarchical patch-level embeddings using ColQwen2.5
- Indexing Strategy: Dual-index approach with HNSW and PQ for efficiency
- Base Model:
tsystems/colqwen2.5-3b-multilingual-v1.0 - Chunk Size: 500 characters with 50 character overlap
- Batch Size: 8 (configurable)
- Index Types: HNSW (32 neighbors) + Product Quantization (8 subspaces, 8 bits)
The FastAPI service provides:
POST /search: Document retrieval with configurable parametersquery: Search query texttop_k: Number of results (default: 5)prune_ratio: Attention pruning ratio (default: 1.0)
Results will be updated after conference publication
- Multilingual Retrieval: Support for multiple languages through ColQwen2.5
- Attention-Aware: Hierarchical attention mechanisms for better semantic understanding
- Scalable Indexing: Efficient dual-index strategy for large-scale datasets
- RESTful API: Easy integration with existing systems
- Comprehensive Evaluation: BEIR benchmark integration
If you use this code in your research, please cite our paper:
arXiv Version:
@misc{bach2025hierarchicalpatchcompressioncolpali,
title={Hierarchical Patch Compression for ColPali: Efficient Multi-Vector Document Retrieval with Dynamic Pruning and Quantization},
author={Duong Bach},
year={2025},
eprint={2506.21601},
archivePrefix={arXiv},
primaryClass={cs.IR},
url={https://arxiv.org/abs/2506.21601},
}Conference Version (KDIR 2025):
Not yet
This project is licensed under the MIT License - see the LICENSE file for details.
- ColQwen2.5 model by T-Systems
- FAISS library for efficient similarity search
- BEIR benchmark for evaluation framework
- KDIR 2025 conference committee
For questions or collaboration opportunities, please contact us or open an issue on GitHub.