Citations should guide readers to exact evidence, not just point to entire papers. Research today suffers from widespread citation inaccuracies and the challenge of locating specific supporting content within referenced documents.
SemanticCite transforms citation verification by analysing complete source documents and providing nuanced classification through four categories: Supported, Partially Supported, Unsupported, and Uncertain. Beyond simple validation, the system delivers detailed reasoning, confidence scores, and evidence reference snippets that show researchers exactly how their claims connect to the supporting literature.
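For example, a single check returns a classification, a confidence score, reasoning, and the evidence snippets retrieved from the source. The structure below is only illustrative: the `classification`, `reasoning`, and `metadata['confidence_score']` fields follow the Python API shown later, the `evidence` field name and all values are invented for demonstration.

```python
# Illustrative result shape (values invented; "evidence" key is an assumption)
result = {
    "classification": "Partially Supported",
    "reasoning": "The reference reports an AMOC decline over 2004-2012, but "
                 "attributes part of it to internal variability rather than a trend.",
    "metadata": {"confidence_score": 0.78},
    "evidence": [
        "We find a weakening of the AMOC of about 15% between 2004 and 2012 ...",
    ],
}
```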
- Full-text document analysis (not just abstracts)
- 4-class classification: Supported, Partially Supported, Unsupported, Uncertain
- Evidence-based reasoning with relevant text snippets
- Fine-tuned Qwen3 models (1.7B & 4B parameters)
- Performance comparable to GPT-4 with 100x fewer resources
- Local deployment option for privacy
- Dense vector search + sparse BM25 matching
- Neural reranking with FlashRank
- Optimized for accuracy and efficiency
- Web interface (Streamlit)
- Python API
- Local/cloud deployment
- Enterprise licensing available
```bash
# Clone and setup
git clone https://github.com/your-org/SemanticCite
cd SemanticCite
conda env create -f environment.yaml
conda activate cite

# Set API keys (copy from template)
cp .env.example .env
# Edit .env with your API keys (see Environment Setup below)

# Run web interface
streamlit run src/app.py
# Visit http://localhost:8501
```

Alternatively, install the package via pip:

```bash
pip install semanticcite
```

Create a `.env` file from the template and add your API keys:

```bash
cp .env.example .env
```

Edit `.env` with the providers you plan to use:
```bash
# For OpenAI models (LLM and/or embeddings)
OPENAI_API_KEY=sk-...

# For Claude models
ANTHROPIC_API_KEY=sk-ant-...

# For Gemini models
GEMINI_API_KEY=AIza...

# Optional: For custom endpoints
NVIDIA_API_KEY=nvapi-...
```

Note: Only the API keys for your chosen providers are required. For fully local deployment with Ollama, no API keys are needed.
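As a minimal sketch of how these keys can be read at runtime, assuming the standard python-dotenv package (the project's own configuration loading may differ):

```python
import os
from dotenv import load_dotenv  # pip install python-dotenv

# Load variables from .env into the process environment
load_dotenv()

# Only the key for your selected provider needs to be present
openai_key = os.getenv("OPENAI_API_KEY")
if openai_key is None:
    raise RuntimeError("OPENAI_API_KEY not set - add it to your .env file")
```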
On first run, SemanticCite will automatically download required models. Download times vary based on your configuration:
Cloud Providers (OpenAI/Claude/Gemini):
- FlashRank reranking model: ~150MB (~30-60 seconds)
- Local embeddings (if selected): ~400MB-1GB (~1-3 minutes)
Local Deployment (Ollama):
- FlashRank reranking model: ~150MB (~30-60 seconds)
- SemanticCite-Refiner-Qwen3-1B: ~1GB (~2-4 minutes)
- SemanticCite-Checker-Qwen3-4B: ~2.5GB (~4-8 minutes)
- Local embeddings: ~400MB-1GB (~1-3 minutes)
Total first-run setup time: 2-15 minutes depending on configuration
After initial setup, typical analysis times:
- Cloud providers: 5-15 seconds per citation
- Local models (Ollama): 10-30 seconds per citation (first analysis may take longer as models load into memory)
- Document Processing (2-5s): Splits reference into chunks, creates vector embeddings
- Claim Extraction (1-3s): Extracts core claim from citation text
- Retrieval (1-2s): Finds relevant chunks using hybrid search (BM25 + dense vectors)
- Reranking (1-2s): Reorders chunks by semantic relevance
- Classification (2-5s): Analyses support level and generates reasoning
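A minimal sketch of the retrieval and reranking stages above, assuming the rank_bm25, sentence-transformers, and flashrank packages; the chunks, fusion weights, and model choices here are illustrative rather than the project's internal implementation:

```python
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer, util
from flashrank import Ranker, RerankRequest

# Chunks produced by the document-processing stage (shortened examples)
chunks = [
    "We observe a weakening of the AMOC of roughly 15% between 2004 and 2012.",
    "Methods: transport estimates are derived from the RAPID mooring array at 26N.",
    "Discussion of Pacific decadal variability unrelated to the Atlantic overturning.",
]
claim = "A decline in the AMOC has been observed over the period 2004-2012"

# Sparse retrieval: BM25 over whitespace-tokenised chunks
bm25 = BM25Okapi([c.lower().split() for c in chunks])
sparse = bm25.get_scores(claim.lower().split())

# Dense retrieval: cosine similarity between claim and chunk embeddings
encoder = SentenceTransformer("all-mpnet-base-v2")
dense = util.cos_sim(encoder.encode(claim), encoder.encode(chunks))[0]

# Naive score fusion (illustrative only); keep the two best chunks
shortlist = sorted(range(len(chunks)),
                   key=lambda i: 0.5 * float(sparse[i]) + 0.5 * float(dense[i]),
                   reverse=True)[:2]

# Neural reranking of the shortlist with FlashRank
ranker = Ranker(model_name="ms-marco-MultiBERT-L-12")
reranked = ranker.rerank(RerankRequest(
    query=claim,
    passages=[{"id": i, "text": chunks[i]} for i in shortlist],
))
for passage in reranked:
    print(f"{passage['score']:.3f}  {passage['text'][:60]}")
```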
After completing installation, launch the web interface:
```bash
streamlit run src/app.py
# Visit http://localhost:8501
```

Features:
- Upload files (PDF, TXT, Markdown) or download from URLs
- Choose LLM providers: OpenAI, Claude, Gemini, or Local (Ollama)
- Multiple embedding options: Local SentenceTransformers, OpenAI, Custom endpoint
- Optional metadata input for enhanced context
- Interactive results with reasoning and evidence snippets in collapsible expanders
- Export results to Markdown format for documentation
```python
# Basic usage
from src.citecheck import ReferenceChecker

# Initialize with default OpenAI models
checker = ReferenceChecker()

# Or configure specific providers
checker = ReferenceChecker(
    llm_provider="openai",
    llm_config={
        "model": "gpt-4.1-mini",
        "temperature": 0.7
    },
    embedding_provider="local",
    embedding_config={
        "model_name": "all-mpnet-base-v2"
    }
)

# Check a citation
result = checker.check_citation(
    citation="Your citation text here",
    reference_text="Reference document text",
    metadata="Optional document metadata"
)

print(f"Classification: {result['classification']}")
print(f"Confidence: {result['metadata']['confidence_score']}")
print(f"Reasoning: {result['reasoning']}")
```

```bash
# Basic CLI usage
python src/citecheck.py \
  --citation "Over the period 2004–2012, a decline in the AMOC has been observed" \
  --reference "path/to/reference.pdf"

# Interactive mode (prompts for inputs)
python src/citecheck.py
```

Powered by LiteLLM, supporting 100+ AI providers including OpenAI, Claude, Gemini, and local endpoints via Ollama.
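As a rough illustration of what LiteLLM's provider routing looks like, here is a sketch using LiteLLM's public completion API (not SemanticCite's internal calls); the model names are examples only:

```python
from litellm import completion

# The same call signature works across providers; only the model string changes
for model in ["gpt-4.1-mini", "claude-3-5-haiku-20241022", "ollama/llama3"]:
    response = completion(
        model=model,
        messages=[{"role": "user", "content": "Does the passage support the claim?"}],
    )
    print(model, "->", response.choices[0].message.content[:80])
```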
- Local: SentenceTransformers models (`all-mpnet-base-v2`, `Qwen/Qwen3-Embedding-0.6B`)
- OpenAI: `text-embedding-3-small`, `text-embedding-ada-002`
- Custom Endpoint: Any OpenAI-compatible embedding API
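A minimal sketch of generating local embeddings with SentenceTransformers, using one of the listed models directly (this is the underlying library, not SemanticCite's wrapper):

```python
from sentence_transformers import SentenceTransformer

# Either listed local model works; all-mpnet-base-v2 produces 768-dim vectors
model = SentenceTransformer("all-mpnet-base-v2")

chunks = ["The AMOC declined between 2004 and 2012.",
          "Observations come from the RAPID mooring array."]
embeddings = model.encode(chunks)
print(embeddings.shape)  # (2, 768)
```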
Need to verify entire documents automatically? Visit semanticcite.com for tailored solutions:
- Complete Citation System: Automatic document processing with citation extraction and verification of all references in one workflow
- Batch Processing: Verify hundreds or thousands of citations efficiently with automated pipelines
- API Integration: RESTful API for seamless integration into editorial and publishing workflows
- On-premise Deployment: Secure, private installation with custom model training on your domain
- Hybrid Retrieval: BM25 + Dense Vector Search
- Reranking: FlashRank neural reranking
- Classification: Fine-tuned Qwen3 models
- Frontend: Streamlit web interface
- Storage: ChromaDB vector database
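For the storage component, a minimal sketch of holding reference chunks in ChromaDB using the standard chromadb client API; the collection name and documents are illustrative, not the project's internal schema:

```python
import chromadb

client = chromadb.Client()  # in-memory; use chromadb.PersistentClient for disk storage
collection = client.create_collection("reference_chunks")

# Store chunks; ChromaDB embeds them with its default embedding function
collection.add(
    ids=["chunk-0", "chunk-1"],
    documents=["The AMOC declined between 2004 and 2012.",
               "Methods: RAPID array observations at 26N."],
)

# Retrieve the chunks most similar to the extracted claim
results = collection.query(query_texts=["decline in the AMOC 2004-2012"], n_results=2)
print(results["documents"][0])
```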
For cloud providers (OpenAI, Claude, Gemini), a single model handles both claim extraction and classification:
- Model: Provider-specific (e.g., GPT-4, Claude Sonnet, Gemini Flash)
- Embedding: Local SentenceTransformers or OpenAI embeddings
- Supported Formats: PDF, TXT, Markdown
For local deployment with Ollama, two specialized models work together:
- Preprocessing Model: `SemanticCite-Refiner-Qwen3-1B` (extracts core claims from citations)
- Classification Model: `SemanticCite-Checker-Qwen3-4B` (analyses support level)
- Embedding Model: Local SentenceTransformers (e.g., `Qwen/Qwen3-Embedding-0.6B`)
- Advantage: Optimized models with better performance and lower resource usage
The SemanticCite models are available on Hugging Face and can be used with Ollama:
Models:
- SemanticCite-Refiner-Qwen3-1B - Claim extraction (1.7B parameters)
- SemanticCite-Checker-Qwen3-4B - Citation verification (4B parameters)
Installation:
- Install Ollama from ollama.ai
- Download the models:
  ```bash
  ollama pull sebsigma/semanticcite-refiner-qwen3-1b
  ollama pull sebsigma/semanticcite-checker-qwen3-4b
  ```
- Verify installation:
  ```bash
  ollama list
  ```
- In the Streamlit interface, select "Local Ollama" as your LLM provider
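To confirm the pulled models respond before launching the interface, you can optionally query Ollama's local HTTP API directly. A minimal sketch using the standard `/api/generate` endpoint (the prompt is only a placeholder; the first call also loads the model into memory, so it may take a while):

```python
import json
import urllib.request

# Ollama serves a local HTTP API on port 11434 by default
payload = {
    "model": "sebsigma/semanticcite-checker-qwen3-4b",
    "prompt": "Reply with OK if you are loaded.",
    "stream": False,
}
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```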
```bash
# Run test suite
python tests/run_tests.py

# Test specific functionality
python tests/test_citecheck.py
```

Problem: Analysis fails with "Connection timed out after 120.0 seconds" when using local Ollama models.
Solutions:
- Increase timeout in code: The default timeout is 120 seconds. For slower systems, this may be insufficient.
- Check Ollama is running: `curl http://localhost:11434/api/tags`
- Verify models are installed: `ollama list`
- Test model loading time: `python tests/test_ollama_diagnostics.py`
Problem: Analysis fails with "API key required" error.
Solution: Ensure you've:
- Created a `.env` file from `.env.example`
- Added the correct API key for your selected provider
- Restarted the Streamlit app after adding keys
Problem: First run fails during FlashRank model download.
Solutions:
- Check internet connection
- Retry - the download will resume from where it left off
- Manually download and cache the model:
  ```python
  from flashrank import Ranker
  ranker = Ranker(model_name="ms-marco-MultiBERT-L-12")
  ```
Problem: PDF file upload returns an error.
Solutions:
- Verify the PDF is not corrupted or password-protected
- Try converting to TXT format first
- Check file size is reasonable (<50MB recommended)
Problem: System crashes or becomes unresponsive during analysis.
Solutions:
- Close other applications to free up RAM
- Use cloud providers instead of local models
- Reduce chunk size in processing parameters
- Process one citation at a time
Problem: Analysis completes but shows no evidence chunks.
This is normal behaviour when:
- The citation is not well-supported by the reference document
- The reference document doesn't contain relevant information
- The citation refers to a different section/paper
To investigate:
- Check if you uploaded the correct reference document
- Verify the citation actually refers to this paper
- Try adjusting the relevance threshold (advanced configuration)
If you encounter issues not covered here:
- Check GitHub Issues
- Review logs in the `logs/` directory
- Open a new issue with:
- Error message
- Configuration details (LLM/embedding providers)
- Steps to reproduce
This project is licensed under the MIT License - see the LICENSE file for details.
If you use SemanticCite in your research, please cite our paper:
```bibtex
@article{semanticcite2025,
  title={SemanticCite: Citation Verification with AI-Powered Full-Text Analysis and Evidence-Based Reasoning},
  author={Sebastian Haan},
  journal={ArXiv Preprint},
  year={2025},
  url={https://arxiv.org/abs/2511.16198}
}
```

- Built with LangChain, LiteLLM, and Streamlit
- Models fine-tuned on Qwen3 architecture using Unsloth
- Vector search powered by ChromaDB
- Neural reranking via FlashRank
- Supported by the Sydney Informatics Hub at the University of Sydney
SemanticCite - Enhancing research quality through AI-powered citation verification and insight

