A scalable, modular, and memory-efficient text summarization system using facebook/bart-base, built with full MLOps-style pipelines for data ingestion, validation, transformation, training, evaluation, and deployment.
⚙️ Designed for research and real-world deployment with CLI, Streamlit frontend, and FastAPI backend support.
- Model: `facebook/bart-base` via Hugging Face
- Dataset: `cnn_dailymail` v3.0.0 (`abisee/cnn_dailymail`)
- Training Data Used: 50K samples (scalable to full 287K; see the loading sketch after this list)
- Validation: 3K (notebook) / 1K (pipeline) (out of 13.4K)
- Test: 3K (notebook) / 1K (pipeline) (out of 11.5K)
- Evaluation: ROUGE-1: 25.21, ROUGE-2: 12.28, ROUGE-L: 20.67
- Frameworks: PyTorch, Hugging Face Transformers + Accelerate
- Deployment: Dockerized, Streamlit UI, FastAPI backend
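For illustration, a minimal sketch of how these splits can be drawn with the `datasets` library (variable names are assumptions, not the pipeline's actual code):

```python
from datasets import load_dataset

# Load CNN/DailyMail v3.0.0 from the Hugging Face Hub.
dataset = load_dataset("abisee/cnn_dailymail", "3.0.0")

# Subsample to the pipeline sizes: 50K train, 1K validation, 1K test.
train_ds = dataset["train"].select(range(50_000))
val_ds = dataset["validation"].select(range(1_000))
test_ds = dataset["test"].select(range(1_000))
```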
Implemented as modular stages with YAML-config-driven execution:
- Data Ingestion → `cnn_dailymail` load and split
- Data Validation → schema check, format verification
- Data Transformation → tokenization and formatting (see the sketch after this list)
- Model Training → with `Seq2SeqTrainer`
- Model Evaluation → on a custom held-out test set
- Prediction → CLI or UI-based summarization
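The transformation stage pairs each article with its reference summary. A minimal sketch of what that tokenization step can look like (function name and length limits are assumptions):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")

def convert_examples_to_features(batch):
    # Articles are truncated to BART's 1024-token context window.
    model_inputs = tokenizer(batch["article"], max_length=1024, truncation=True)
    # Reference summaries ("highlights" in cnn_dailymail) become the labels.
    labels = tokenizer(text_target=batch["highlights"], max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

# tokenized = dataset.map(convert_examples_to_features, batched=True)
```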
```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./results",
    per_device_train_batch_size=2,   # small batch to fit a 4 GB GPU
    gradient_accumulation_steps=4,   # effective batch size of 8
    fp16=True,                       # mixed-precision training
    num_train_epochs=3,
    predict_with_generate=True,      # generate summaries during evaluation
)
```

Trained on a 4 GB RTX 3050 using Hugging Face Accelerate for speed and efficiency.
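A sketch of how these arguments might feed the training stage (dataset and collator names are assumptions; the actual wiring lives in the modular stage code):

```python
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer)

model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
# Dynamic per-batch padding keeps memory usage low on a 4 GB GPU.
collator = DataCollatorForSeq2Seq(tokenizer, model=model)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=train_ds,   # tokenized splits from the transformation stage
    eval_dataset=val_ds,
    data_collator=collator,
)
trainer.train()
```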
Train Loss: 0.9688 | Global Steps: 18,750
Test Loss: 0.9473 | ROUGE-1: 25.21 | ROUGE-2: 12.28 | ROUGE-L: 20.67
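A minimal sketch of computing ROUGE with the `evaluate` library (tooling assumed; the toy strings stand in for decoded model outputs and reference highlights):

```python
import evaluate

rouge = evaluate.load("rouge")
predictions = ["the cat sat on the mat"]       # decoded model summaries
references = ["a cat was sitting on the mat"]  # reference highlights
scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # keys: rouge1, rouge2, rougeL, rougeLsum
```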
Run training or prediction:

```bash
python main.py
```

Launch the Streamlit UI:

```bash
streamlit run frontend_app.py
```

Start the FastAPI backend:

```bash
uvicorn backend_app:app --reload
```
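A minimal sketch of what the backend endpoint could look like (route name, payload shape, and use of the base checkpoint via `pipeline` are assumptions, not the actual `backend_app.py`):

```python
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import pipeline

app = FastAPI()
# Assumption: in practice this would point at the fine-tuned checkpoint.
summarizer = pipeline("summarization", model="facebook/bart-base")

class Article(BaseModel):
    text: str

@app.post("/summarize")
def summarize(article: Article):
    # Generation lengths are illustrative defaults.
    result = summarizer(article.text, max_length=128, min_length=30)
    return {"summary": result[0]["summary_text"]}
```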
Build and run the container:

```bash
docker build -t summarizer-app .
docker run -p 8000:8000 summarizer-app
```

This is a solo project but open to contributors. Feel free to raise issues, suggest features, or submit PRs 🚀
MIT License
- Add support for `peft` / LoRA fine-tuning (see the sketch after this list)
- Implement memory-efficient attention (FlashAttention)
- Deploy to Hugging Face Spaces or Streamlit Cloud
- Add multi-lingual summarization support
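If `peft`/LoRA support lands, the fine-tuning setup might look roughly like this (hyperparameters and target modules are illustrative, not a committed design):

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")
# Adapt only BART's attention projections; values here are illustrative.
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="SEQ_2_SEQ_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction stays trainable
```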