A simple web application hosted on Hugging Face Spaces that uses a Small Language Model (Google's Flan-T5-large) to answer basic questions about diabetes, based on a provided dataset.
Disclaimer: This application is for informational and demonstration purposes only. It does NOT provide medical advice. Always consult a qualified healthcare professional about any health concerns and before making any decisions related to your health or treatment.
You can try the live application hosted on Hugging Face Spaces: Link
- Provides answers to common questions about diabetes.
- Utilizes a fine-tuned Small Language Model (SLM) for question answering.
- Simple and intuitive web interface (likely using Gradio or Streamlit).
- Based on the medical_qa.csv dataset.
This application follows a simple workflow:
- User Input: The user types a diabetes-related question into the web interface.
- Frontend (app.py): The UI framework (Gradio/Streamlit) running via app.py captures the input.
- Backend Logic (model_runner.py): The input is processed, potentially by helper functions in model_runner.py.
- SLM Inference: The processed query is fed into the fine-tuned Small Language Model.
- Response Generation: The SLM generates an answer based on its training data.
- Output Display: The generated answer is sent back to app.py and displayed to the user in the web interface.
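The workflow above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the function names `preprocess` and `answer_question`, the length limit, and the injected `generate` callable are hypothetical and are not taken from app.py or model_runner.py.

```python
from typing import Callable

MAX_QUESTION_CHARS = 500  # assumed limit, not taken from the repository


def preprocess(question: str) -> str:
    """Collapse runs of whitespace and cap the length (hypothetical helper)."""
    cleaned = " ".join(question.split())
    if not cleaned:
        raise ValueError("Question must not be empty.")
    return cleaned[:MAX_QUESTION_CHARS]


def answer_question(question: str, generate: Callable[[str], str]) -> str:
    """Run one request through the steps: preprocess, infer, tidy the output."""
    query = preprocess(question)   # backend logic step
    raw_answer = generate(query)   # SLM inference and response generation
    return raw_answer.strip()      # text handed back to the UI for display


if __name__ == "__main__":
    # A stand-in for the real model so the sketch runs without any weights.
    def fake_model(query: str) -> str:
        return f" (stub answer for: {query}) "

    print(answer_question("  What is   type 2 diabetes? ", fake_model))
```

Passing the model in as a callable keeps the UI code independent of how inference is implemented, which also makes the flow easy to test with a stub.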
To use the application:
- Navigate to the Hugging Face Space URL: LINK
- Wait for the application to load.
- Enter your question about diabetes in the provided text input field.
- Click the "Ask" button.
- The model's answer will appear in the output area.
This model was trained or fine-tuned using the medical_qa.csv dataset included in this repository.
- Source: https://www.nhs.uk/conditions/diabetes/
- Content: The dataset contains pairs of medical questions and answers, focused on diabetes.
- Format: CSV.
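A minimal sketch of reading question/answer pairs from a CSV file like this one. The column names `question` and `answer` are assumptions, since the actual header of medical_qa.csv is not shown here, and the sample rows are made up for illustration.

```python
import csv
import io
from typing import Dict, List, TextIO


def load_qa_pairs(
    handle: TextIO,
    question_col: str = "question",  # assumed column name
    answer_col: str = "answer",      # assumed column name
) -> List[Dict[str, str]]:
    """Read question/answer rows, skipping rows with a missing field."""
    pairs = []
    for row in csv.DictReader(handle):
        question = (row.get(question_col) or "").strip()
        answer = (row.get(answer_col) or "").strip()
        if question and answer:
            pairs.append({"question": question, "answer": answer})
    return pairs


# Illustrative in-memory sample, not real rows from the dataset.
SAMPLE_CSV = """question,answer
What is diabetes?,A condition that causes blood sugar levels to become too high.
,This row has no question and is skipped.
"""

pairs = load_qa_pairs(io.StringIO(SAMPLE_CSV))
print(len(pairs), pairs[0]["question"])
```

To read the real file instead, open it with `open("medical_qa.csv", newline="", encoding="utf-8")` and pass the handle to `load_qa_pairs`, adjusting the column names to match the file's header.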
The core of this application is a Small Language Model (SLM).
- Type: Fine-tuned Flan-T5-large.
- Training: The model was fine-tuned using the train_slm.py script on the medical_qa.csv dataset to specialize in answering diabetes-related questions.
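For orientation, inference with a Flan-T5 checkpoint through the transformers library might look like the sketch below. It is not the code in model_runner.py: the prompt prefix and the function names are assumptions, and because the location of the fine-tuned weights is not stated here, the default points at the public base model `google/flan-t5-large`.

```python
from typing import Any, Tuple

# Public base checkpoint; swap in the fine-tuned checkpoint path if available.
BASE_MODEL = "google/flan-t5-large"
# Assumed prompt format; the format used during fine-tuning is not documented here.
PROMPT_PREFIX = "Answer the following question about diabetes: "


def build_prompt(question: str) -> str:
    """Prepend the (assumed) instruction prefix to the user's question."""
    return PROMPT_PREFIX + question.strip()


def load_model(checkpoint: str = BASE_MODEL) -> Tuple[Any, Any]:
    """Load the tokenizer and seq2seq model (downloads weights on first use)."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model


def generate_answer(
    question: str, tokenizer: Any, model: Any, max_new_tokens: int = 128
) -> str:
    """Tokenize the prompt, generate, and decode the first returned sequence."""
    inputs = tokenizer(
        build_prompt(question), return_tensors="pt", truncation=True, max_length=512
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Typical use would be `tokenizer, model = load_model()` once at startup, followed by `generate_answer(question, tokenizer, model)` for each request, so the weights are not reloaded on every question.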
- Language: Python
- ML Framework: PyTorch
- Core Libraries:
  - flask (for the web application/API)
  - torch (the core PyTorch library)
  - transformers (for working with Hugging Face models, here Flan-T5-large)
  - datasets (for data handling)
  - sentencepiece (for text tokenization)
  - accelerate (for simplifying multi-GPU/distributed training)
- Platform: Hugging Face Spaces
- Version Control: Git / Git LFS
- Containerization (Optional): Docker (if you used Dockerfile.dockerfile, remember to rename it to Dockerfile if using the Docker SDK on Spaces)
Limitations:
- NOT Medical Advice: This tool cannot replace professional medical consultation.
- Knowledge Scope: Answers are limited to the information present in the medical_qa.csv dataset and the SLM's training. It may not know about recent developments or highly specific cases.
- Accuracy: While fine-tuned, the SLM may still generate incorrect or nonsensical answers (hallucinations).
- Basic Understanding: The model may struggle with very complex, nuanced, or poorly phrased questions.
Created by subratomandal - https://github.com/subratomandal