CodingWithRoshan/VaidyaVerse

🩺 VaidyaVerse - AI-Powered Bilingual Doctor Assistant

VaidyaVerse is an AI-powered medical assistant that analyzes voice input and medical images to generate doctor-like responses in Hindi or English, matching the language of the speaker.

The system integrates Speech-to-Text, Large Language Models, and Text-to-Speech to simulate a conversational AI doctor experience.
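The three-stage flow described above can be sketched as a small pipeline. This is a minimal illustration, not the project's actual code: `run_consultation`, `ConsultationResult`, and the three callables are hypothetical names, and each stage is passed in as a plain function so that the Groq, gTTS, or ElevenLabs calls can be plugged in separately.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ConsultationResult:
    """Outputs of one voice consultation (hypothetical container)."""
    transcript: str   # text produced by the Speech-to-Text stage
    reply: str        # text produced by the LLM stage
    audio_path: str   # file path returned by the Text-to-Speech stage


def run_consultation(
    audio_path: str,
    transcribe: Callable[[str], str],
    reason: Callable[[str], str],
    speak: Callable[[str], str],
) -> ConsultationResult:
    """Chain the stages: speech -> text -> reply -> speech."""
    transcript = transcribe(audio_path)  # Speech-to-Text
    reply = reason(transcript)           # LLM response
    reply_audio = speak(reply)           # Text-to-Speech, returns a file path
    return ConsultationResult(transcript, reply, reply_audio)
```

Keeping the stages as injected functions makes it possible to swap gTTS for ElevenLabs, or to test the flow with stubs, without touching the pipeline itself.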


🚀 Features

  • 🎤 Speech-to-Text using Groq Whisper
  • 🧠 Medical reasoning using Groq LLM
  • 🖼 Image-based medical query support
  • 🌍 Supports Hindi (Devanagari) and English
  • 🔁 Automatic Urdu-to-Hindi script normalization
  • 🔊 Text-to-Speech using:
    • gTTS (Google Text-to-Speech)
    • ElevenLabs (Optional High-Quality Voice)
  • ⚡ Interactive UI built with Gradio

🛠 Tech Stack

  • Python
  • Groq API (Whisper + LLM)
  • ElevenLabs API (TTS)
  • gTTS
  • Gradio
  • Indic Transliteration
  • python-dotenv

📦 Installation

Clone the repository:

git clone https://github.com/CodingWithRoshan/VaidyaVerse.git
cd VaidyaVerse

Install dependencies:

pip install -r requirements.txt

Create a .env file in the root directory:

GROQ_API_KEY=your_groq_api_key
ELEVENLABS_API_KEY=your_elevenlabs_api_key
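A startup check for these keys might look like the sketch below. It assumes that `GROQ_API_KEY` is required and `ELEVENLABS_API_KEY` is optional (ElevenLabs is listed as optional in this README); the helper names `missing_keys` and `load_environment` are hypothetical and not taken from the project.

```python
import os
from typing import List, Mapping, Sequence

# Assumption: Groq is mandatory, ElevenLabs is optional.
REQUIRED_KEYS = ("GROQ_API_KEY",)
OPTIONAL_KEYS = ("ELEVENLABS_API_KEY",)


def missing_keys(env: Mapping[str, str],
                 names: Sequence[str] = REQUIRED_KEYS) -> List[str]:
    """Return the names that are absent or blank in the given mapping."""
    return [name for name in names if not env.get(name, "").strip()]


def load_environment() -> None:
    """Load .env into os.environ if python-dotenv is installed."""
    try:
        from dotenv import load_dotenv  # provided by python-dotenv
    except ImportError:
        return  # fall back to variables already set in the shell
    load_dotenv()


if __name__ == "__main__":
    load_environment()
    missing = missing_keys(os.environ)
    if missing:
        raise SystemExit("Missing required keys: " + ", ".join(missing))
```

Failing early with a clear message is usually easier to debug than an authentication error raised in the middle of a request.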

Run the application:

python gradio_app.py

🌍 Language Support

Supported Languages:

  • Hindi (Devanagari script)
  • English

Behavior:

  • If Hindi speech is detected → Hindi response + Hindi voice
  • If English speech is detected → English response + English voice
  • Transcripts that come back in Urdu script are automatically normalized to Hindi (Devanagari)

No other languages are supported.
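The language routing described above could be approximated by counting characters per Unicode block. This is only a sketch under that assumption; the project may detect language differently, and `detect_script`, `needs_script_normalization`, and `response_language` are hypothetical names. The actual Urdu-to-Devanagari conversion is not shown here.

```python
from typing import Optional


def detect_script(text: str) -> str:
    """Return the dominant script: 'devanagari', 'arabic', 'latin' or 'unknown'."""
    counts = {"devanagari": 0, "arabic": 0, "latin": 0}
    for ch in text:
        code = ord(ch)
        if 0x0900 <= code <= 0x097F:        # Devanagari block (Hindi)
            counts["devanagari"] += 1
        elif 0x0600 <= code <= 0x06FF:      # Arabic block (used by Urdu)
            counts["arabic"] += 1
        elif ch.isascii() and ch.isalpha():  # basic Latin letters (English)
            counts["latin"] += 1
    if not any(counts.values()):
        return "unknown"
    return max(counts, key=counts.get)


def needs_script_normalization(text: str) -> bool:
    """True when the text is mostly Urdu script and should be converted."""
    return detect_script(text) == "arabic"


def response_language(text: str) -> Optional[str]:
    """Map the detected script to a reply language code, or None if unsupported."""
    script = detect_script(text)
    if script in ("devanagari", "arabic"):
        return "hi"  # Urdu-script input is treated as Hindi after normalization
    if script == "latin":
        return "en"
    return None
```

For example, `response_language("I have a fever")` yields `"en"`, while input written in Devanagari or Urdu script yields `"hi"`.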


🔊 Text-to-Speech Options

1️⃣ gTTS

  • Lightweight
  • Free
  • Suitable for basic Hindi and English voice output

2️⃣ ElevenLabs (Optional)

  • High-quality natural voice
  • Better pronunciation
  • Requires API key
  • Internet connection required (gTTS also calls an online service, so neither option works offline)
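One way to choose between the two options is to use ElevenLabs only when its API key is configured and fall back to gTTS otherwise. The sketch below assumes that policy; `choose_tts_backend` and `speak_with_gtts` are hypothetical helper names, and the ElevenLabs call itself is left out because its client API is not described in this README.

```python
from typing import Mapping


def choose_tts_backend(env: Mapping[str, str],
                       prefer_elevenlabs: bool = True) -> str:
    """Return 'elevenlabs' if preferred and a key is set, else 'gtts'."""
    has_key = bool(env.get("ELEVENLABS_API_KEY", "").strip())
    return "elevenlabs" if (prefer_elevenlabs and has_key) else "gtts"


def speak_with_gtts(text: str, lang: str, out_path: str) -> str:
    """Synthesize speech with gTTS; lang is 'hi' for Hindi or 'en' for English."""
    from gtts import gTTS  # imported lazily so the selector works without gTTS
    gTTS(text=text, lang=lang).save(out_path)  # writes an MP3 file
    return out_path
```

A typical call would be `choose_tts_backend(os.environ)` followed by `speak_with_gtts(reply, "hi", "reply.mp3")` when the result is `"gtts"`.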

⚠ Limitations

  • This project is for educational and demonstration purposes only.
  • It is NOT a substitute for professional medical advice.
  • AI-generated responses may be incorrect or incomplete.
  • No real medical validation layer is implemented.
  • Image analysis is prompt-based and not a true computer vision diagnosis system.
  • Requires an active internet connection.
  • Depends on Groq API uptime and rate limits.
  • No patient data storage or authentication system.
  • Not suitable for emergency or critical medical decisions.

🔒 Security Notice

  • Never commit or upload your .env file; add it to .gitignore.
  • Keep API keys private.
  • Do not expose keys in commits, logs, or screenshots.
  • Use environment variables in production deployments.

📌 Future Improvements

  • Add language selection toggle in UI
  • Add medical safety guardrails
  • Add disclaimer popup in UI
  • Add conversation memory
  • Deploy to HuggingFace Spaces
  • Add authentication system
  • Integrate real computer vision model
  • Add structured medical output format

📄 License

This project is open-source and intended for educational use only.


👨‍💻 Author

Developed as an AI bilingual medical assistant project using Groq and ElevenLabs integration.
