VaidyaVerse is an AI-powered medical assistant that analyzes voice input and medical images to generate doctor-like responses in Hindi and English.
The system integrates Speech-to-Text, Large Language Models, and Text-to-Speech to simulate a conversational AI doctor experience.
- 🎤 Speech-to-Text using Groq Whisper
- 🧠 Medical reasoning using Groq LLM
- 🖼 Image-based medical query support
- 🌍 Supports Hindi (Devanagari) and English
- 🔁 Automatic Urdu-to-Hindi script normalization
- 🔊 Text-to-Speech using:
  - gTTS (Google Text-to-Speech)
  - ElevenLabs (Optional High-Quality Voice)
- ⚡ Interactive UI built with Gradio
Tech Stack:
- Python
- Groq API (Whisper + LLM)
- ElevenLabs API (TTS)
- gTTS
- Gradio
- Indic Transliteration
- dotenv
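The components listed above form a three-stage flow: speech is transcribed, the transcript (plus an optional image) is sent to the LLM, and the reply is converted to audio. A minimal sketch of that flow follows. The names `Turn` and `run_turn` are hypothetical, and the three stages are passed in as plain callables instead of real Groq, gTTS, or ElevenLabs calls, so this illustrates the shape of the pipeline and is not the project's actual code.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Turn:
    """One round trip through the assistant (hypothetical container)."""
    transcript: str
    reply: str
    audio_path: str


def run_turn(
    audio_path: str,
    image_path: Optional[str],
    transcribe: Callable[[str], str],
    ask_llm: Callable[[str, Optional[str]], str],
    synthesize: Callable[[str], str],
) -> Turn:
    """Chain the three stages; each stage is injected by the caller."""
    # Stage 1: speech-to-text on the recorded audio file.
    transcript = transcribe(audio_path)
    # Stage 2: the LLM receives the transcript and, if present, the image path.
    reply = ask_llm(transcript, image_path)
    # Stage 3: text-to-speech; the stage returns the path of the audio it wrote.
    reply_audio = synthesize(reply)
    return Turn(transcript=transcript, reply=reply, audio_path=reply_audio)
```

Keeping the stages as injected callables makes it possible to swap gTTS for ElevenLabs, or to test the flow with stubs, without touching the orchestration code.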
Clone the repository:
git clone https://github.com/YOUR_USERNAME/VaidyaVerse.git
cd VaidyaVerse
Install dependencies:
pip install -r requirements.txt
Create a .env file in the root directory:
GROQ_API_KEY=your_groq_api_key
ELEVENLABS_API_KEY=your_elevenlabs_api_key
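As an illustration of how these two keys might be read at startup, the sketch below loads the `.env` file with python-dotenv when it is installed and then validates the values. The function name `read_api_keys` is hypothetical, and treating `GROQ_API_KEY` as required and `ELEVENLABS_API_KEY` as optional is an assumption based on ElevenLabs being described as optional.

```python
import os
from typing import Mapping, Optional

try:
    # python-dotenv copies entries from a local .env file into os.environ.
    from dotenv import load_dotenv
    load_dotenv()
except ImportError:
    # Without python-dotenv, the keys must already be set in the environment.
    pass


def read_api_keys(env: Mapping[str, str] = os.environ) -> dict:
    """Return both keys; a missing Groq key is an error (assumption)."""
    groq_key = env.get("GROQ_API_KEY", "").strip()
    if not groq_key:
        raise RuntimeError("GROQ_API_KEY is missing; add it to your .env file")
    # An absent or empty ElevenLabs key maps to None.
    eleven_key: Optional[str] = env.get("ELEVENLABS_API_KEY", "").strip() or None
    return {"groq": groq_key, "elevenlabs": eleven_key}
```

Failing early with a clear message is usually easier to debug than an authentication error raised later by an API client.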
Run the application:
python gradio_app.py
Supported Languages:
- Hindi (Devanagari script)
- English
Behavior:
- If Hindi speech is detected → Hindi response + Hindi voice
- If English speech is detected → English response + English voice
- Urdu script is automatically normalized to Hindi
No other languages are supported.
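The routing rules above can be approximated by looking at which script dominates the transcript. The sketch below is an assumption about one way to do this and does not show how the project itself detects language. The function names are hypothetical, and the Urdu-to-Hindi transliteration step is only flagged here, not performed.

```python
def dominant_script(text: str) -> str:
    """Classify text as 'devanagari', 'arabic', or 'latin' by counting letters."""
    counts = {"devanagari": 0, "arabic": 0, "latin": 0}
    for ch in text:
        code = ord(ch)
        if 0x0900 <= code <= 0x097F:        # Unicode Devanagari block
            counts["devanagari"] += 1
        elif 0x0600 <= code <= 0x06FF:      # Unicode Arabic block (used for Urdu)
            counts["arabic"] += 1
        elif ch.isascii() and ch.isalpha():
            counts["latin"] += 1
    if not any(counts.values()):
        return "latin"                      # assumption: default to English
    return max(counts, key=counts.get)


def route_language(transcript: str) -> dict:
    """Pick the response language and flag text that needs script normalization."""
    script = dominant_script(transcript)
    if script == "latin":
        return {"language": "en", "needs_normalization": False}
    # Both Devanagari and Urdu-script input lead to a Hindi response;
    # Urdu-script text would first be normalized to Devanagari.
    return {"language": "hi", "needs_normalization": script == "arabic"}
```

For example, `route_language("I have a headache")` selects English, while a transcript written in Urdu script selects Hindi and sets `needs_normalization` to `True`.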
gTTS:
- Lightweight
- Free
- Suitable for basic Hindi and English voice output
ElevenLabs:
- High-quality natural voice
- Better pronunciation
- Requires API key
- Internet connection required
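One way to act on this trade-off is to use ElevenLabs only when a key is configured and fall back to gTTS otherwise. The sketch below assumes that policy; `choose_tts_backend` and `speak_with_gtts` are hypothetical names, and the ElevenLabs call is omitted. The gTTS usage (`gTTS(text=..., lang=...).save(path)`) is the library's standard interface.

```python
from typing import Optional

SUPPORTED_LANGUAGES = ("hi", "en")


def choose_tts_backend(language: str, elevenlabs_key: Optional[str]) -> str:
    """Return 'elevenlabs' when a key is available, otherwise 'gtts'."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    return "elevenlabs" if elevenlabs_key else "gtts"


def speak_with_gtts(text: str, language: str, out_path: str) -> str:
    """Write an MP3 with gTTS and return its path (needs network access)."""
    # Imported lazily so the selection logic works without gTTS installed.
    from gtts import gTTS

    gTTS(text=text, lang=language).save(out_path)
    return out_path
```

With no ElevenLabs key set, `choose_tts_backend("hi", None)` selects gTTS, so the app remains usable with only the Groq key.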
- This project is for educational and demonstration purposes only.
- It is NOT a substitute for professional medical advice.
- AI-generated responses may be incorrect or incomplete.
- No real medical validation layer is implemented.
- Image analysis is prompt-based and not a true computer vision diagnosis system.
- Requires active internet connection.
- Depends on Groq API uptime and rate limits.
- No patient data storage or authentication system.
- Not suitable for emergency or critical medical decisions.
- Never upload your .env file.
- Keep API keys private.
- Do not expose keys in commits.
- Use environment variables in production deployment.
Planned improvements:
- Add language selection toggle in UI
- Add medical safety guardrails
- Add disclaimer popup in UI
- Add conversation memory
- Deploy to HuggingFace Spaces
- Add authentication system
- Integrate real computer vision model
- Add structured medical output format
This project is open-source and intended for educational use only.
Developed as an AI bilingual medical assistant project using Groq and ElevenLabs integration.