Recall.ai is an "External Cortex" designed to assist individuals with early-stage Alzheimer’s and dementia. It acts as a persistent, evolving memory layer that bridges visual perception with episodic history, answering the question: "Who is this, and how do I know them?"
The Challenge: Dementia strips individuals of their social context, leading to anxiety, loss of identity, and immense burden on caregivers. Current tools (trackers/reminders) address logistics, not connection.
The Solution: Recall.ai uses Multimodal Vector Search (Qdrant) to instantly recognize faces and retrieve specific, context-aware memories (e.g., "This is your grandson, he lives in Boston"), delivering them via comforting audio.
- Visual Input: The "Patient" captures an image via camera or upload.
- Vectorization:
  - Visual: DeepFace (VGG-Face) generates a 4096-dim vector.
  - Text: FastEmbed generates a 384-dim vector for context notes.
- Memory Retrieval (Qdrant):
  - Identity Search: Queries the `semantic_identity` collection.
  - Safety Check: Dynamic thresholding prevents false positives ("Stranger Danger").
  - Context Filtering: Uses Payload Indexing to fetch episodic memories linked to the identified person.
- Output: Synthesizes visual identity + text history into an audio response via `gTTS`.
We don't just store text. We link Visual Vectors (Faces) with Textual Vectors (Episodes) using shared metadata, creating a true multimodal knowledge graph.
People age. Recall.ai supports Multi-Vector Identity. If you upload a photo of a person from 20 years ago, Qdrant adds a new vector point to the existing Identity Cluster. The system can now recognize both versions of the person without degrading accuracy.
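A hypothetical in-memory sketch of that behaviour (the real system stores these points in Qdrant): each identity keeps a list of vectors, a new photo appends a vector rather than replacing one, and matching takes the best score across the whole cluster.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Identity cluster: one person, many face vectors (e.g. photos taken decades apart).
clusters = {"p42": [[0.9, 0.1, 0.0]]}          # recent photo

def enroll(person_id, vector):
    """Add a new vector point to the person's existing identity cluster."""
    clusters.setdefault(person_id, []).append(vector)

def best_match(query):
    """Return (person_id, score) for the best single vector across all clusters."""
    return max(
        ((pid, max(cosine(query, v) for v in vecs)) for pid, vecs in clusters.items()),
        key=lambda t: t[1],
    )

enroll("p42", [0.1, 0.9, 0.1])                 # photo from 20 years ago

# Both the old and the new appearance now resolve to the same identity.
print(best_match([0.88, 0.15, 0.02]))          # close to the recent photo
print(best_match([0.12, 0.85, 0.08]))          # close to the old photo
```

Taking the maximum over the cluster means adding an old photo can only widen coverage; it never pulls down the score of an existing match.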
To prevent "Black Box" anxiety, every output includes a debug log showing:
- The exact Confidence Score.
- The Vector ID retrieved.
- The logic used to accept/reject the match.
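A sketch of what such an explainability record could look like; the field names and threshold logic here are illustrative, not the project's actual schema.

```python
def build_debug_log(score, vector_id, threshold):
    """Summarise why a match was accepted or rejected, for the caregiver-facing debug log."""
    accepted = score >= threshold
    return {
        "confidence_score": round(score, 4),
        "vector_id": vector_id,
        "decision": "accept" if accepted else "reject",
        "logic": (
            f"score {score:.2f} >= threshold {threshold:.2f}" if accepted
            else f"score {score:.2f} < threshold {threshold:.2f} -> treated as stranger"
        ),
    }

print(build_debug_log(0.91, "face-0017", 0.75))
print(build_debug_log(0.52, "face-0031", 0.75))
```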
- Local Storage: Uses Qdrant's local disk persistence (`./qdrant_data`), ensuring medical data never leaves the device unexpectedly.
- Safety Thresholds: A configurable confidence slider allows caregivers to tune sensitivity based on the patient's environment.
- Vector Engine: Qdrant (Production Mode with Payload Indexing)
- Frontend: Streamlit
- Vision Model: DeepFace (VGG-Face)
- Text Model: FastEmbed (BAAI/bge-small-en-v1.5)
- Audio: gTTS (Google Text-to-Speech)
- Python 3.9+
- pip
```shell
git clone https://github.com/drshvik/recall_ai.git
cd recall_ai
pip install -r requirements.txt
streamlit run app.py
```

Note: On the first run, the system will automatically download the necessary AI models (approx. 500 MB).