dylancc5/assistify

Assistify

A multimodal visual-audio companion that empowers seniors to navigate smartphones safely, confidently, and independently.

Writeup: https://openatom.tech/antheaguo/Assistify

Our Mission

Assistify bridges the digital divide for elderly users by transforming complex smartphone interfaces into intuitive, accessible experiences. We believe that technology should empower everyone, regardless of age or technical proficiency, to communicate, access services, and participate fully and confidently in digital life.

The Problem We're Solving

Digital Exclusion in the Elderly Population

Smartphones have become essential gateways to healthcare, communication, and financial services. Yet for millions of elderly users, these tools remain frustratingly inaccessible. The challenges are real and multifaceted:

  • Dense, Confusing Interfaces: Modern apps evolve rapidly with ambiguous icons, nested menus, and cluttered screens that demand cognitive and visual agility many seniors struggle with
  • Digital Tasks That Overwhelm: Simple actions like paying bills, sending photos, or booking appointments require navigation skills that lead to anxiety and avoidance
  • Inadequate Existing Solutions: Current accessibility features like enlarged text or voice input provide only static, one-dimensional support without contextual awareness
  • Safety Vulnerabilities: Elderly users are disproportionately targeted by phishing scams, fraudulent links, and misleading prompts that exploit their unfamiliarity with digital threats
  • Loss of Independence: Digital exclusion forces seniors to depend on others for daily tasks, eroding confidence and autonomy

The Gap in Current Solutions

Speech-only assistants like Siri or Alexa are passive and limited—they execute voice commands without understanding visual context or adapting to individual users. Traditional accessibility features lack the intelligence to:

  • Guide users dynamically through app interfaces
  • Prevent critical errors like tapping malicious links
  • Understand what appears on-screen in real-time
  • Provide personalized, context-aware assistance
  • Adapt to individual interaction histories and learning paces

The core pain point: Elderly users need multimodal, real-time, contextually aware digital guidance that bridges the cognitive and sensory gap between human intent and machine interface.


Our Solution

An Intelligent, Context-Aware Digital Companion

Assistify is more than an accessibility tool—it's a digital caregiver that sees what users see, understands their intent, and guides them through every step with patience and clarity.

How Assistify Works Differently

Unlike traditional assistants that operate blindly, Assistify perceives and comprehends your smartphone screen in real-time:

Visual Intelligence

  • Captures and analyzes your screen continuously
  • Identifies app types, UI elements, buttons, and interactive regions
  • Recognizes text, icons, and layout structure with precision
  • Detects suspicious content, scams, and potentially harmful elements

Contextual Understanding

  • Infers user intent from screen content and dialogue history
  • Maps possible actions and prioritizes the most relevant next steps
  • Adapts guidance based on the current app and the user's goal
  • Maintains memory of past interactions for personalized support

Multimodal Guidance

  • Provides visual overlays with highlights, arrows, and bounding boxes that direct attention to the right place
  • Delivers spoken narration with clear, empathetic, step-by-step instructions
  • Generates conversational explanations that clarify ambiguous elements and unfamiliar terms
  • Creates an AR-like experience that seamlessly blends digital assistance with real interface interaction

Core Features & Capabilities

1. Real-Time Screen Understanding

Assistify continuously monitors your smartphone screen to identify:

  • Which app you're using
  • What you're trying to accomplish
  • Which buttons or options are relevant to your goal
  • Potential obstacles or confusing elements

2. Conversational Assistance

Engage naturally with Assistify through voice:

  • Ask questions like "How do I reply to this message?" or "What does this icon mean?"
  • Receive clear, jargon-free explanations
  • Get help understanding unfamiliar slang or abbreviations in messages
  • Navigate interfaces through natural dialogue

3. Safety & Scam Detection

Assistify acts as a protective layer between you and digital threats:

  • Detects phishing attempts and fraudulent links
  • Warns you before interacting with suspicious prompts
  • Identifies misleading advertisements disguised as interface elements
  • Prevents accidental deletions or destructive actions

4. Personalized Learning

The more you use Assistify, the better it understands you:

  • Remembers your preferences and common tasks
  • Adapts explanations to your learning pace
  • Recognizes your interaction patterns
  • Becomes increasingly efficient at predicting your needs

5. Cross-App Continuity

Assistify follows you throughout your phone:

  • Provides consistent guidance across different apps
  • Maintains context when you switch between applications
  • Helps you navigate complex workflows that span multiple apps
  • Never leaves your side during multi-step tasks

Use Cases & Real-World Scenarios

Communication & Social Connection

Messaging Family & Friends

  • "How do I send a photo to my daughter?"
  • Assistify identifies the messaging app, highlights the attachment icon, guides you through photo selection, and confirms before sending

Video Calling Grandchildren

  • "I want to video call my grandson."
  • Step-by-step guidance through opening the app, finding the contact, and initiating the call with confidence

Understanding Modern Slang

  • You receive a message with unfamiliar abbreviations or emojis
  • Assistify explains what "LOL" means or clarifies the tone of a message

Healthcare Access

Booking Medical Appointments

  • Navigating complex healthcare provider apps
  • Walking through date selection, insurance information entry, and appointment confirmation

Medication Reminders & Information

  • Understanding prescription app interfaces
  • Finding medication details, refill options, and pharmacy locations

Telehealth Consultations

  • Joining video appointments with medical providers
  • Ensuring the camera, microphone, and connection are working properly

Financial Services

Mobile Banking Tasks

  • Checking account balances safely
  • Transferring money between accounts with confirmation prompts
  • Understanding transaction histories and statements

Bill Payments

  • Navigating utility company apps
  • Entering payment information securely
  • Verifying amounts before confirming payments

Fraud Prevention

  • Detecting suspicious payment requests
  • Warning against phishing attempts disguised as bank notifications
  • Confirming the legitimacy of financial communications

Daily Living & Independence

Online Shopping

  • Browsing products, reading reviews, and comparing options
  • Adding items to the cart and completing checkout processes
  • Tracking deliveries and managing orders

Transportation & Navigation

  • Booking rideshare services like Uber or Lyft
  • Using map applications to find locations
  • Understanding arrival times and driver information

Entertainment & Hobbies

  • Navigating streaming services to watch shows
  • Managing music apps and playlists
  • Exploring new apps for hobbies and interests

Value Highlights

Human-Centered Accessibility

Converts complex digital environments into intuitive, step-by-step visual and audio cues tailored for senior users' cognitive and sensory needs.

Contextual Awareness

Understands live screen context to deliver relevant, timely guidance instead of generic, one-size-fits-all instructions that don't match the current situation.

Safety Layer

Actively protects users from digital threats, preventing misclicks on malicious content and improving trust in digital interactions.

Confidence Building

Empowers elderly users to attempt new tasks independently, reducing anxiety and building digital literacy through supportive, patient guidance.

Personalized Experience

Adapts based on individual behavioral patterns and learning pace, creating a truly "digital caregiver" experience that respects each user's unique needs.

Dignity & Independence

Reduces dependence on family members or caregivers for basic digital tasks, preserving autonomy and self-sufficiency.


Why This Matters

The Demographic Reality

Global populations are aging rapidly. By 2050, the number of people aged 60+ will double to over 2 billion. This demographic shift creates an urgent need for age-inclusive technology.

The Cost of Digital Exclusion

When elderly individuals cannot access digital services:

  • Healthcare becomes harder to manage
  • Social connections weaken
  • Financial independence erodes
  • Quality of life diminishes
  • Societal care burdens increase

The Opportunity for Technology

Modern AI possesses the multimodal capabilities needed to solve this problem. What was impossible five years ago—real-time screen understanding, contextual reasoning, and adaptive guidance—is now achievable.

Assistify represents a commitment to using these capabilities for profound social good.


Beyond Seniors: A Platform for Universal Accessibility

While our initial focus is elderly users, Assistify's core capabilities have the potential to serve:

  • Users with visual impairments who need detailed screen descriptions and navigation assistance
  • Individuals with cognitive disabilities who benefit from simplified, step-by-step guidance
  • Non-native speakers navigating interfaces in unfamiliar languages
  • Anyone learning new technology who wants patient, judgment-free support
  • Users in high-stress situations who need clear guidance under pressure

Assistify represents a new paradigm: Contextually aware, multimodal AI assistance that meets users wherever they are.


Installation & Setup Process

Important Note: Assistify is currently in development and not available on the App Store. Installation requires a development environment setup.

Prerequisites:

  • macOS with Xcode installed
  • Flutter SDK and Dart SDK installed
  • Physical iPhone device (iOS 16.0+)
  • Apple Developer account with development permissions
  • USB cable for device connection

Step 1: Development Environment Setup

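# Install Flutter (stable channel)
git clone https://github.com/flutter/flutter.git -b stable

# Add Flutter to PATH for the current shell session only;
# add this line to your shell profile to make it permanent
export PATH="$PATH:$(pwd)/flutter/bin"

# Verify installation
flutter doctor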

Step 2: Clone Repository

git clone [your-repo-url]
cd assistify

Step 3: Configure Project Settings

  1. Open ios/Runner.xcworkspace in Xcode
  2. Update the App Group identifiers (currently specific to the original development machine)
  3. Add your API keys to environment variables (keys are gitignored for security)
    • Create a .env file in the project root with:
      BAIDU_API_KEY=your_baidu_api_key
      BAIDU_SECRET_KEY=your_baidu_secret_key
      SUPABASE_URL=your_supabase_url
      SUPABASE_ANON_KEY=your_supabase_anon_key
      
  4. Update team signing certificate with your Apple Developer account
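
Before building, it can help to confirm that the .env file defines all four keys listed in step 3. The following is a minimal sketch in Python, used here only for illustration (the app itself is Flutter/Dart). The helper names `parse_env` and `missing_keys` are hypothetical and are not part of the repository.

```python
# Hypothetical pre-build check; not part of the Assistify codebase.
REQUIRED_KEYS = (
    "BAIDU_API_KEY",
    "BAIDU_SECRET_KEY",
    "SUPABASE_URL",
    "SUPABASE_ANON_KEY",
)


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values such as URLs may contain "=".
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text: str) -> list[str]:
    """Return the required keys that are absent or empty, in listed order."""
    values = parse_env(text)
    return [key for key in REQUIRED_KEYS if not values.get(key)]
```

To use it, read the file contents (for example with `open(".env").read()`) and pass them to `missing_keys`. An empty list means every required key has a value.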

Step 4: Connect Device & Deploy

  1. Enable Developer Mode on iPhone

    • Settings → Privacy & Security → Developer Mode → Enable
  2. Connect iPhone via USB

    • Trust the computer when prompted
  3. Run deployment

    flutter run

Step 5: Complete In-App Setup

Once installed, the app guides you through:

  • Voice Selection: Choose from available iOS text-to-speech voices
  • Microphone Permission: Required for voice input
  • Screen Recording Permission: Required for screen understanding
  • Background App Refresh: Enable for continuous operation
  • Onboarding Tutorial: Practice with sample scenarios

How to Use Assistify (Post-Installation)

Two Operational Modes:

Mode 1: Chat Only (No Screen Context)

  • Activate: Open Assistify and tap the microphone button
  • Speak Naturally: Ask general questions or have conversations
  • Listen: Assistify responds based on conversation history and RAG context
  • Token Usage: 150-750 tokens per query
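
The "RAG context" in the Listen step refers to retrieving stored text that is semantically close to the current query. This README does not state the similarity metric or how many results are returned, and the real search runs in Supabase (see supabase_function.sql). As a rough sketch of the general idea, assuming cosine similarity over embedding vectors, retrieval might look like this in Python. The function names and the default `k` are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: list[float], documents: list[tuple[str, list[float]]], k: int = 3) -> list[str]:
    """Return the texts of the k documents most similar to the query embedding.

    `documents` is a list of (text, embedding) pairs. Both the shape and the
    default k are assumptions made for this illustration.
    """
    scored = [(cosine_similarity(query, emb), text) for text, emb in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]
```

The retrieved texts would then be added to the prompt alongside the conversation history.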

Mode 2: Chat + Screen Share (With Visual Context)

  • Start Screen Share: Enable screen broadcasting in Control Center
  • Continuous Capture: The app samples screenshots at 1 FPS (one per second)
  • Speak Your Goal: State what you need help with, e.g., "How do I reply to this text?"
  • Contextual Guidance: Assistify analyzes the current screen + 10 evenly-sampled screenshots + RAG context
  • Follow Instructions: Carry out the spoken instructions while Assistify continuously re-analyzes the screen
  • Token Usage: 5,000-10,000 tokens per query (includes image context + RAG retrieval)
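
The README does not spell out how the "10 evenly-sampled screenshots" are chosen from the 1 FPS capture stream. One plausible reading is to pick frames at evenly spaced positions across everything captured so far, always including the first and the most recent frame. A minimal Python sketch under that assumption follows. The function name `evenly_sample` is hypothetical, and the app's actual Dart implementation may differ.

```python
def evenly_sample(frames: list, count: int = 10) -> list:
    """Pick `count` frames spread evenly across `frames`, oldest to newest.

    If fewer than `count` frames exist, all of them are returned.
    """
    if count <= 0 or not frames:
        return []
    if len(frames) <= count:
        return list(frames)
    if count == 1:
        return [frames[-1]]  # with a single slot, keep the newest frame
    # Spacing between picks, so the first and last frames are always included.
    step = (len(frames) - 1) / (count - 1)
    return [frames[round(i * step)] for i in range(count)]
```

For example, after one minute of capture at 1 FPS (60 frames), this keeps frames roughly 6 to 7 seconds apart, which bounds the image context sent with each query regardless of session length.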

Technical Documentation

Project Structure Overview

The Assistify Flutter application follows a clean, modular architecture organized by feature and responsibility. This section provides a comprehensive guide to the codebase structure, making it easy for developers and AI agents to understand where to find and modify code.

assistify/
├── lib/
│   ├── main.dart                    # App entry point and initialization
│   ├── constants/                   # Design system constants
│   │   ├── colors.dart              # Color palette and theme colors
│   │   ├── dimensions.dart          # Spacing, sizes, and responsive utilities
│   │   ├── text_styles.dart         # Typography system
│   │   └── theme.dart               # App theme configuration
│   ├── models/                      # Data models and business logic
│   │   ├── conversation.dart       # Conversation history model
│   │   ├── message.dart            # Individual message/speech segment model
│   │   ├── preferences.dart        # User preferences model
│   │   └── screen_recording.dart   # Screen recording metadata model
│   ├── providers/                   # State management (Provider pattern)
│   │   └── app_state_provider.dart # Central app state provider
│   ├── screens/                     # Full-screen UI components
│   │   ├── home_screen.dart        # Main home screen with voice agent
│   │   ├── settings_screen.dart    # Settings and preferences screen
│   │   ├── history_screen.dart     # Conversation history screen
│   │   ├── screen_recording_history_screen.dart # Screen recording history
│   │   ├── voice_selection_screen.dart # Voice selection interface
│   │   ├── privacy_policy_screen.dart # Privacy policy display
│   │   └── terms_of_service_screen.dart # Terms of service display
│   ├── services/                    # Business logic and platform integration
│   │   ├── baidu_service.dart       # Baidu ERNIE Bot API integration
│   │   ├── embedding_service.dart  # Embedding generation for RAG
│   │   ├── baidu_service.dart     # Legacy service (kept for embeddings)
│   │   ├── native_log_service.dart # Native logging service
│   │   ├── permission_service.dart  # Permission handling (mic, screen recording)
│   │   ├── recording_service.dart  # Screen recording via platform channels
│   │   ├── screen_stream_service.dart # Continuous screen capture streaming
│   │   ├── speech_service.dart      # Speech recognition via platform channels
│   │   ├── storage_service.dart     # Local data persistence (SharedPreferences)
│   │   └── tts_service.dart        # Text-to-speech service
│   ├── utils/                       # Utility functions and helpers
│   │   └── localization_helper.dart # Localization utilities
│   ├── widgets/                     # Reusable UI components
│   │   ├── voice_agent_circle.dart  # Animated voice agent circle widget
│   │   ├── control_button.dart     # Control button component
│   │   ├── onboarding_flow.dart    # Onboarding permission flow
│   │   ├── permission_modal.dart   # Permission request modal
│   │   ├── screen_recording_card.dart # Screen recording card widget
│   │   └── legal_document_widgets.dart # Legal document display widgets
│   └── l10n/                        # Localization files
│       ├── app_en.arb              # English translations
│       ├── app_zh.arb              # Chinese translations
│       └── app_localizations*.dart  # Generated localization code
├── ios/
│   ├── Runner/                      # Main iOS app target
│   │   ├── AppDelegate.swift       # iOS app delegate with background processing
│   │   └── ...
│   └── Asstify Screenshare/        # Broadcast extension for screen capture
│       └── SampleHandler.swift     # Screen broadcast handler
├── android/                         # Android platform code
├── test/                           # Test files and benchmarks
│   ├── benchmark_dataset.json      # Benchmark test dataset
│   ├── run_benchmark.dart          # Benchmark runner
│   ├── compare_results.dart        # Result comparison utilities
│   └── README.md                   # Testing documentation
├── supabase_function.sql           # Supabase RPC function for vector search
├── pubspec.yaml                    # Flutter dependencies and configuration
└── README.md                       # This file

Powered by advanced multimodal AI that sees, understands, and guides—transforming digital accessibility from a feature into a fundamental right.

About

ernie hackathon
