Smart Vision Assistant - Raspberry Pi 4B

A professional smart glass system with 4 operational modes controlled by 3 push buttons.

🎯 Features

  • Time Mode: Announces current time every minute
  • Text Recognition: Uses OCR to read text from camera feed
  • Object Detection: YOLOv8 object detection with audio description
  • Distance Measurement: Ultrasonic distance monitoring with warnings
  • Audio Feedback: Text-to-speech for all operations
  • Button Control: 3-button interface for mode selection
  • Multi-threaded: Efficient concurrent operation
  • Professional Architecture: Modular, maintainable codebase

📋 Hardware Requirements

  • Raspberry Pi 4B
  • Pi Camera Module
  • 3 Push Buttons with external pull-down resistors
  • HC-SR04 Ultrasonic Sensor
  • Audio output (speakers/headphones)

🔌 Pin Connections

Buttons (with external pull-down resistors)

  • Button 1 (Mode Selection): BOARD pin 36 (BCM GPIO 16)
  • Button 2 (Confirm): BOARD pin 38 (BCM GPIO 20)
  • Button 3 (Exit/Idle): BOARD pin 40 (BCM GPIO 21)

Ultrasonic Sensor

  • TRIG: BOARD pin 7 (BCM GPIO 4)
  • ECHO: BOARD pin 11 (BCM GPIO 17)

See docs/wiring_diagram.md for detailed wiring information.
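The pin numbers above refer to physical (BOARD) positions on the 40-pin header. As a small reference sketch, the mapping below pairs each of them with its Broadcom (BCM) GPIO number. The names `BOARD_TO_BCM` and `describe_pin` are illustrative and are not taken from the project code.

```python
# Physical (BOARD) header pin -> Broadcom (BCM) GPIO number
# for the pins this README uses on the 40-pin header.
BOARD_TO_BCM = {
    7: 4,    # TRIG (ultrasonic sensor)
    11: 17,  # ECHO (ultrasonic sensor)
    36: 16,  # Button 1 (Mode Selection)
    38: 20,  # Button 2 (Confirm)
    40: 21,  # Button 3 (Exit/Idle)
}


def describe_pin(board_pin: int) -> str:
    """Return a label such as 'BOARD 36 / BCM GPIO 16'."""
    return f"BOARD {board_pin} / BCM GPIO {BOARD_TO_BCM[board_pin]}"


if __name__ == "__main__":
    for pin in sorted(BOARD_TO_BCM):
        print(describe_pin(pin))
```

This is handy when comparing the wiring against pinout charts that label pins by BCM number only.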

🚀 Installation

1. Clone the Repository

git clone https://github.com/YousefSamm/smart_vision_assistant.git
cd smart_vision_assistant

2. Install Python Dependencies

pip install -r requirements.txt

Or install as a package:

pip install -e .

3. Install System Packages

sudo apt update
sudo apt install tesseract-ocr python3-pygame libcamera-tools

4. Enable Camera

sudo raspi-config

Navigate to Interface Options → Camera → Enable

5. Run Installation Script (Optional)

chmod +x scripts/install.sh
./scripts/install.sh

💻 Usage

Starting the System

Option 1: Using the run script

python3 run.py

Option 2: Using the module directly

python3 -m smart_glass.main

Option 3: Using the installed command (after pip install -e .)

smart-glass

Button Operations

  1. Button 1 (Mode): Cycle through 4 modes

    • Press to switch: Idle → Time → Text Recognition → Object Detection → Distance Measurement → Idle
    • Interrupts current audio and announces new mode
  2. Button 2 (Confirm): Confirm and activate selected mode

    • Activates the currently selected mode
    • Interrupts any playing audio
  3. Button 3 (Exit): Exit current mode and return to idle

    • Stops current mode operation
    • Returns to idle state
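The three-button behavior described above can be sketched as a small state machine. This is a minimal illustration, not the project's implementation: the `Mode` and `ModeSelector` names are hypothetical, and resetting the selection to Idle when Exit is pressed is an assumption.

```python
from enum import Enum


class Mode(Enum):
    IDLE = "Idle"
    TIME = "Time"
    TEXT = "Text Recognition"
    OBJECTS = "Object Detection"
    DISTANCE = "Distance Measurement"


# Cycle order as listed for Button 1; Enum members iterate in definition order.
CYCLE = list(Mode)


class ModeSelector:
    """Tracks which mode is selected and which mode is actually running."""

    def __init__(self) -> None:
        self.selected = Mode.IDLE
        self.active = Mode.IDLE

    def press_mode(self) -> Mode:
        """Button 1: advance the selection, wrapping back to Idle."""
        index = CYCLE.index(self.selected)
        self.selected = CYCLE[(index + 1) % len(CYCLE)]
        return self.selected

    def press_confirm(self) -> Mode:
        """Button 2: activate the currently selected mode."""
        self.active = self.selected
        return self.active

    def press_exit(self) -> Mode:
        """Button 3: stop the running mode and return to Idle.

        Resetting the selection as well is an assumption of this sketch.
        """
        self.selected = Mode.IDLE
        self.active = Mode.IDLE
        return self.active
```

Keeping "selected" and "active" separate mirrors the split between cycling with Button 1 and activating with Button 2.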

Modes

1. Time Mode

  • Announces current time every minute
  • Format: "The current time is HH:MM AM/PM"
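One way to produce the announcement format above is shown below. The function name `time_announcement` is illustrative, and the zero-padded 12-hour clock is an assumption read from the "HH:MM AM/PM" pattern.

```python
from datetime import datetime


def time_announcement(now: datetime) -> str:
    """Build the spoken sentence in the 'HH:MM AM/PM' format."""
    hour12 = now.hour % 12 or 12          # 0 -> 12 (midnight), 13 -> 1, ...
    suffix = "AM" if now.hour < 12 else "PM"
    return f"The current time is {hour12:02d}:{now.minute:02d} {suffix}"


if __name__ == "__main__":
    print(time_announcement(datetime.now()))
```

Computing the suffix by hand avoids depending on the system locale, which can change what `strftime("%p")` returns.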

2. Text Recognition Mode

  • Captures frames from camera every 5 seconds
  • Performs OCR using Tesseract
  • Speaks detected text: "I can see the following text: [text]"
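Raw OCR output usually contains stray line breaks and runs of whitespace. The sketch below shows one way to tidy it before speaking. It assumes the raw string has already been obtained from Tesseract; the function name `ocr_announcement` and the choice to stay silent on empty results are illustrative, not taken from the project.

```python
from typing import Optional


def ocr_announcement(raw_text: str) -> Optional[str]:
    """Collapse whitespace in OCR output and wrap it in the spoken sentence.

    Returns None when nothing readable was found, so the caller can skip
    speaking for that frame (an assumption of this sketch).
    """
    cleaned = " ".join(raw_text.split())
    if not cleaned:
        return None
    return f"I can see the following text: {cleaned}"


if __name__ == "__main__":
    print(ocr_announcement("  EXIT\n\nDoor 12 \n"))
```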

3. Object Detection Mode

  • Uses YOLOv8 for real-time object detection
  • Updates every 3 seconds
  • Speaks detected objects: "I can see one person, two chairs"
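The spoken summary above can be built from a plain list of detected class labels. The sketch below is an assumption about how that might look: `describe_objects` is a hypothetical helper, the pluralization is deliberately naive (it only appends "s"), and the empty-result sentence is invented for illustration.

```python
from collections import Counter
from typing import List

NUMBER_WORDS = {
    1: "one", 2: "two", 3: "three", 4: "four", 5: "five",
    6: "six", 7: "seven", 8: "eight", 9: "nine", 10: "ten",
}


def describe_objects(labels: List[str]) -> str:
    """Turn labels like ['person', 'chair', 'chair'] into a spoken sentence."""
    if not labels:
        return "I cannot see any objects"  # hypothetical phrasing
    counts = Counter(labels)  # keeps first-seen order of labels
    parts = []
    for label, count in counts.items():
        word = NUMBER_WORDS.get(count, str(count))
        # Naive plural: irregular nouns (e.g. "person" -> "people") are not handled.
        noun = label if count == 1 else f"{label}s"
        parts.append(f"{word} {noun}")
    return "I can see " + ", ".join(parts)


if __name__ == "__main__":
    print(describe_objects(["person", "chair", "chair"]))
```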

4. Distance Measurement Mode

  • Takes initial distance reading when activated
  • Continuously monitors distance every 1 second
  • Warns when distance < 100cm: "Warning! Distance is X.X centimeters"
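An HC-SR04 reports distance as the length of its echo pulse, so the reading comes from a time-of-flight calculation. The sketch below shows that arithmetic and the warning rule stated above. The function and constant names are illustrative, and the speed of sound is taken as roughly 343 m/s (air at about 20 °C).

```python
from typing import Optional

SPEED_OF_SOUND_CM_PER_S = 34300.0  # approximate, air at about 20 degrees C
WARNING_THRESHOLD_CM = 100.0       # threshold from the mode description


def echo_to_distance_cm(pulse_seconds: float) -> float:
    """Convert echo pulse width to distance.

    The pulse covers the trip to the obstacle and back, hence the division by 2.
    """
    return pulse_seconds * SPEED_OF_SOUND_CM_PER_S / 2.0


def distance_warning(distance_cm: float) -> Optional[str]:
    """Return the warning sentence when the obstacle is closer than the threshold."""
    if distance_cm < WARNING_THRESHOLD_CM:
        return f"Warning! Distance is {distance_cm:.1f} centimeters"
    return None


if __name__ == "__main__":
    reading = echo_to_distance_cm(0.002)  # a 2 ms echo pulse
    print(distance_warning(reading))
```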

πŸ“ Project Structure

smart_vision_assistant/
├── smart_glass/              # Main package
│   ├── __init__.py
│   ├── main.py               # Main entry point
│   ├── config.py             # Configuration (optional)
│   │
│   ├── hardware/             # Hardware interfaces
│   │   ├── __init__.py
│   │   ├── gpio_handler.py   # GPIO button handling
│   │   ├── camera_handler.py # Camera operations
│   │   └── ultrasonic.py     # Ultrasonic sensor
│   │
│   ├── modes/                # Mode implementations
│   │   ├── __init__.py
│   │   ├── base_mode.py      # Base class for modes
│   │   ├── time_mode.py
│   │   ├── text_recognition.py
│   │   ├── object_detection.py
│   │   └── distance_measurement.py
│   │
│   ├── audio/                # Audio handling
│   │   ├── __init__.py
│   │   ├── tts_engine.py     # Text-to-speech
│   │   └── audio_queue.py    # Audio queue management
│   │
│   └── utils/                # Utilities
│       ├── __init__.py
│       └── logger.py         # Logging utilities
│
├── tests/                    # Test files
│   ├── __init__.py
│   ├── test_buttons.py
│   ├── test_camera.py
│   └── test_display.py
│
├── scripts/                  # Utility scripts
│   └── install.sh
│
├── docs/                     # Documentation
│   └── wiring_diagram.md
│
├── .gitignore
├── LICENSE
├── README.md                 # This file
├── requirements.txt
├── setup.py                  # Package installation
└── run.py                    # Entry point script

βš™οΈ Configuration

You can customize the system by creating a config.py file in the root directory:

# GPIO Pin Configuration (physical BOARD pin numbers)
MODE_BUTTON_PIN = 36
CONFIRM_BUTTON_PIN = 38
EXIT_BUTTON_PIN = 40
TRIG_PIN = 7
ECHO_PIN = 11

# Button Configuration
BUTTON_DEBOUNCE_TIME = 0.5  # seconds

See config.py (if it exists) for more configuration options.
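To illustrate what a setting like `BUTTON_DEBOUNCE_TIME` controls, here is a minimal software debounce sketch. How the project actually applies the value is not shown in this README; the `Debouncer` class and its injectable `clock` parameter are hypothetical.

```python
import time
from typing import Callable, Optional

BUTTON_DEBOUNCE_TIME = 0.5  # seconds, same value as in the configuration above


class Debouncer:
    """Accepts a press only if enough time has passed since the last accepted one."""

    def __init__(
        self,
        interval: float = BUTTON_DEBOUNCE_TIME,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.interval = interval
        self.clock = clock
        self._last_accepted: Optional[float] = None

    def accept(self) -> bool:
        """Return True for a genuine press, False for a bounce or repeat."""
        now = self.clock()
        if self._last_accepted is not None and now - self._last_accepted < self.interval:
            return False
        self._last_accepted = now
        return True
```

A button callback would call `accept()` first and ignore the event when it returns False. A larger interval suppresses more contact bounce but also makes quick repeated presses feel unresponsive.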

πŸ› Troubleshooting

Camera Issues

  • No camera detected: Check camera connections and enable in raspi-config
  • Camera access denied: Run with sudo or add user to video group: sudo usermod -a -G video $USER
  • OpenCV errors: Install libcamera-tools: sudo apt install libcamera-tools

Audio Issues

  • No audio output: Check audio output configuration: sudo raspi-config → Advanced Options → Audio
  • TTS not working: Ensure internet connection for gTTS (or use offline TTS)

GPIO Errors

  • Permission denied: Run with sudo or add user to gpio group: sudo usermod -a -G gpio $USER
  • Button not responding: Verify button connections and pull-down resistors

Performance Issues

  • YOLO slow: Consider using TensorFlow Lite or a smaller YOLO model (for example, YOLOv8n)
  • High CPU usage: Reduce update intervals in mode configurations

🧪 Testing

Run individual test scripts:

python3 tests/test_buttons.py
python3 tests/test_camera.py
python3 tests/test_display.py

πŸ“ Development

Adding a New Mode

  1. Create a new file in smart_glass/modes/
  2. Inherit from BaseMode
  3. Implement the _run() method
  4. Add to smart_glass/modes/__init__.py
  5. Register in smart_glass/main.py
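The steps above might look roughly like the following. The real `BaseMode` interface is not shown in this README, so the sketch defines a stand-in base class with `start()`, `stop()` and `_run()`; only the `_run()` hook is confirmed by the steps above. `CounterMode` is a made-up example mode.

```python
import threading
from typing import Optional


class BaseMode:
    """Stand-in for smart_glass.modes.base_mode.BaseMode; the real class may differ."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._stop_event = threading.Event()
        self._thread: Optional[threading.Thread] = None

    def start(self) -> None:
        """Run _run() in a background thread."""
        self._stop_event.clear()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def stop(self) -> None:
        """Signal the loop to end and wait for the thread to finish."""
        self._stop_event.set()
        if self._thread is not None:
            self._thread.join()

    def _run(self) -> None:
        raise NotImplementedError


class CounterMode(BaseMode):
    """Hypothetical mode that counts ticks until it is stopped."""

    def __init__(self, interval: float = 1.0) -> None:
        super().__init__("Counter")
        self.interval = interval
        self.ticks = 0

    def _run(self) -> None:
        # Event.wait() returns True once stop() sets the event, ending the loop.
        while not self._stop_event.wait(self.interval):
            self.ticks += 1
```

Waiting on the event instead of calling `time.sleep()` lets a mode stop promptly when the Exit button is pressed, even with a long update interval.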

Code Style

  • Follow PEP 8 style guide
  • Use type hints where appropriate
  • Add docstrings to all classes and methods
  • Keep functions focused and modular

📄 License

See LICENSE file for details.

👤 Author

Yousef Samm

πŸ™ Acknowledgments

  • YOLOv8 by Ultralytics
  • Tesseract OCR
  • Raspberry Pi Foundation
  • OpenCV community

📞 Support

For issues and questions, please open an issue on GitHub.


Note: This project requires Raspberry Pi 4B with proper hardware setup. Ensure all connections are secure before running the system.

About

AI-powered wearable device for visually impaired users. Features real-time object detection (YOLO), OCR text-to-speech, and audio feedback using Raspberry Pi.
