A professional smart glass system with 4 operational modes controlled by 3 push buttons.
- Time Mode: Announces current time every minute
- Text Recognition: Uses OCR to read text from camera feed
- Object Detection: YOLOv8 object detection with audio description
- Distance Measurement: Ultrasonic distance monitoring with warnings
- Audio Feedback: Text-to-speech for all operations
- Button Control: 3-button interface for mode selection
- Multi-threaded: Efficient concurrent operation
- Professional Architecture: Modular, maintainable codebase
- Raspberry Pi 4B
- Pi Camera Module
- 3 Push Buttons with external pull-down resistors
- HC-SR04 Ultrasonic Sensor
- Audio output (speakers/headphones)
- Button 1 (Mode Selection): BOARD pin 36 (BCM GPIO 16)
- Button 2 (Confirm): BOARD pin 38 (BCM GPIO 20)
- Button 3 (Exit/Idle): BOARD pin 40 (BCM GPIO 21)
- TRIG: BOARD pin 7 (BCM GPIO 4)
- ECHO: BOARD pin 11 (BCM GPIO 17)
See docs/wiring_diagram.md for detailed wiring information.
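The pin numbers in this README are physical header positions (BOARD numbering). Libraries that use BCM numbering expect different values for the same pins. The lookup below is a minimal sketch of that translation for the five pins used here; the names `BOARD_PINS`, `BOARD_TO_BCM`, and `bcm_for` are illustrative and not taken from the project.

```python
# Physical (BOARD) header pins used by this project, as listed in the README.
BOARD_PINS = {
    "MODE_BUTTON": 36,
    "CONFIRM_BUTTON": 38,
    "EXIT_BUTTON": 40,
    "TRIG": 7,
    "ECHO": 11,
}

# BOARD pin -> BCM GPIO number on the 40-pin Raspberry Pi header
# (only the pins this project uses).
BOARD_TO_BCM = {7: 4, 11: 17, 36: 16, 38: 20, 40: 21}


def bcm_for(name: str) -> int:
    """Return the BCM GPIO number for a named signal (hypothetical helper)."""
    return BOARD_TO_BCM[BOARD_PINS[name]]


if __name__ == "__main__":
    for signal in BOARD_PINS:
        print(f"{signal}: BOARD {BOARD_PINS[signal]} = BCM GPIO {bcm_for(signal)}")
```

If the code sets BOARD numbering (for example `GPIO.setmode(GPIO.BOARD)` in RPi.GPIO), the BOARD values can be used directly and no translation is needed.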
git clone https://github.com/YousefSamm/smart_vision_assistant.git
cd smart_vision_assistant
pip install -r requirements.txt
Or install as a package:
pip install -e .
sudo apt update
sudo apt install tesseract-ocr python3-pygame libcamera-tools
sudo raspi-config
Navigate to Interface Options → Camera → Enable
chmod +x scripts/install.sh
./scripts/install.sh
Option 1: Using the run script
python3 run.py
Option 2: Using the module directly
python3 -m smart_glass.main
Option 3: Using the installed command (after pip install -e .)
smart-glass
- Button 1 (Mode): Cycle through the 4 modes
  - Press to switch: Idle → Time → Text Recognition → Object Detection → Distance Measurement → Idle
  - Interrupts current audio and announces new mode
- Button 2 (Confirm): Confirm and activate selected mode
  - Activates the currently selected mode
  - Interrupts any playing audio
- Button 3 (Exit): Exit current mode and return to idle
  - Stops current mode operation
  - Returns to idle state
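The three-button behavior can be modeled as a small state machine. This is a sketch only: the `Mode` enum and `ModeSelector` class are hypothetical names, and the choice to reset the highlighted selection on Exit is an assumption, since the README only says the system returns to idle.

```python
from enum import Enum


class Mode(Enum):
    """Modes in the order Button 1 cycles through them."""
    IDLE = "Idle"
    TIME = "Time"
    TEXT_RECOGNITION = "Text Recognition"
    OBJECT_DETECTION = "Object Detection"
    DISTANCE_MEASUREMENT = "Distance Measurement"


class ModeSelector:
    """Tracks the highlighted mode and the running mode (hypothetical)."""

    _ORDER = list(Mode)

    def __init__(self) -> None:
        self.selected = Mode.IDLE  # mode highlighted by Button 1
        self.active = Mode.IDLE    # mode currently running

    def press_mode(self) -> Mode:
        """Button 1: advance the selection, wrapping back to Idle."""
        idx = self._ORDER.index(self.selected)
        self.selected = self._ORDER[(idx + 1) % len(self._ORDER)]
        return self.selected

    def press_confirm(self) -> Mode:
        """Button 2: activate whatever is currently selected."""
        self.active = self.selected
        return self.active

    def press_exit(self) -> Mode:
        """Button 3: stop the running mode and return to idle.

        Resetting the selection as well is an assumption of this sketch.
        """
        self.selected = Mode.IDLE
        self.active = Mode.IDLE
        return self.active
```

Two presses of Button 1 followed by Button 2 would activate Text Recognition in this model; Button 3 then returns it to Idle.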
Time Mode:
- Announces current time every minute
- Format: "The current time is HH:MM AM/PM"
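For Time Mode, the announcement string can be built from a `datetime` without relying on locale settings. The function name `format_time_announcement` is illustrative; how the project actually builds the phrase is not shown in this README.

```python
from datetime import datetime


def format_time_announcement(now: datetime) -> str:
    """Build the spoken phrase in 12-hour HH:MM AM/PM form."""
    hour12 = now.hour % 12 or 12          # 0 and 12 both map to 12
    suffix = "AM" if now.hour < 12 else "PM"
    return f"The current time is {hour12:02d}:{now.minute:02d} {suffix}"


if __name__ == "__main__":
    print(format_time_announcement(datetime.now()))
```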
Text Recognition Mode:
- Captures frames from camera every 5 seconds
- Performs OCR using Tesseract
- Speaks detected text: "I can see the following text: [text]"
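For Text Recognition, raw OCR output usually contains line breaks and stray whitespace that read poorly when spoken. The sketch below shows one way to tidy the text before building the phrase; `format_ocr_announcement` is a hypothetical helper, and staying silent on empty results is an assumption rather than documented behavior.

```python
import re
from typing import Optional


def format_ocr_announcement(raw_text: str) -> Optional[str]:
    """Collapse whitespace in OCR output and wrap it in the spoken phrase.

    Returns None when nothing readable was found, so the caller can
    skip the announcement (an assumption of this sketch).
    """
    cleaned = re.sub(r"\s+", " ", raw_text).strip()
    if not cleaned:
        return None
    return f"I can see the following text: {cleaned}"
```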
Object Detection Mode:
- Uses YOLOv8 for real-time object detection
- Updates every 3 seconds
- Speaks detected objects: "I can see one person, two chairs"
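For Object Detection, turning a list of detected class labels into a phrase like the one above is mostly a counting problem. This sketch assumes the detector has already produced plain label strings; `describe_objects`, the number-word table, and the irregular-plural table are all illustrative and not part of the project.

```python
from collections import Counter
from typing import Iterable, Optional

_NUMBER_WORDS = {
    1: "one", 2: "two", 3: "three", 4: "four", 5: "five",
    6: "six", 7: "seven", 8: "eight", 9: "nine", 10: "ten",
}

# Naive pluralization appends "s"; exceptions are listed here (hypothetical).
_IRREGULAR_PLURALS = {"person": "people"}


def describe_objects(labels: Iterable[str]) -> Optional[str]:
    """Summarize detected labels as a spoken phrase, or None if empty."""
    counts = Counter(labels)  # preserves first-seen order
    if not counts:
        return None
    parts = []
    for label, n in counts.items():
        word = _NUMBER_WORDS.get(n, str(n))
        noun = label if n == 1 else _IRREGULAR_PLURALS.get(label, label + "s")
        parts.append(f"{word} {noun}")
    return "I can see " + ", ".join(parts)
```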
Distance Measurement Mode:
- Takes initial distance reading when activated
- Continuously monitors distance every 1 second
- Warns when distance < 100cm: "Warning! Distance is X.X centimeters"
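For Distance Measurement, the HC-SR04 reports distance as the width of its echo pulse: sound travels to the obstacle and back, so the one-way distance is half of pulse time multiplied by the speed of sound. The sketch below shows that arithmetic and the 100 cm warning rule; the function and constant names are illustrative, and 343 m/s is the approximate speed of sound in air at room temperature.

```python
from typing import Optional

SPEED_OF_SOUND_CM_PER_S = 34300.0  # approx. 343 m/s at about 20 °C
WARNING_THRESHOLD_CM = 100.0       # threshold stated in this README


def echo_to_distance_cm(pulse_seconds: float) -> float:
    """Convert the echo pulse width to a one-way distance in centimeters."""
    return pulse_seconds * SPEED_OF_SOUND_CM_PER_S / 2.0


def distance_warning(distance_cm: float) -> Optional[str]:
    """Return the warning phrase when closer than the threshold, else None."""
    if distance_cm < WARNING_THRESHOLD_CM:
        return f"Warning! Distance is {distance_cm:.1f} centimeters"
    return None
```

A 2 ms echo pulse, for example, corresponds to roughly 34.3 cm and would trigger the warning.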
smart_vision_assistant/
├── smart_glass/                  # Main package
│   ├── __init__.py
│   ├── main.py                   # Main entry point
│   ├── config.py                 # Configuration (optional)
│   │
│   ├── hardware/                 # Hardware interfaces
│   │   ├── __init__.py
│   │   ├── gpio_handler.py       # GPIO button handling
│   │   ├── camera_handler.py     # Camera operations
│   │   └── ultrasonic.py         # Ultrasonic sensor
│   │
│   ├── modes/                    # Mode implementations
│   │   ├── __init__.py
│   │   ├── base_mode.py          # Base class for modes
│   │   ├── time_mode.py
│   │   ├── text_recognition.py
│   │   ├── object_detection.py
│   │   └── distance_measurement.py
│   │
│   ├── audio/                    # Audio handling
│   │   ├── __init__.py
│   │   ├── tts_engine.py         # Text-to-speech
│   │   └── audio_queue.py        # Audio queue management
│   │
│   └── utils/                    # Utilities
│       ├── __init__.py
│       └── logger.py             # Logging utilities
│
├── tests/                        # Test files
│   ├── __init__.py
│   ├── test_buttons.py
│   ├── test_camera.py
│   └── test_display.py
│
├── scripts/                      # Utility scripts
│   └── install.sh
│
├── docs/                         # Documentation
│   └── wiring_diagram.md
│
├── .gitignore
├── LICENSE
├── README.md                     # This file
├── requirements.txt
├── setup.py                      # Package installation
└── run.py                        # Entry point script
You can customize the system by creating a config.py file in the root directory:
# GPIO Pin Configuration
MODE_BUTTON_PIN = 36
CONFIRM_BUTTON_PIN = 38
EXIT_BUTTON_PIN = 40
TRIG_PIN = 7
ECHO_PIN = 11
# Button Configuration
BUTTON_DEBOUNCE_TIME = 0.5 # seconds
See config.py (if it exists) for more configuration options.
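A debounce time like `BUTTON_DEBOUNCE_TIME` is typically applied by ignoring any press that arrives too soon after the last accepted one. The class below is a minimal sketch of that idea, not the project's implementation; `Debouncer` and its injectable `clock` parameter are assumptions made so the logic can be exercised without hardware.

```python
import time
from typing import Callable, Optional


class Debouncer:
    """Accept a press only if enough time has passed since the last one."""

    def __init__(self, interval: float = 0.5,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._interval = interval            # e.g. BUTTON_DEBOUNCE_TIME
        self._clock = clock                  # injectable for testing
        self._last_accepted: Optional[float] = None

    def accept(self) -> bool:
        """Return True if this press should be handled, False if ignored."""
        now = self._clock()
        if (self._last_accepted is not None
                and now - self._last_accepted < self._interval):
            return False
        self._last_accepted = now
        return True
```

A button callback would call `accept()` first and return early when it yields `False`.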
- No camera detected: Check camera connections and enable in raspi-config
- Camera access denied: Run with sudo or add user to video group: sudo usermod -a -G video $USER
- OpenCV errors: Install libcamera-tools: sudo apt install libcamera-tools
- No audio output: Check audio output configuration: sudo raspi-config → Advanced Options → Audio
- TTS not working: Ensure internet connection for gTTS (or use offline TTS)
- Permission denied: Run with sudo or add user to gpio group: sudo usermod -a -G gpio $USER
- Button not responding: Verify button connections and pull-down resistors
- YOLO slow: Consider using TensorFlow Lite or smaller YOLO model
- High CPU usage: Reduce update intervals in mode configurations
Run individual test scripts:
python3 tests/test_buttons.py
python3 tests/test_camera.py
python3 tests/test_display.py
To add a new mode:
- Create a new file in smart_glass/modes/
- Inherit from BaseMode
- Implement the _run() method
- Add to smart_glass/modes/__init__.py
- Register in smart_glass/main.py
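The steps above can be sketched as follows. The real `BaseMode` in `smart_glass/modes/base_mode.py` is not shown in this README, so the stand-in class here is an assumption about its shape (a background thread that calls `_run()` until stopped); `HeartbeatMode` and its `speak` parameter are purely illustrative.

```python
import threading
from typing import Callable, Optional


class BaseMode:
    """Stand-in for the project's BaseMode; the real interface may differ."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._stop_event = threading.Event()
        self._thread: Optional[threading.Thread] = None

    def start(self) -> None:
        """Run _run() on a background thread."""
        self._stop_event.clear()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def stop(self) -> None:
        """Signal the loop to exit and wait briefly for the thread."""
        self._stop_event.set()
        if self._thread is not None:
            self._thread.join(timeout=2.0)

    def _run(self) -> None:
        raise NotImplementedError


class HeartbeatMode(BaseMode):
    """Example mode that periodically announces it is still running."""

    def __init__(self, interval: float = 5.0,
                 speak: Callable[[str], None] = print) -> None:
        super().__init__("Heartbeat")
        self._interval = interval
        self._speak = speak  # would be the TTS engine in the real system

    def _run(self) -> None:
        while not self._stop_event.is_set():
            self._speak("Still running")
            # wait() returns early when stop() is called, so exit is prompt
            self._stop_event.wait(self._interval)
```

Waiting on the stop event instead of sleeping keeps the Exit button responsive even when the update interval is long.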
- Follow PEP 8 style guide
- Use type hints where appropriate
- Add docstrings to all classes and methods
- Keep functions focused and modular
See LICENSE file for details.
Yousef Samm
- GitHub: @YousefSamm
- YOLOv8 by Ultralytics
- Tesseract OCR
- Raspberry Pi Foundation
- OpenCV community
For issues and questions, please open an issue on GitHub.
Note: This project requires Raspberry Pi 4B with proper hardware setup. Ensure all connections are secure before running the system.