Live Video Input → Object Detection (TFLite) → Audio Feedback Output
- `main.py` - Main orchestration script that integrates all components
- `detect.py` - Core object detection logic using TensorFlow Lite
- `model_loader.py` - Model loading and initialization utilities
- `audio.py` - Text-to-speech audio feedback system
- `og.py` - Original TensorFlow SavedModel implementation
- `convert.py` - Model conversion from SavedModel to TensorFlow Lite
- `download.py` - Model acquisition from TensorFlow Hub
- `model.tflite` - Optimized TensorFlow Lite model for inference
- `model.h` - C header file for ESP32 deployment
- `label.txt` - Object class label mappings
- `labels_mobilenet_quant_v1_224.txt` - MobileNet label mappings
- Real-time Object Detection: Uses MobileNet SSD v2 for efficient inference
- TensorFlow Lite Optimization: Model optimized for embedded systems
- Audio Feedback: Text-to-speech conversion of detection results
- Accessibility Focus: Designed for visually impaired users
- ESP32 Ready: Code prepared for microcontroller deployment
Clone the repository:
```bash
git clone https://github.com/RohitAnish1/quad_squad.git
cd quad_squad
```
Install dependencies:
```bash
pip install -r requirements.txt
```
Required packages include:
- TensorFlow/TensorFlow Lite
- OpenCV
- NumPy
- pyttsx3 (for text-to-speech)
- matplotlib (for visualization)
Run the main pipeline:
```bash
python main.py
```
Test with individual components:
```bash
# Original TensorFlow model
python og.py

# TensorFlow Lite optimized version
python detect.py

# Audio feedback only
python audio.py
```
Convert SavedModel to TensorFlow Lite:
```bash
python convert.py
```

The project includes sample images in the `images/` directory for testing:
- `car.jpg`, `car1.jpg` - Vehicle detection
- `sample.jpg`, `sample1.jpg`, `sample2.jpg`, `sample3.jpg` - Various objects
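The conversion performed by `convert.py` can be sketched as below. This is an illustration only: it builds a tiny stand-in Keras model so the snippet is self-contained, whereas the real script converts the downloaded detection SavedModel (via `tf.lite.TFLiteConverter.from_saved_model`).

```python
import tensorflow as tf

# Stand-in model so this sketch runs on its own; the real project
# converts the downloaded detection SavedModel instead, e.g.
# tf.lite.TFLiteConverter.from_saved_model(model_dir).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(4),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
# Optimize.DEFAULT enables dynamic-range quantization, shrinking the
# model for embedded targets such as the ESP32.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```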
Update paths in `main.py` and the other scripts:

```python
model_dir = r"path/to/your/model"
label_path = r"path/to/your/labels.txt"
image_path = "path/to/test/image.jpg"
```

Adjust detection sensitivity in `main.py`:

```python
confidence_threshold = 0.5  # Adjust as needed
```

`model_loader.py`:
- Discovers and loads TensorFlow Lite models
- Handles model initialization and tensor allocation
- Provides error handling for model files
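A minimal sketch of this loading step, using the standard `tf.lite.Interpreter` API; `load_tflite_model` is an illustrative helper name, not necessarily what `model_loader.py` defines.

```python
import tensorflow as tf

def load_tflite_model(model_path: str) -> tf.lite.Interpreter:
    """Load a .tflite file and allocate its tensors, with basic error handling."""
    try:
        interpreter = tf.lite.Interpreter(model_path=model_path)
    except (ValueError, OSError) as exc:
        raise RuntimeError(f"Could not load TFLite model at {model_path}") from exc
    interpreter.allocate_tensors()
    return interpreter

# Example: inspect the expected input shape before feeding frames.
# interpreter = load_tflite_model("model.tflite")
# print(interpreter.get_input_details()[0]["shape"])
```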
`detect.py`:
- Core inference logic using the TensorFlow Lite interpreter
- Image preprocessing and postprocessing
- Bounding box and confidence score extraction
- Visualization with OpenCV
`audio.py`:
- Integrates pyttsx3 for text-to-speech conversion
- Processes detection results into natural language
- Provides audio output for accessibility
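A sketch of how detections might be turned into speech with pyttsx3; `describe` and `speak_detections` are illustrative names, and the phrasing mirrors the example output given later in this README.

```python
def describe(label: str, confidence: float) -> str:
    """Turn one detection into a natural-language sentence."""
    return f"{label.capitalize()} detected with {confidence:.0%} confidence"

def speak_detections(detections) -> None:
    """Speak each (label, confidence) pair aloud."""
    import pyttsx3  # imported here so describe() works without audio deps
    engine = pyttsx3.init()
    for label, confidence in detections:
        engine.say(describe(label, confidence))
    engine.runAndWait()  # blocks until all queued phrases are spoken
```

For example, `speak_detections([("car", 0.85)])` would say "Car detected with 85% confidence".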
`main.py`:
- Orchestrates the complete workflow
- Integrates all components seamlessly
- Handles high-level logic and error management
XIAO ESP32S3 Microcontroller
- Dual-core Xtensa LX7 processor
- Limited memory and processing power
- Optimized TensorFlow Lite model for deployment
- Real-time inference capabilities
- Base Model: MobileNet SSD v2
- Input Size: 224x224 pixels
- Output: Bounding boxes, class predictions, confidence scores
- Optimization: TensorFlow Lite with DEFAULT optimizations
- Model size optimized for embedded systems
- Quantization applied for faster inference
- Memory-efficient preprocessing pipeline
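For a quantized model, input pixels must be mapped through the input tensor's scale and zero point. Below is a sketch assuming the standard TFLite Python layout, where each entry of `get_input_details()` carries a `(scale, zero_point)` pair in its `quantization` field; the helper name is illustrative.

```python
import numpy as np

def quantize_input(image_01: np.ndarray, input_detail: dict) -> np.ndarray:
    """Map a float image in [0, 1] onto the model's quantized input dtype."""
    scale, zero_point = input_detail["quantization"]
    if scale == 0.0:
        # A zero scale means the model takes float input directly.
        return image_01.astype(np.float32)
    q = np.round(image_01 / scale + zero_point)
    info = np.iinfo(input_detail["dtype"])
    return np.clip(q, info.min, info.max).astype(input_detail["dtype"])
```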
Implemented:
- Object detection pipeline with TensorFlow Lite
- Audio feedback system
- Model conversion and optimization
- Core inference logic

Planned:
- Complete ESP32 deployment
- Live camera integration
- Real-time video processing
- Hardware optimization
The system can detect various objects including:
- Vehicles (cars, trucks, motorcycles)
- People
- Animals
- Common objects (bottles, chairs, etc.)
Audio feedback provides descriptions like:
- "Car detected with 85% confidence"
- "Person detected with 92% confidence"
- Fork the repository
- Create a feature branch
- Make your changes
- Submit a pull request
Quad Squad Team
- Focus on accessibility through computer vision
- Embedded systems and AI integration
- Real-time object detection solutions
For questions or collaboration opportunities, please reach out through the GitHub repository.
This project demonstrates the integration of computer vision, embedded systems, and accessibility technologies for real-world applications.

