Live Video Input → Object Detection (TFLite) → Audio Feedback Output
- `main.py` - Main orchestration script that integrates all components
- `detect.py` - Core object detection logic using TensorFlow Lite
- `model_loader.py` - Model loading and initialization utilities
- `audio.py` - Text-to-speech audio feedback system
- `og.py` - Original TensorFlow SavedModel implementation
- `convert.py` - Model conversion from SavedModel to TensorFlow Lite
- `download.py` - Model acquisition from TensorFlow Hub
- `model.tflite` - Optimized TensorFlow Lite model for inference
- `model.h` - C header file for ESP32 deployment
- `label.txt` - Object class label mappings
- `labels_mobilenet_quant_v1_224.txt` - MobileNet label mappings
- Real-time Object Detection: Uses MobileNet SSD v2 for efficient inference
- TensorFlow Lite Optimization: Model optimized for embedded systems
- Audio Feedback: Text-to-speech conversion of detection results
- Accessibility Focus: Designed for visually impaired users
- ESP32 Ready: Code prepared for microcontroller deployment
Clone the repository:
```bash
git clone https://github.com/RohitAnish1/quad_squad.git
cd quad_squad
```
Install dependencies:
```bash
pip install -r requirements.txt
```
Required packages include:
- TensorFlow/TensorFlow Lite
- OpenCV
- NumPy
- pyttsx3 (for text-to-speech)
- matplotlib (for visualization)
Run the main pipeline:
```bash
python main.py
```
Test with individual components:
```bash
# Original TensorFlow model
python og.py

# TensorFlow Lite optimized version
python detect.py

# Audio feedback only
python audio.py
```
Convert SavedModel to TensorFlow Lite:
```bash
python convert.py
```

The project includes sample images in the `images/` directory for testing:
- `car.jpg`, `car1.jpg` - Vehicle detection
- `sample.jpg`, `sample1.jpg`, `sample2.jpg`, `sample3.jpg` - Various objects
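The conversion performed by `convert.py` can be sketched as below. This is an illustration only: it builds a tiny stand-in Keras model so the snippet is self-contained, whereas the real script converts the downloaded detection SavedModel (via `tf.lite.TFLiteConverter.from_saved_model`).

```python
import tensorflow as tf

# Stand-in model so this sketch runs on its own; the real project
# converts the downloaded detection SavedModel instead, e.g.
# tf.lite.TFLiteConverter.from_saved_model(model_dir).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(4),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
# Optimize.DEFAULT enables dynamic-range quantization, shrinking the
# model for embedded targets such as the ESP32.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```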
Update paths in `main.py` and the other scripts:

```python
model_dir = r"path/to/your/model"
label_path = r"path/to/your/labels.txt"
image_path = "path/to/test/image.jpg"
```

Adjust detection sensitivity in `main.py`:

```python
confidence_threshold = 0.5  # Adjust as needed
```

`model_loader.py`:
- Discovers and loads TensorFlow Lite models
- Handles model initialization and tensor allocation
- Provides error handling for model files
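A minimal sketch of this loading step, using the standard `tf.lite.Interpreter` API; `load_tflite_model` is an illustrative helper name, not necessarily what `model_loader.py` defines.

```python
import tensorflow as tf

def load_tflite_model(model_path: str) -> tf.lite.Interpreter:
    """Load a .tflite file and allocate its tensors, with basic error handling."""
    try:
        interpreter = tf.lite.Interpreter(model_path=model_path)
    except (ValueError, OSError) as exc:
        raise RuntimeError(f"Could not load TFLite model at {model_path}") from exc
    interpreter.allocate_tensors()
    return interpreter

# Example: inspect the expected input shape before feeding frames.
# interpreter = load_tflite_model("model.tflite")
# print(interpreter.get_input_details()[0]["shape"])
```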
`detect.py`:
- Core inference logic using the TensorFlow Lite interpreter
- Image preprocessing and postprocessing
- Bounding box and confidence score extraction
- Visualization with OpenCV
`audio.py`:
- Integrates pyttsx3 for text-to-speech conversion
- Processes detection results into natural language
- Provides audio output for accessibility
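A sketch of how detections might be turned into speech with pyttsx3; `describe` and `speak_detections` are illustrative names, and the phrasing mirrors the example output given later in this README.

```python
def describe(label: str, confidence: float) -> str:
    """Turn one detection into a natural-language sentence."""
    return f"{label.capitalize()} detected with {confidence:.0%} confidence"

def speak_detections(detections) -> None:
    """Speak each (label, confidence) pair aloud."""
    import pyttsx3  # imported here so describe() works without audio deps
    engine = pyttsx3.init()
    for label, confidence in detections:
        engine.say(describe(label, confidence))
    engine.runAndWait()  # blocks until all queued phrases are spoken
```

For example, `speak_detections([("car", 0.85)])` would say "Car detected with 85% confidence".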
`main.py`:
- Orchestrates the complete workflow
- Integrates all components seamlessly
- Handles high-level logic and error management
XIAO ESP32S3 Microcontroller
- Dual-core Xtensa LX7 processor
- Limited memory and processing power
- Optimized TensorFlow Lite model for deployment
- Real-time inference capabilities
- Base Model: MobileNet SSD v2
- Input Size: 224x224 pixels
- Output: Bounding boxes, class predictions, confidence scores
- Optimization: TensorFlow Lite with DEFAULT optimizations
- Model size optimized for embedded systems
- Quantization applied for faster inference
- Memory-efficient preprocessing pipeline
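For a quantized model, input pixels must be mapped through the input tensor's scale and zero point. Below is a sketch assuming the standard TFLite Python layout, where each entry of `get_input_details()` carries a `(scale, zero_point)` pair in its `quantization` field; the helper name is illustrative.

```python
import numpy as np

def quantize_input(image_01: np.ndarray, input_detail: dict) -> np.ndarray:
    """Map a float image in [0, 1] onto the model's quantized input dtype."""
    scale, zero_point = input_detail["quantization"]
    if scale == 0.0:
        # A zero scale means the model takes float input directly.
        return image_01.astype(np.float32)
    q = np.round(image_01 / scale + zero_point)
    info = np.iinfo(input_detail["dtype"])
    return np.clip(q, info.min, info.max).astype(input_detail["dtype"])
```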
Implemented:
- Object detection pipeline with TensorFlow Lite
- Audio feedback system
- Model conversion and optimization
- Core inference logic

Planned:
- Complete ESP32 deployment
- Live camera integration
- Real-time video processing
- Hardware optimization
The system can detect various objects including:
- Vehicles (cars, trucks, motorcycles)
- People
- Animals
- Common objects (bottles, chairs, etc.)
Audio feedback provides descriptions like:
- "Car detected with 85% confidence"
- "Person detected with 92% confidence"
- Fork the repository
- Create a feature branch
- Make your changes
- Submit a pull request
Quad Squad Team
- Focus on accessibility through computer vision
- Embedded systems and AI integration
- Real-time object detection solutions
For questions or collaboration opportunities, please reach out through the GitHub repository.
This project demonstrates the integration of computer vision, embedded systems, and accessibility technologies for real-world applications.

