LipReaderAI presents a novel approach to assistive technology, designed to overcome communication barriers for individuals with hearing impairments. Our model combines spatiotemporal convolutions with recurrent neural network architectures to accurately transcribe spoken language from visual input across multiple speakers. To eliminate the need for manual segmentation, we employ Connectionist Temporal Classification (CTC) loss, which enables direct, end-to-end training from unprocessed video input to textual transcription. This project tests the robustness of the model by evaluating its accuracy on a custom dataset of recordings of the author, ensuring it can reliably translate visual speech. LipReaderAI aims not only to improve accessibility for those with hearing impairments but also to extend lipreading to noisy environments where traditional, audio-only speech recognition systems are ineffective.
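Before the setup steps, a brief illustration may help: the sketch below shows what a spatiotemporal-CNN + RNN + CTC pipeline of this kind can look like in Keras. It is not the repository's actual model; the input dimensions, layer sizes, and vocabulary size are illustrative assumptions.

```python
# A minimal sketch of a spatiotemporal-CNN + bidirectional-RNN + CTC
# lip-reading model. All shapes, layer sizes, and the vocabulary size
# are illustrative assumptions, not the repository's configuration.
import tensorflow as tf
from tensorflow.keras import layers, models

VOCAB_SIZE = 40                    # assumed: characters plus the CTC blank
FRAMES, H, W, C = 75, 46, 140, 1   # assumed clip length and mouth-crop size

def build_model():
    inputs = layers.Input(shape=(FRAMES, H, W, C))
    # 3D convolutions extract spatiotemporal features from the frame stack
    x = layers.Conv3D(32, 3, padding="same", activation="relu")(inputs)
    x = layers.MaxPool3D(pool_size=(1, 2, 2))(x)
    x = layers.Conv3D(64, 3, padding="same", activation="relu")(x)
    x = layers.MaxPool3D(pool_size=(1, 2, 2))(x)
    # Flatten spatial dimensions so each frame becomes one feature vector
    x = layers.TimeDistributed(layers.Flatten())(x)
    # A bidirectional LSTM models temporal dynamics across the sequence
    x = layers.Bidirectional(layers.LSTM(128, return_sequences=True))(x)
    # Per-frame distribution over the vocabulary, decoded with CTC
    outputs = layers.Dense(VOCAB_SIZE, activation="softmax")(x)
    return models.Model(inputs, outputs)

def ctc_loss(y_true, y_pred):
    # CTC aligns frame-level predictions to the (shorter) transcript,
    # removing the need to segment the video by hand.
    batch = tf.cast(tf.shape(y_pred)[0], "int64")
    input_len = tf.cast(tf.shape(y_pred)[1], "int64") * tf.ones((batch, 1), "int64")
    label_len = tf.cast(tf.shape(y_true)[1], "int64") * tf.ones((batch, 1), "int64")
    return tf.keras.backend.ctc_batch_cost(y_true, y_pred, input_len, label_len)

model = build_model()
model.compile(optimizer="adam", loss=ctc_loss)
```

Because CTC emits one distribution per frame and learns the alignment to the transcript on its own, the network can be trained end-to-end from raw video without frame-level labels.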
This guide will walk you through setting up the Python environment required to run the projects in the Lip_reading repository.
Ensure you have Python installed on your system. You can download Python from python.org.
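If you are unsure which version is installed, you can check from a terminal:

```bash
python --version   # on some systems the interpreter is invoked as python3
```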
First, clone the Lip_reading repository to your local machine:

```bash
git clone https://github.com/33ron33/Lip_reading.git
```

Create a virtual environment named 'LiveLipNet-env' within the repository directory:

```bash
cd Lip_reading
python -m venv LiveLipNet-env
```

Activate the virtual environment:
- On macOS/Linux:

  ```bash
  source LiveLipNet-env/bin/activate
  ```

- On Windows:

  ```bash
  LiveLipNet-env\Scripts\activate
  ```

Install the project dependencies from the 'requirements.txt' file:
```bash
pip install -r requirements.txt
```

Verify the installed dependencies:

```bash
pip list
```

To set up the IPython kernel for the virtual environment, run the following command:

```bash
python -m ipykernel install --name=LiveLipNet-env
```

When you are finished working in the environment, deactivate it:

```bash
deactivate
```
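To confirm the kernel was registered, you can list the installed kernelspecs from any shell where Jupyter is available (this assumes Jupyter is installed; it is not one of this guide's steps):

```bash
jupyter kernelspec list   # LiveLipNet-env should appear in the output
```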
