a digital instrument you control with only your mouth (and midi notes)

deancureton/virtual-talkbox

virtual talkbox

a virtual talkbox that uses computer vision to track your mouth movements and shapes sound with formant filters. basically: you play notes (midi or qwerty keyboard) and your mouth controls how they sound!

what is this

with a talkbox, you play a synth through a tube into your mouth and shape the sound with vowel movements. this does the same thing but with a webcam instead of a tube. it uses mediapipe to track your mouth, calculates approximate formant frequencies (F1, F2, F3) based on your mouth shape, and filters a sawtooth wave in real time. the result sounds like you're "singing" (kinda) the notes you play!
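
to make the "filter a sawtooth wave with formants" idea concrete, here is a minimal offline sketch in plain python. it is not the app's code (the app uses pyo for real-time audio): the function names, the 44100 Hz sample rate, the Q value, and the choice of a textbook second-order band-pass filter are all assumptions made for illustration. the example formants are just values picked from inside the ranges listed under "how it works".

```python
import math

SAMPLE_RATE = 44100  # assumed sample rate, not taken from the project


def sawtooth(freq_hz, num_samples, sample_rate=SAMPLE_RATE):
    """Naive (non-band-limited) sawtooth with values in [-1, 1)."""
    return [2.0 * ((n * freq_hz / sample_rate) % 1.0) - 1.0
            for n in range(num_samples)]


def bandpass(samples, center_hz, q, sample_rate=SAMPLE_RATE):
    """Second-order band-pass filter (audio EQ cookbook form, unity peak gain)."""
    w0 = 2.0 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    b0, b2 = alpha, -alpha  # the b1 coefficient is zero for this filter
    a0, a1, a2 = 1.0 + alpha, -2.0 * math.cos(w0), 1.0 - alpha
    x1 = x2 = y1 = y2 = 0.0  # previous inputs and outputs
    out = []
    for x in samples:
        y = (b0 * x + b2 * x2 - a1 * y1 - a2 * y2) / a0
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def formant_voice(samples, formants_hz, q=10.0):
    """Run one band-pass filter per formant in parallel and average them."""
    bands = [bandpass(samples, f, q) for f in formants_hz]
    return [sum(vals) / len(bands) for vals in zip(*bands)]


# a tenth of a second of a 110 Hz sawtooth shaped by three example formants
voice = formant_voice(sawtooth(110.0, 4410), (730.0, 1090.0, 2440.0))
```

moving the three center frequencies while the note keeps playing is what makes the tone slide between vowel-like sounds.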

setup

macOS

# install system dependencies (required for pyo audio library)
brew install portaudio portmidi liblo libsndfile

# create and activate virtual environment
python3 -m venv venv
source venv/bin/activate

# install python dependencies
pip install -r requirements.txt

# install pyo from github (the PyPI version has build issues)
# $(brew --prefix) is /opt/homebrew on Apple Silicon and /usr/local on Intel Macs
C_INCLUDE_PATH="$(brew --prefix)/include" LIBRARY_PATH="$(brew --prefix)/lib" pip install git+https://github.com/belangeo/pyo.git

# run it
python main.py

Linux

# install system dependencies (debian/ubuntu)
sudo apt-get install portaudio19-dev libportmidi-dev liblo-dev libsndfile1-dev

# create and activate virtual environment
python3 -m venv venv
source venv/bin/activate

# install python dependencies
pip install -r requirements.txt

# install pyo
pip install pyo

# run it
python main.py

Windows

# create and activate virtual environment
python -m venv venv
venv\Scripts\activate

# install python dependencies
pip install -r requirements.txt

# install pyo (pre-built wheels available on Windows)
pip install pyo

# run it
python main.py

usage

when you start it up, you'll get a menu to choose your input:

  • midi keyboard - if you have one plugged in
  • computer keyboard - if you don't

the app will remember your choice for next time.

keyboard layout

if you're using qwerty keyboard mode, it's set up like a piano:

black keys:  w e   t y u   o p
white keys: a s d f g h j k l ; '

extra controls:

  • z/x - change octave
  • c - toggle vibrato
  • arrow up/down - pitch bend
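
the layout above can be sketched as a lookup from key to midi note. this is only an illustration, not the app's code: `QwertyPiano` and `midi_to_hz` are hypothetical names, and it assumes that `a` starts at middle C (midi note 60), that `z` shifts down and `x` shifts up, and that the shift is limited to three octaves either way. none of that is stated in this readme.

```python
# keys in chromatic order, reading the two rows above like a piano
KEY_ORDER = "awsedftgyhujkolp;'"
SEMITONE_OFFSETS = {key: i for i, key in enumerate(KEY_ORDER)}


class QwertyPiano:
    """Hypothetical helper that turns key presses into midi note numbers."""

    def __init__(self, base_note=60):  # assumption: 'a' is middle C
        self.base_note = base_note
        self.octave_shift = 0

    def handle_key(self, key):
        """Return a midi note for a note key, or None for any other key."""
        if key == "z":  # assumption: z moves down an octave
            self.octave_shift = max(self.octave_shift - 1, -3)
            return None
        if key == "x":  # assumption: x moves up an octave
            self.octave_shift = min(self.octave_shift + 1, 3)
            return None
        offset = SEMITONE_OFFSETS.get(key)
        if offset is None:
            return None
        return self.base_note + 12 * self.octave_shift + offset


def midi_to_hz(note):
    """Standard equal-temperament conversion (A4 = midi 69 = 440 Hz)."""
    return 440.0 * 2 ** ((note - 69) / 12)


piano = QwertyPiano()
note = piano.handle_key("a")  # first white key
freq = midi_to_hz(note)       # frequency to feed the oscillator
```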

playing

  1. press a key to play a note
  2. move your mouth while the note is playing
  3. experiment with different vowel shapes:
    • "ah" = open mouth
    • "ee" = wide smile
    • "oo" = rounded lips

sound only plays when a note is pressed AND your face is detected
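
that rule is easy to picture as a small gate shared between the input and video threads. the sketch below is an assumption about how it could be done, not the app's actual code: `SoundGate` and its methods are hypothetical names, and the lock is there only because the readme says video, audio, and input run on separate threads.

```python
import threading


class SoundGate:
    """Hypothetical gate: open only while a note is held AND a face is seen."""

    def __init__(self):
        self._lock = threading.Lock()
        self._held_notes = set()
        self._face_detected = False

    def note_on(self, note):
        with self._lock:
            self._held_notes.add(note)

    def note_off(self, note):
        with self._lock:
            self._held_notes.discard(note)

    def set_face_detected(self, detected):
        with self._lock:
            self._face_detected = bool(detected)

    def is_open(self):
        with self._lock:
            return bool(self._held_notes) and self._face_detected


gate = SoundGate()
gate.note_on(60)              # called from the input thread
gate.set_face_detected(True)  # called from the video thread
audible = gate.is_open()      # checked from the audio side
```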

how it works

  • face tracking: mediapipe facemesh for mouth landmark detection
  • formant mapping:
    • F1 (270-730 Hz) controlled by jaw opening
    • F2 (870-2290 Hz) controlled by lip width
    • F3 (1650-3000 Hz) combination of both
  • audio: pyo synthesizer with supersaw oscillator + formant bandpass filters
  • threading: separate threads for video, audio, and input
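
here is a rough sketch of the formant mapping step, using the ranges listed above. it assumes the mouth measurements have already been normalized to 0..1 (0 = closed / narrow, 1 = fully open / wide), that each formant is a straight linear interpolation, and that F3 uses an equal average of the two inputs. the readme only says F3 is a "combination of both", so the exact weighting in the app may differ, and `mouth_to_formants` is a hypothetical name.

```python
F1_RANGE = (270.0, 730.0)    # Hz, driven by jaw opening
F2_RANGE = (870.0, 2290.0)   # Hz, driven by lip width
F3_RANGE = (1650.0, 3000.0)  # Hz, driven by both


def clamp01(value):
    """Keep a normalized measurement inside 0..1."""
    return max(0.0, min(1.0, value))


def lerp(lo, hi, t):
    """Linear interpolation between lo and hi for t in 0..1."""
    return lo + (hi - lo) * t


def mouth_to_formants(jaw_open, lip_width):
    """Map normalized mouth measurements to (F1, F2, F3) in Hz."""
    jaw = clamp01(jaw_open)
    lips = clamp01(lip_width)
    f1 = lerp(*F1_RANGE, jaw)
    f2 = lerp(*F2_RANGE, lips)
    f3 = lerp(*F3_RANGE, 0.5 * (jaw + lips))  # assumption: equal weighting
    return f1, f2, f3


# half-open jaw, half-wide lips
f1, f2, f3 = mouth_to_formants(0.5, 0.5)
```

clamping the inputs keeps the filters inside the configured ranges even if a tracking glitch produces an out-of-range measurement.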

config

the app creates a config.json file where you can tweak:

  • audio buffer size (lower = less latency, higher = more stability)
  • camera device id
  • formant frequency ranges
  • debug display settings
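
the readme doesn't show the actual layout of config.json, so the sketch below is purely illustrative: every key name and default value here is made up to match the list above, and `load_config` is a hypothetical helper. check the generated file for the real names. it shows one common pattern: write defaults if the file is missing, otherwise overlay the user's values on top of the defaults.

```python
import copy
import json
from pathlib import Path

# hypothetical keys and placeholder values, not the app's real schema
DEFAULT_CONFIG = {
    "audio_buffer_size": 256,  # lower = less latency, higher = more stability
    "camera_device_id": 0,
    "formant_ranges": {"f1": [270, 730], "f2": [870, 2290], "f3": [1650, 3000]},
    "show_debug_display": True,
}


def load_config(path="config.json"):
    """Load settings, creating the file with defaults if it doesn't exist."""
    path = Path(path)
    config = copy.deepcopy(DEFAULT_CONFIG)
    if path.exists():
        # shallow overlay: a top-level key in the file replaces the default
        config.update(json.loads(path.read_text()))
    else:
        path.write_text(json.dumps(config, indent=2))
    return config
```

with this pattern, a file containing only `{"camera_device_id": 1}` would switch cameras and leave every other setting at its default.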

credits

uses mediapipe, pyo, opencv, mido, and pynput
