RSANS Lyric Alignment & Visualization Engine

Overview

RSANS is a headless, offline-first CLI tool that aligns audio with provided lyrics and generates time-synchronized lyric visualization videos. It uses offline speech recognition to extract word-level timing, maps that timing onto user-supplied lyrics, applies optional rhyme annotations, and renders karaoke-style highlights via FFmpeg. The system is file-driven and deterministic, designed for automation now and UI integration later.

Features

Offline audio processing (no cloud dependency)
Plain-text lyrics input
Word-level timing via offline speech recognition
Deterministic alignment of audio timing to provided lyrics
Optional rhyme-group annotations
Karaoke-style word highlighting and color grouping
Video export with burned-in visuals using FFmpeg
File-driven CLI workflow with reproducible outputs
JSON project state for caching and future UI integration

Installation

Build:

cmake -S . -B build
cmake --build build

Install rsans to ~/.local/bin (make sure this is on your PATH):

cmake --install build --prefix ~/.local

How to Use

TBA

Output Format

RSANS produces a rendered video file with time-synchronized lyric visualization burned into the frames.

Primary outputs:

MP4 video containing the original audio and animated lyric overlays (karaoke-style word highlighting and optional rhyme-based color grouping)
Intermediate artifacts (optional, kept in the project directory):
JSON project state with tokenized lyrics, word timings, and rhyme group assignments
ASS subtitle file used internally for precise timing and animation
Logs describing alignment results and unmatched words

Testing

All tests are built as part of the normal build

Run the full test suite:

cd build
ctest --output-on-failure

(Optional) Run the test binary directly:

cd build/tests
./test_rsans

Limitations

Requires reasonably clean, clearly spoken or sung audio
Lyrics must closely match the audio; large deviations reduce alignment quality
Word-level timing accuracy depends on the speech recognition model
Does not handle overlapping vocals or polyphonic speech
Rhyme detection is user-defined; no automatic rhyme inference
No real-time playback or editing (offline rendering only)
Desktop UI not included (CLI-only)

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
.github/workflows		.github/workflows
src		src
tests		tests
.gitignore		.gitignore
CMakeLists.txt		CMakeLists.txt
readme.md		readme.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

RSANS Lyric Alignment & Visualization Engine

Overview

Features

Installation

How to Use

Output Format

Testing

Limitations

About

Uh oh!

Releases

Languages

dariustb/rsans-engine

Folders and files

Latest commit

History

Repository files navigation

RSANS Lyric Alignment & Visualization Engine

Overview

Features

Installation

How to Use

Output Format

Testing

Limitations

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Languages