feat: MMS-300M Integration & Acoustemes Documentation by joaocarvoli · Pull Request #2 · shemaobt/model-training

joaocarvoli · 2026-02-06T21:04:05Z

Summary

This PR integrates the Meta MMS-300M model into the training pipeline and provides comprehensive documentation on acousteme generation and semantic mapping.

Key Changes

MMS-300M Integration: Updated phase1_acoustic and phase2_bpe to support model switching.
Documentation: Added docs/ACOUSTEMES_GENERATION.md in RFC format, detailing RLE logic and future semantic mapping workflows.
Bug Fix: Improved the resume logic in phase3_vocoder_v2.py so that it handles missing files gracefully.
Utility: Added src/utils/list_checkpoints.py to list checkpoints on the Modal volume.
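
The RLE logic itself lives in docs/ACOUSTEMES_GENERATION.md and is not reproduced in this description. As a rough illustration only, run-length encoding over a sequence of frame-level acoustic unit IDs could look like the sketch below. The function names `run_length_encode` and `collapse_repeats` are hypothetical and are not taken from the PR; the integer unit IDs are made-up sample data.

```python
from itertools import groupby
from typing import List, Tuple


def run_length_encode(units: List[int]) -> List[Tuple[int, int]]:
    """Return (unit_id, run_length) pairs for consecutive repeated units.

    Hypothetical helper: the PR's actual RLE implementation may differ.
    """
    return [(unit, sum(1 for _ in run)) for unit, run in groupby(units)]


def collapse_repeats(units: List[int]) -> List[int]:
    """Keep one unit per run, dropping the duration information."""
    return [unit for unit, _ in groupby(units)]


if __name__ == "__main__":
    frame_units = [5, 5, 5, 2, 2, 7]  # made-up frame-level unit IDs
    print(run_length_encode(frame_units))
    print(collapse_repeats(frame_units))
```

Keeping the run lengths preserves duration information, while collapsing repeats yields a shorter sequence; which variant the pipeline feeds into BPE training is not stated here.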

Context

Addresses the need for better acoustic tokenization and lays the groundwork for the 'Acoustic Meaning Map'.


joaocarvoli added 4 commits February 6, 2026 18:03

docs: add RFC-style documentation for acoustemes generation and semantic mapping

daa2b19

feat: add utility to list checkpoints on Modal volume

66ed0e6

fix(vocoder): improve resume logic to handle missing files gracefully

1adba15

feat: integrate MMS-300M model for acoustic tokenization and BPE training

57db901

joaocarvoli wants to merge 4 commits into main from obtlab/obt-23-switch-model-xlsr-53-to-mms
