A tongue-in-cheek “AI” playlist generator.
In reality it’s TF-IDF text matching + Gaussian BPM scoring + hand-rolled heuristics.
No model training, no cloud, no adaptation. Just reproducible math that often makes good starting playlists.
⚠️ Status: experimental, imperfect, and intentionally modest.
Think of this as a nerdy shuffler with opinions — not Spotify’s brain.
- Scans your music library (FLAC/MP3/OPUS/etc.).
- Scores tracks against your prompt + a preset:
- study (~85 BPM, vocals discouraged)
- running (~168 BPM, tempo-driven)
- sleep (~60 BPM, super low energy)
- romantic (~88 BPM, warm/jazz/soul)
- dance_party (~124 BPM, club tempo)
- Writes M3U or XSPF playlists with relative paths (Rockbox-friendly).
Optional: estimates BPM with
librosaif your tags don’t have it.
sudo pacman -S python-mutagen python-scikit-learn python-numpy python-librosa python-soundfilelibrosa is optional — enable analysis only if you pass --analyze-tempo.
# Study (M3U)
python ./ai_playlist.py --music-dir "$HOME/Music" --prompt "study: calm, instrumental, ambient" --preset study --format m3u --out "$HOME/Playlists/Study.m3u"
# Running (M3U)
python ./ai_playlist.py --music-dir "$HOME/Music" --prompt "tempo for 8k run" --preset running --format m3u --out "$HOME/Playlists/Run.m3u"
# Sleep (XSPF)
python ./ai_playlist.py --music-dir "$HOME/Music" --prompt "sleep, drones, soft piano" --preset sleep --format xspf --out "$HOME/Playlists/Sleep.xspf"
# Dance party with pairings bias (see below)
python ./ai_playlist.py --music-dir "$HOME/Music" --prompt "party, upbeat" --preset dance_party --format m3u --pairings "$HOME/pairings.json" --out "$HOME/Playlists/Party.m3u"-
Library Scan
- Recursively walks
--music-dir, collecting supported audio files. - Reads tags via
mutagen:title, artist, album, albumartist, genre, comment, lyrics, bpm, duration. - If
--analyze-tempois enabled, estimates BPM from audio withlibrosa.
- Recursively walks
-
Text Modeling (TF-IDF)
- Each track’s tags are concatenated into a “document.”
scikit-learnTfidfVectorizerscores relevance against the--promptplus preset “prompt_enrich” hints.- Falls back to keyword overlap if scikit-learn isn’t installed.
- Scores normalized to [0,1].
-
Tempo Scoring
- Compares BPM (tag or estimated) to a preset Gaussian target (e.g., running 168 ± 18 BPM).
- Produces a [0,1] “tempo fit.”
-
Penalty/Bonus Heuristics
- Penalizes: live, remix/remaster/radio edit, feat., lyrics present, vocal genres (weights vary by preset).
- Keyword bonuses for preset “positives” (e.g., ambient for sleep) and negatives for preset “anti-genres”.
-
Final Score
final = (w_text * text_score) + (w_tempo * tempo_score) - (w_pen * penalty) + bonus- Weights tuned per preset (tempo matters more for running/dance_party).
-
Selection
- Sorted by score, deterministic tie-break with BLAKE2b (
--seed). - Per-artist cap enforced (
--artist-max). - Optional
--pairingsJSON boosts related tracks you like.
- Sorted by score, deterministic tie-break with BLAKE2b (
-
Output
- Writes
.m3u(with#EXTINF) or.xspf(XML). - Paths are relative to the playlist directory unless
--relative-tois given. --dry-runprints top candidates with score/BPM.
- Writes
You can nudge selections toward your known pairings (no secret training data).
Create a file like pairings.json:
{
"rules": [
{
"if": { "artist": "Daft Punk" },
"boost": [ { "artist": "Justice" }, { "title_contains": "Remix" } ]
},
{
"if": { "artist": "Nina Simone" },
"boost": [ { "artist": "Billie Holiday" }, { "artist": "Etta James" } ]
}
]
}Then run with --pairings /path/to/pairings.json.
- Not actual AI: no model training or adaptation.
- Garbage in → garbage out: messy tags = messy results.
- Tempo analysis is slow; prefer BPM tags when possible.
- Heuristics are opinionated; you may need to hand-tune results.
This project is licensed under the GNU GPLv3. See LICENSE for details.