This repository contains tools and scripts for processing and analysing symbolic musical textures using a Variational Autoencoder (VAE) architecture. The work builds upon the model proposed in polyphonic-chord-texture-disentanglement, focusing on the representation and analysis of musical textures derived from annotated MIDI data.
All examples and experiments in this repository are based on the Commu Dataset. This dataset includes:
- A collection of `.mid` files with symbolic musical data.
- A corresponding metadata file (`commu_meta.csv`) containing annotations such as chord progressions.
The metadata CSV must contain a column named `chord_progressions`, where each row is a string-encoded chord sequence. Each entry is indexed by a unique track ID that matches the name of the corresponding MIDI file.
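The expected metadata layout can be sketched as follows. This is a minimal illustrative example: the track IDs are invented, and parsing with `ast.literal_eval` assumes the chord sequences are stored as Python-style list literals, which may differ from the actual encoding.

```python
import ast
import io

import pandas as pd

# Minimal stand-in for commu_meta.csv: one row per track,
# indexed by a track ID that matches the MIDI file name.
csv_text = """id,chord_progressions
commu00001,"[['C', 'G', 'Am', 'F']]"
commu00002,"[['Dm', 'G', 'C', 'C']]"
"""

meta = pd.read_csv(io.StringIO(csv_text), index_col="id")

# Decode the string-encoded chord sequence for one track.
# ast.literal_eval is an assumption about the encoding format.
chords = ast.literal_eval(meta.loc["commu00001", "chord_progressions"])
print(chords[0])  # first progression of the track
```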
All MIDI preprocessing functions are implemented in `utilProcessing.py`.
To convert a folder of `.mid` files and associated metadata into `.npz` files (NumPy archive format), use the script `processMidiPath.py`.
The resulting `.npz` files contain the following arrays:

- `beat`: shape `(n, 6)`, dtype `int32`
- `chord`: shape `(n, 14)`, dtype `float64`
- `melody`: shape `(n, 8)`, dtype `int32`
- `bridge`: shape `(n, 8)`, dtype `int32`
- `piano`: shape `(n, 8)`, dtype `int32`
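A quick way to inspect one of the resulting archives with NumPy (the file name and `n = 16` are illustrative; real files depend on the MIDI length):

```python
import numpy as np

# Build a dummy archive with the documented array layout.
n = 16
np.savez(
    "example_track.npz",
    beat=np.zeros((n, 6), dtype=np.int32),
    chord=np.zeros((n, 14), dtype=np.float64),
    melody=np.zeros((n, 8), dtype=np.int32),
    bridge=np.zeros((n, 8), dtype=np.int32),
    piano=np.zeros((n, 8), dtype=np.int32),
)

# Open the archive and list every array with its shape and dtype.
with np.load("example_track.npz") as data:
    for name in data.files:
        print(name, data[name].shape, data[name].dtype)
```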
To analyse the quality of the latent representation learned by the model, we compute loss components and latent vectors for each `.npz` file using the script `calc_latent_loss.py`.
This script outputs:
- `z_chd`: latent vector associated with the chord encoding.
- `z_txt`: latent vector associated with the texture encoding.
- `kl_loss`: total Kullback–Leibler divergence between posterior and prior.
- `kl_chd`: KL divergence for the chord latent variable.
- `kl_rhy`: KL divergence for the rhythm latent variable.
- `final_loss`: total reconstruction loss of the VAE.
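For reference, the KL terms for a diagonal-Gaussian posterior against a standard-normal prior have the usual closed form. The sketch below only illustrates that relationship; the variable names mirror the script's outputs, but the actual computation lives in `calc_latent_loss.py`, and the latent dimensionality of 128 is an assumption.

```python
import numpy as np

def kl_diag_gaussian(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dims."""
    return -0.5 * np.sum(1.0 + logvar - mu**2 - np.exp(logvar))

# Fake posterior parameters standing in for the encoder outputs.
rng = np.random.default_rng(0)
mu_chd, logvar_chd = rng.normal(size=128), rng.normal(size=128)
mu_rhy, logvar_rhy = rng.normal(size=128), rng.normal(size=128)

kl_chd = kl_diag_gaussian(mu_chd, logvar_chd)
kl_rhy = kl_diag_gaussian(mu_rhy, logvar_rhy)
kl_loss = kl_chd + kl_rhy  # total KL over both latent variables
print(kl_chd, kl_rhy, kl_loss)
```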
These outputs can be used to:
- Study how well the model captures harmonic and textural features.
- Visualise latent spaces using dimensionality reduction techniques (e.g. UMAP).
- Compare reconstruction quality across different musical textures.
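A latent-space visualisation along these lines can be sketched with plain NumPy, using PCA as a dependency-free stand-in for UMAP (swap in `umap-learn` for the real analysis); the stacked `z_txt` matrix and its dimensions here are invented for illustration:

```python
import numpy as np

def pca_2d(z):
    """Project latent vectors to 2-D via PCA on the centered data."""
    z_centered = z - z.mean(axis=0)
    # The top-2 right singular vectors span the directions of
    # maximal variance; project onto them.
    _, _, vt = np.linalg.svd(z_centered, full_matrices=False)
    return z_centered @ vt[:2].T

rng = np.random.default_rng(42)
z_txt = rng.normal(size=(200, 256))  # e.g. texture latents stacked row-wise
embedding = pca_2d(z_txt)
print(embedding.shape)  # one 2-D point per track, ready for scatter-plotting
```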
```
+------------------+      +----------------+
|    midi file     |      |  metadata.csv  |
+--------+---------+      +--------+-------+
         |                         |
         v                         v
+------------------+      +----------------+
|  midiFileTo4bin  |      |    get_fund    |
+--------+---------+      +--------+-------+
         |                         |
+--------+---------+               |
|                  |               |
v                  v               |
+--------+      +-----------+<-----+
| piano  |      |  chord    |
+--------+      +-----------+
```
- The dataset for the VAE model was generated using the notebook `FilterCommuDataset`.
- The folder `COMMUDataset` is included in the project. The `midiFiles` folder contains the original raw data, and the `npzFiles` folder contains the data processed with the function `processMidiBatch`. `createBatches.py` allows creating batches of data, stored in the `batches` folder.
- A new script, `processMidiBatch.py`, was created to process the raw `.mid` data into `.npz` data.
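The batching step can be pictured as grouping a folder of `.npz` files into fixed-size stacked batches. The sketch below is only illustrative: the real logic lives in `createBatches.py`, and the `batch_size` and `key` defaults are assumptions.

```python
from pathlib import Path

import numpy as np

def create_batches(npz_dir, out_dir, batch_size=8, key="piano"):
    """Stack the `key` array from groups of .npz files into batch files.

    Illustrative only; the repository's actual batching is done by
    createBatches.py and may differ in layout and naming.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(exist_ok=True)
    files = sorted(Path(npz_dir).glob("*.npz"))
    for i in range(0, len(files), batch_size):
        chunk = files[i : i + batch_size]
        arrays = [np.load(f)[key] for f in chunk]
        # One output file per batch: shape (batch, n, features).
        np.savez(out_dir / f"batch_{i // batch_size:04d}.npz",
                 **{key: np.stack(arrays)})
```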