[BUG] tdpsola does not work properly for low beta values #33

@Looki2000

Describe the bug
When I shift the pitch of a voice recording with tsm.tdpsola using a low beta factor (for example beta=0.5), the pitch stays the same and clicking artifacts are audible. There are pitch-shifting plugins based on TD-PSOLA that allow even lower pitch factors, so I don't think this is a limitation of the algorithm.

To Reproduce
Code to reproduce the behavior:

import numpy as np
import librosa
import librosa.display  # explicit import; older librosa versions do not load it automatically
import soundfile as sf
import matplotlib.pyplot as plt
import pytsmod as tsm

n_fft = 1024
hop_length_factor = 4

file_path = "audio.flac"

print("Loading audio file...")
audio, sr = librosa.load(file_path, sr=None, mono=True)
print(sr)

hop_length = n_fft // hop_length_factor


print("pyin")
f0, _, _ = librosa.pyin(
    audio,
    sr=sr,
    fmin=librosa.note_to_hz("C2"),
    fmax=librosa.note_to_hz("C7"),
    frame_length=n_fft,
    hop_length=hop_length,
)


mask = np.isnan(f0)

# linearly interpolate pitch in place of nans
f0[mask] = np.interp(np.flatnonzero(mask), np.flatnonzero(~mask), f0[~mask])


audio_stft = librosa.stft(audio, n_fft=n_fft, hop_length=hop_length)

# convert f0 from Hz to STFT bin index so it overlays the spectrogram
f0_stft = f0 * n_fft / sr

# plot spectrogram and f0
spect = librosa.amplitude_to_db(np.abs(audio_stft), ref=np.max)
fig, ax = plt.subplots()
img = librosa.display.specshow(spect, x_axis="time", ax=ax, sr=sr, hop_length=hop_length)
fig.colorbar(img, ax=ax, format="%2.f")

ax.plot(librosa.times_like(f0_stft, sr=sr, hop_length=hop_length), f0_stft, label="f0", color="cyan")
plt.show()


audio = tsm.tdpsola(audio, sr, f0, beta=0.5, p_hop_size=hop_length, p_win_size=n_fft)

sf.write("tdpsola test.wav", audio, sr)
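The "pitch stays the same" observation could also be checked numerically rather than by ear. The following is only a minimal sketch, not part of the original reproduction: `estimate_f0_autocorr` is a hypothetical helper that gives a rough f0 estimate from the autocorrelation peak, and it is demonstrated on synthetic sine tones standing in for the input and the expected output.

```python
import numpy as np


def estimate_f0_autocorr(x, sr, fmin=50.0, fmax=1000.0):
    """Rough single-value f0 estimate (Hz) from the autocorrelation peak.

    Hypothetical helper for illustration; only meaningful on a short,
    steadily voiced segment.
    """
    x = np.asarray(x, dtype=float)
    x = x - np.mean(x)
    n = len(x)
    # autocorrelation via FFT (zero-padded), keeping non-negative lags
    spec = np.fft.rfft(x, 2 * n)
    acf = np.fft.irfft(spec * np.conj(spec))[:n]
    # search only lags that correspond to the fmin..fmax range
    lag_min = max(1, int(sr / fmax))
    lag_max = min(int(sr / fmin), n - 1)
    lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return sr / lag


# Synthetic stand-ins: a 220 Hz tone and the tone expected for beta=0.5
sr = 22050
t = np.arange(sr // 2) / sr
f0_in = estimate_f0_autocorr(np.sin(2 * np.pi * 220.0 * t), sr)
f0_out = estimate_f0_autocorr(np.sin(2 * np.pi * 110.0 * t), sr)
print(f"input f0: {f0_in:.1f} Hz, output f0: {f0_out:.1f} Hz, ratio: {f0_out / f0_in:.2f}")
```

Applied to the same voiced segment of the original audio and of "tdpsola test.wav", a ratio close to 1.0 would match the behaviour described above, while a ratio close to 0.5 is what beta=0.5 should produce.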

Desktop:

  • OS: Windows 11
  • Python version: Python 3.12.4
  • PyTSMod version: 0.3.8
