Hi!
I'd like to use the pretrained WaveGlow weights to train on a dataset with a different sampling rate. When I train only Tacotron and feed its mel outputs to the pretrained WaveGlow model, the generated audio sounds low-pitched.
If the sampling rate is fundamentally different, is there any benefit to starting from the pretrained network, or would it be no more useful than training from scratch?
Does anyone have experience with this?
My dataset's sampling rate is 16000 Hz, in contrast to 22050 Hz for the original LJSpeech dataset.
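
To make the mismatch concrete, here is a rough sketch of the arithmetic. It assumes a hop length of 256 samples per mel frame and 22050 Hz for LJSpeech; the hop length is an assumption, not something I have verified against my config, and the function names are just for illustration. I'm not claiming this is the cause of the low pitch, only showing how far apart the two rates are.

```python
# Assumed values (not confirmed against the actual training config).
HOP_LENGTH = 256        # samples per mel frame hop (assumption)
DATASET_SR = 16000      # my dataset's sampling rate in Hz
PRETRAINED_SR = 22050   # LJSpeech sampling rate in Hz


def frame_duration_ms(hop_length: int, sampling_rate: int) -> float:
    """Time covered by one mel frame hop, in milliseconds."""
    return 1000.0 * hop_length / sampling_rate


def rate_ratio(source_sr: int, target_sr: int) -> float:
    """Ratio between two sampling rates (target divided by source)."""
    return target_sr / source_sr


dataset_ms = frame_duration_ms(HOP_LENGTH, DATASET_SR)
pretrained_ms = frame_duration_ms(HOP_LENGTH, PRETRAINED_SR)
ratio = rate_ratio(DATASET_SR, PRETRAINED_SR)

print(f"frame hop at {DATASET_SR} Hz: {dataset_ms:.2f} ms")
print(f"frame hop at {PRETRAINED_SR} Hz: {pretrained_ms:.2f} ms")
print(f"rate ratio: {ratio:.4f}")
```

With these assumed values, one frame covers a different amount of time at each rate, so mels computed from 16000 Hz audio do not line up with what a vocoder trained at 22050 Hz expects.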