pitch estimation comparison #275
-
Borrowing code from the following:

Surprising to me is how unstable harvest is. This is a sample from the slt voice from CMU ARCTIC, and the same behavior shows up across multiple datasets. Other than that, all the other available f0 methods seem to be of reasonable quality.

I did find this example from another dataset. Again harvest is all over the place, but both crepe methods also fail to follow the pitch down near the beginning of the audio clip. I've noticed this happening on a few other audio clips from my dataset as well.

My very limited, to-be-taken-with-a-grain-of-salt opinion of the f0 methods so far:

dio and harvest: sometimes return an f0 around 60 Hz on lower-quality datasets.
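To put a rough number on "unstable" and on the stray values around 60 Hz, something like the sketch below could be run on each method's f0 track. It is only an illustration: the function names, the 6-semitone jump threshold, and the 70 Hz cutoff are assumptions made here, not anything taken from the fork, and it assumes the track is a plain list of Hz values with 0 marking unvoiced frames.

```python
import math


def f0_jump_stats(f0, max_semitones=6.0):
    """Count large frame-to-frame jumps in an f0 track.

    f0 is a sequence of Hz values, with 0 (or less) meaning unvoiced.
    Only pairs of consecutive voiced frames are compared.
    Returns (number_of_jumps, number_of_voiced_pairs).
    """
    jumps = 0
    pairs = 0
    for prev, cur in zip(f0, f0[1:]):
        if prev > 0 and cur > 0:
            pairs += 1
            # Distance between the two frames in semitones.
            if abs(12.0 * math.log2(cur / prev)) > max_semitones:
                jumps += 1
    return jumps, pairs


def low_f0_fraction(f0, threshold_hz=70.0):
    """Fraction of voiced frames whose f0 is below threshold_hz."""
    voiced = [v for v in f0 if v > 0]
    if not voiced:
        return 0.0
    return sum(1 for v in voiced if v < threshold_hz) / len(voiced)


# Made-up tracks, purely for illustration (not real extractor output).
steady = [200.0, 202.0, 0.0, 198.0, 201.0]
jumpy = [200.0, 60.0, 210.0, 0.0, 205.0]

print(f0_jump_stats(steady), low_f0_fraction(steady))
print(f0_jump_stats(jumpy), low_f0_fraction(jumpy))
```

A track that drops to roughly 60 Hz for a frame and then recovers shows up in both numbers, which makes it easier to compare methods across many clips than eyeballing plots one at a time.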
-
Would this also affect I know the default was changed from
-
https://github.com/Pradeepiit/hf0
Just came across this repository which I find interesting. It shows a model that is quite a bit lighter than CREPE, and also has several examples against CREPE and pYIN highlighting various failure cases. There is also another discussion #251 talking about f0 mean filtering to possibly improve the resulting quality further.
I might try to develop a tool that uses this fork to compare the result of various f0 algorithms over a spectrogram, similar to the examples in the hf0 repository.
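For reference, one plain reading of "f0 mean filtering" is a moving average over voiced frames only. The sketch below is that reading and nothing more: it is not necessarily what #251 proposes, and the function name, the window width, and the 0-means-unvoiced convention are assumptions made for illustration.

```python
def mean_filter_f0(f0, width=3):
    """Moving-average filter over an f0 track, skipping unvoiced frames.

    f0 is a sequence of Hz values, with 0 (or less) meaning unvoiced.
    Unvoiced frames stay at 0.0 and are excluded from the averages,
    so the filter does not drag voiced values toward zero.
    """
    if width < 1 or width % 2 == 0:
        raise ValueError("width must be a positive odd integer")
    half = width // 2
    smoothed = []
    for i, value in enumerate(f0):
        if value <= 0:
            smoothed.append(0.0)
            continue
        # Voiced neighbours inside the window centred on frame i.
        window = [x for x in f0[max(0, i - half): i + half + 1] if x > 0]
        smoothed.append(sum(window) / len(window))
    return smoothed


# Made-up track with a single dip to 60 Hz (not real extractor output).
track = [100.0, 0.0, 200.0, 60.0, 220.0]
print(mean_filter_f0(track, width=3))
```

Note that a mean filter spreads an outlier across its neighbours instead of removing it, so a median over the same window may be worth trying for isolated dips like the 60 Hz frames mentioned above.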