-
I have been training models for the last three days. I started with around fifty 10-second samples, and after 3000 epochs the results were "not great". I then expanded the dataset to 650 samples, gave it another 1000 epochs, and it started improving. After expanding further to 1135 samples, it got much better overnight at epoch 5000, especially in edge cases. Screamed vocals improved too, since the newly added data also contained screaming, which seemed to help.

I'm currently training a 20-voice base set from scratch (no initial weights) on 26 GB of data (3552 samples), and after 600 epochs it is already starting to clean up, faster than the previous runs, so a larger dataset seems to help. What also helps is a wide variety of samples in different situations, and keeping them "clean" (no echo, no background noise). That's my 2 cents. I'm not an ML expert by any means, so if anyone has better info, I'd love to hear it.
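Since the advice above hinges on clean samples, here is a minimal pre-screening sketch, assuming Python with numpy and soundfile and a hypothetical `dataset/` folder of WAV files. The SNR and clipping thresholds are illustrative guesses, not recommendations; the idea is just to flag obviously noisy or clipped clips before they enter the training set.

```python
# Rough dataset pre-screen: flag clips that are likely noisy or clipped.
# Assumes WAV files in ./dataset; all thresholds are illustrative guesses.
import glob
import numpy as np
import soundfile as sf

FRAME = 2048          # samples per analysis frame
MIN_SNR_DB = 15.0     # crude floor for "clean enough" (illustrative)
CLIP_LEVEL = 0.999    # samples near full scale count as clipping

def frame_rms(x: np.ndarray) -> np.ndarray:
    """RMS energy of consecutive non-overlapping frames."""
    n = len(x) // FRAME * FRAME
    frames = x[:n].reshape(-1, FRAME)
    return np.sqrt((frames ** 2).mean(axis=1) + 1e-12)

for path in sorted(glob.glob("dataset/*.wav")):
    audio, sr = sf.read(path, always_2d=False)
    if audio.ndim > 1:                      # mix stereo down to mono
        audio = audio.mean(axis=1)
    rms = frame_rms(audio)
    # Quietest frames approximate the noise floor, loudest the voice.
    noise = np.percentile(rms, 10)
    signal = np.percentile(rms, 95)
    snr_db = 20 * np.log10(signal / noise)
    clipped = np.mean(np.abs(audio) >= CLIP_LEVEL)
    if snr_db < MIN_SNR_DB or clipped > 0.001:
        print(f"REJECT {path}: snr={snr_db:.1f} dB, clipped={clipped:.2%}")
    else:
        print(f"OK     {path}: snr={snr_db:.1f} dB, {len(audio)/sr:.1f}s")
```

A pass like this won't catch echo or reverb, but it cheaply filters the worst offenders so listening time goes to the borderline clips.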
-
I have access to a massive database of voices for the Polish language. I'm wondering whether pretraining a base model on these voices and then fine-tuning it for specific cases would provide any benefit. Does the model have enough capacity for that to make a difference?
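For the pretrain-then-fine-tune idea, a minimal PyTorch-style sketch of the two-stage flow is below. The `VoiceModel` class, checkpoint name, and learning rates are all hypothetical stand-ins, not this project's actual API; the point is only how base weights would seed the per-speaker run.

```python
# Hypothetical sketch of the two-stage approach: pretrain a base model on a
# large multi-speaker corpus, then fine-tune it on one target voice.
import torch
import torch.nn as nn

class VoiceModel(nn.Module):          # stand-in for the real architecture
    def __init__(self, n_mels: int = 80, hidden: int = 512):
        super().__init__()
        self.encoder = nn.GRU(n_mels, hidden, batch_first=True)
        self.decoder = nn.Linear(hidden, n_mels)

    def forward(self, mels: torch.Tensor) -> torch.Tensor:
        h, _ = self.encoder(mels)     # mels: (batch, time, n_mels)
        return self.decoder(h)

# Stage 1 (done once, offline): train on the big Polish corpus, then
#   torch.save(model.state_dict(), "polish_base.pt")

# Stage 2: start the per-speaker run from the base weights, not from scratch.
model = VoiceModel()
state = torch.load("polish_base.pt", map_location="cpu")
model.load_state_dict(state)          # strict=False if output heads differ

# A smaller learning rate than the from-scratch run is the usual choice,
# so fine-tuning nudges the base weights rather than overwriting them.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
```

This mirrors the experience in the first comment: the from-scratch base run needed thousands of epochs, whereas a speaker run warm-started from good base weights typically converges much sooner.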