- Download and extract the 28-speaker (28spk) train and test datasets: https://datashare.is.ed.ac.uk/handle/10283/2791
- Write the paths of the audio files into txt files:
/home/nas/clean/train.txt
/home/nas/noise/train.txt
/home/nas/clean/test.txt
/home/nas/noise/test.txt
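One way to generate such a list file is sketched below. It assumes each txt file holds one .wav path per line, which this README does not state, so check what data_feature/demand_data_get_feature.py actually expects. The function name `write_file_list` is hypothetical and is not part of this repository.

```python
from pathlib import Path


def write_file_list(audio_dir, list_path):
    """Write the absolute path of every .wav under audio_dir to list_path.

    Assumption: one path per line. Returns the list of paths written.
    """
    # Sort so the list is reproducible between runs.
    wav_paths = sorted(str(p.resolve()) for p in Path(audio_dir).rglob("*.wav"))
    list_path = Path(list_path)
    # Create the parent folder of the list file if it does not exist yet.
    list_path.parent.mkdir(parents=True, exist_ok=True)
    list_path.write_text("".join(path + "\n" for path in wav_paths))
    return wav_paths
```

Call it once per list, passing the folder where you extracted the corresponding audio and the target txt path (for example /home/nas/clean/train.txt).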
- Resample the 48 kHz data to 16 kHz. The author used 16 kHz.
python data_feature/demand_data_get_feature.py --clean_train /home/nas/clean/train.txt --noise_train /home/nas/noise/train.txt --clean_test /home/nas/clean/test.txt --noise_test /home/nas/noise/test.txt --train_save_path /home/nas/train/save/ --test_save_path /home/nas/test/save --fs 16
- or
sh make_data_feature.sh
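For reference, the 48 kHz to 16 kHz resampling step can be sketched on a raw sample array as follows. This is only an illustration using `scipy.signal.resample_poly`; it is not necessarily how demand_data_get_feature.py resamples, and the function name `resample_audio` is hypothetical. Reading and writing the .wav files is left out.

```python
from math import gcd

import numpy as np
from scipy.signal import resample_poly


def resample_audio(signal, orig_sr=48000, target_sr=16000):
    """Resample a 1-D signal from orig_sr to target_sr.

    resample_poly applies a low-pass FIR filter, so content above the
    new Nyquist frequency is attenuated before the rate is reduced.
    """
    signal = np.asarray(signal, dtype=np.float64)
    # Reduce the ratio: 16000/48000 becomes up=1, down=3.
    g = gcd(orig_sr, target_sr)
    return resample_poly(signal, target_sr // g, orig_sr // g)
```

One second of audio at 48 kHz (48000 samples) comes out as 16000 samples.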
python train_DCUnet_jsdr_demand.py --train_data_root_folder /home/nas/train/save/ --val_data_root_folder /home/nas/test/save --gpu 0 --modelsave_path model/save/path --snr 0 --exp_day 0101 --batch_size 20 --frame_num 128 --learning_rate 0.0001 --fs 48
- or
sh train.sh
- frame_num is the number of time frames in the STFT.
- Recommended: frame_num = 128 or 256, i.e. a power of 2.
- To use a different value, change the model's padding, stride, kernel size, etc.
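The sketch below illustrates why a power of 2 is convenient for a U-Net style model: each strided convolution shrinks the time axis, and each transposed convolution has to restore the size of the matching encoder layer. The kernel, stride, padding, and depth values are assumptions chosen for illustration, not the actual DCUnet settings in this repository, and the function names are hypothetical.

```python
def conv_out(n, kernel, stride, padding):
    """Output length of a strided convolution along one axis."""
    return (n + 2 * padding - kernel) // stride + 1


def deconv_out(n, kernel, stride, padding):
    """Output length of a transposed convolution (no output padding)."""
    return (n - 1) * stride - 2 * padding + kernel


def round_trip_ok(frame_num, depth, kernel=4, stride=2, padding=1):
    """Check that the decoder restores every encoder size on the time axis."""
    # Encoder: record the size after each downsampling layer.
    sizes = [frame_num]
    for _ in range(depth):
        sizes.append(conv_out(sizes[-1], kernel, stride, padding))
    # Decoder: each upsampled size must match the skip connection size.
    n = sizes[-1]
    for skip in reversed(sizes[:-1]):
        n = deconv_out(n, kernel, stride, padding)
        if n != skip:
            return False
    return True


print(round_trip_ok(128, depth=5))  # sizes 128, 64, 32, 16, 8, 4
print(round_trip_ok(100, depth=5))  # 25 halves to 12, which upsamples to 24
```

With these assumed settings, 128 and 256 round-trip cleanly while 100 does not, because an odd intermediate size cannot be restored by the decoder.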
python test_DCUnet_jsdr_demand.py --fs 48 --test_model model/path.pth --test_data_root_folder /home/nas/test/audio --test_data_output_path /home/nas/test/output
- or
sh test.sh
- test_data_root_folder is the folder that contains the .wav audio files.
- The length of the output audio is limited. To handle longer audio, open dataset/demand_dataset_test_librosa.py and add further copies of the block at lines 107-113 as needed for the audio length.
- Download the DEMAND noise dataset: https://zenodo.org/record/1227121#.X1Ytv3kzaUk
- To create noisy data from the clean data and the DEMAND noise dataset:
cd data_augment
python data_aug_demand_dataset.py --clean_train_txt /home/nas/clean/train.txt --noise_txt /home/nas/demand/noise.txt --save_path /home/nas/save/path/ --snr 0 --fs 48
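As a rough illustration of what mixing at a given SNR means (here --snr 0, i.e. clean and noise at equal power), the sketch below scales a noise signal so that the mixture reaches a target SNR in dB. It is a generic formulation and may differ from what data_aug_demand_dataset.py does; the function name `mix_at_snr` and the choice to loop short noise clips are assumptions.

```python
import numpy as np


def mix_at_snr(clean, noise, snr_db):
    """Add noise to clean so that the clean-to-noise power ratio is snr_db."""
    clean = np.asarray(clean, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)
    # Assumption: loop the noise if it is shorter than the clean signal,
    # then trim it to the same length.
    if len(noise) < len(clean):
        reps = int(np.ceil(len(clean) / len(noise)))
        noise = np.tile(noise, reps)
    noise = noise[: len(clean)]

    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    if noise_power == 0:
        raise ValueError("noise signal is silent; cannot scale to a target SNR")

    # Solve clean_power / (scale**2 * noise_power) = 10**(snr_db / 10) for scale.
    scale = np.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return clean + scale * noise
```

At snr_db = 0 the scaled noise has the same average power as the clean signal; higher values give quieter noise.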