A BCNN prediction pipeline to discover mosquito sounds from audio.
Model training, validation and testing strategy, and data are available at: HumBugDB.
- Documentation for running this system over HumBug's dashboard available here.
By Ivan Kiskin. Contact [email protected] for enquiries or suggestions.
git clone https://github.com/HumBug-Mosquito/DatabaseAccess.git
Requirements: condarequirements.txt, piprequirements.txt.
If you experience trouble with the installation of dependent packages, e.g. numba decorators for the latest version of librosa, you may try:
conda install -c anaconda keras-gpu
conda install -c conda-forge librosa=0.7.2
pip install numba==0.48
predict.py is a command-line utility which accepts the arguments as given in python predict.py -h.
From the directory Code/lib, execute in the command line python predict.py [Optional arguments] rootFolderPath audio_format.
Species prediction is executed in the same fashion, using python predict_species.py [Optional arguments] rootFolderPath audio_format instead.
The required parameters are the source directory rootFolderPath and the format audio_format of the audio files it contains. Any files of that format in any subdirectory will be analysed. Outputs are written to the directory given by the optional argument --dir_out. The directory structure of the input is mirrored, with the root input directory replaced by --dir_out. If --dir_out is left blank, outputs are written to the same folders that contain the audio. For a full list of optional parameters run python predict.py -h.
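The mirroring behaviour described above can be sketched as follows. This is a minimal illustration rather than the code in predict.py: the function names mirrored_output_path and find_audio_files are hypothetical, and the sketch assumes audio_format is a plain file extension such as wav.

```python
from pathlib import Path


def mirrored_output_path(audio_path, root_folder, dir_out=None):
    """Return the folder where outputs for audio_path would be written.

    Hypothetical helper: mimics the documented behaviour of --dir_out,
    not the actual implementation in predict.py.
    """
    audio_path = Path(audio_path)
    root_folder = Path(root_folder)
    if dir_out is None:
        # No --dir_out given: outputs sit next to the audio file.
        return audio_path.parent
    # Keep the sub-folder layout, swapping the input root for dir_out.
    relative_folder = audio_path.parent.relative_to(root_folder)
    return Path(dir_out) / relative_folder


def find_audio_files(root_folder, audio_format):
    """Recursively list files with the given extension under root_folder."""
    extension = audio_format.lstrip(".")
    return sorted(Path(root_folder).rglob(f"*.{extension}"))
```

For example, with a root folder data and --dir_out set to out, a recording at data/site_a/night1/rec.wav would have its outputs placed under out/site_a/night1.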
By default, the model outputs a text file of mosquito candidates with rows of the form start_time stop_time probability predictive_entropy mutual_information. To also generate a new audio file that concatenates all of the detected segments, together with matching labels for that file, set --to_dash = True. The labels were designed for import to Audacity using its label import function. The user has three options for visualising predictions:
- Audacity: load the original audio filename.wav, and import the corresponding label predictions filename+model_meta_information.txt
- Audacity: load the detected mosquito candidates under filename_mozz_pred.wav, and import the corresponding label predictions filename_mozz_pred.txt
- Any media player: load the generated mp4 video with mosquito candidates under filename_mozz_pred.mp4
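The predictive_entropy and mutual_information columns are uncertainty measures. The sketch below shows how such quantities are commonly computed from repeated stochastic forward passes of a Bayesian neural network. It assumes the standard definitions (entropy of the mean prediction, and that entropy minus the mean per-pass entropy) and natural logarithms; the function names are hypothetical and the repository's own computation may differ in detail.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def uncertainty_from_samples(samples):
    """Summarise stochastic forward passes for one audio window.

    samples: list of class-probability lists, one per forward pass.
    Returns (mean_probs, predictive_entropy, mutual_information).
    """
    n_samples = len(samples)
    n_classes = len(samples[0])
    # Average the class probabilities over all forward passes.
    mean_probs = [sum(s[c] for s in samples) / n_samples for c in range(n_classes)]
    # Total uncertainty: entropy of the averaged prediction.
    predictive_entropy = entropy(mean_probs)
    # Average uncertainty of the individual passes.
    expected_entropy = sum(entropy(s) for s in samples) / n_samples
    # The gap between the two measures disagreement between passes.
    mutual_information = predictive_entropy - expected_entropy
    return mean_probs, predictive_entropy, mutual_information
```

Under these definitions, passes that agree with each other give a mutual information near zero even when the prediction itself is uncertain, whereas passes that confidently disagree give a high value.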
For species prediction, the model by default outputs a text file of mosquito candidates with rows of the form start_time stop_time probability predictive_entropy mutual_information class_label. The prediction option out_method=per_window outputs predictions over every feature window in the data, generating class probability outputs that sum to 1. out_method=single outputs a single prediction consisting of probabilities per class over the entire audio input signal, which may be useful for shorter input signals. The labels were likewise designed for import with Audacity as follows:
- Audacity: load the original audio filename.wav, and import the corresponding label predictions filename+model_meta_information+species+out_method.txt
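For downstream analysis outside Audacity, the label files can also be read programmatically. The following is a minimal sketch that assumes each row holds five whitespace- or tab-separated numeric fields in the order given above, optionally followed by a class label. The Detection type and parse_label_lines function are hypothetical names, and the parsing would need adapting if the real files carry extra text in the label column.

```python
from typing import NamedTuple, Optional


class Detection(NamedTuple):
    start_time: float
    stop_time: float
    probability: float
    predictive_entropy: float
    mutual_information: float
    class_label: Optional[str] = None  # only present for species prediction


def parse_label_lines(lines):
    """Parse label rows into Detection records (assumed row layout)."""
    detections = []
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or incomplete rows
        numbers = [float(value) for value in fields[:5]]
        # Anything after the five numeric fields is treated as the label.
        label = " ".join(fields[5:]) or None
        detections.append(Detection(*numbers, label))
    return detections
```

A file can then be loaded with `with open(path) as f: detections = parse_label_lines(f)`, where path points to one of the generated .txt files.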
Code/lib/ contains predict.py, util.py, and util_dashboard.py.
Code/data/ contains some example audio files to verify the model is working as intended.
Code/models/ contains a wide range of models that have been trained. By default, the model which performed best on our testbed is used.
If you are accessing this repo after discovering our ECML publication, Automatic Acoustic Mosquito Tagging with Bayesian Neural Networks, and you wish to exactly replicate the experiments for the model in the paper, please see the following documentation. Since acceptance in April 2021, the models and their training framework have been upgraded and improved. The latest model is included here as Code/models/BNN/neurips_2021_humbugdb_keras_bnn_best.hdf5, whereas the ECML paper describes Code/models/BNN/Win_40_Stride_5_CNN_log-mel_128_norm_Falseheld_out_test_manual_v2_low_epoch.h5. We strongly encourage you to visit HumBugDB for the most up-to-date data and training strategy.
- ECML-PKDD model training, validation, testing (deprecated)
- Database access (deprecated). Please visit the latest version at: HumBugDB.