`train.py` is a Python script for training audio tagging models. It supports various data transformations and augmentations and uses a custom training loop encapsulated in a `Trainer` class. The script is highly configurable through command-line arguments, allowing flexible experimentation with different model architectures, data preprocessing techniques, and training parameters.
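Below is a minimal sketch of how such a training loop is commonly structured; the constructor arguments, method names, and loss choice are illustrative assumptions, not the exact interface of the `Trainer` class in this repository.

```python
# Illustrative sketch only -- the actual Trainer in train.py may differ.
import torch


class Trainer:
    def __init__(self, model, train_loader, val_loader, learning_rate, device="cuda"):
        self.model = model.to(device)
        self.train_loader = train_loader
        self.val_loader = val_loader
        self.device = device
        self.optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
        # BCEWithLogitsLoss suits multi-label tagging (e.g. MTAT);
        # a single-label genre task (e.g. GTZAN) would use CrossEntropyLoss instead.
        self.criterion = torch.nn.BCEWithLogitsLoss()

    def train(self, epochs):
        for epoch in range(epochs):
            self.model.train()
            for inputs, targets in self.train_loader:
                inputs = inputs.to(self.device)
                targets = targets.to(self.device)
                self.optimizer.zero_grad()
                loss = self.criterion(self.model(inputs), targets)
                loss.backward()
                self.optimizer.step()
            print(f"epoch {epoch + 1}/{epochs}, last batch loss {loss.item():.4f}")
```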
Ensure that all required Python packages are installed:

```bash
pip install -r requirements.txt
```

Then start training by pointing the script at your data directory and choosing a model class:

```bash
python train.py --data_dir "path/to/data" --model_class_name "ModelClassName" --epochs 20 --learning_rate 0.001 --batch_size 32
```

The available command-line arguments are:
- `--data_dir`: Directory containing the audio data.
- `--train_annotations`: Path to the training annotations file.
- `--val_annotations`: Path to the validation annotations file.
- `--test_annotations`: Path to the test annotations file.
- `--sample_rate`: Sample rate for audio processing.
- `--target_length`: Target length of audio samples in seconds.
- `--batch_size`: Batch size for training.
- `--num_workers`: Number of workers for data loading.
- `--apply_augmentations`: Apply pitch-shift and time-stretch augmentations.
- `--model_class_name`: Class name of the model to be used.
- `--learning_rate`: Learning rate for training.
- `--epochs`: Number of training epochs.
- `--model_path`: Directory to save the trained model.
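For reference, here is a hedged sketch of how these flags could be declared with `argparse`; the types and default values are assumptions for illustration, so check `train.py` for the authoritative definitions.

```python
# Illustrative argument parser -- types and defaults are assumptions,
# not the script's actual configuration.
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Train an audio tagging model.")
    parser.add_argument("--data_dir", type=str, required=True,
                        help="Directory containing the audio data")
    parser.add_argument("--train_annotations", type=str,
                        help="Path to the training annotations file")
    parser.add_argument("--val_annotations", type=str,
                        help="Path to the validation annotations file")
    parser.add_argument("--test_annotations", type=str,
                        help="Path to the test annotations file")
    parser.add_argument("--sample_rate", type=int, default=16000,
                        help="Sample rate for audio processing")
    parser.add_argument("--target_length", type=float, default=30.0,
                        help="Target length of audio samples in seconds")
    parser.add_argument("--batch_size", type=int, default=32)
    parser.add_argument("--num_workers", type=int, default=4)
    parser.add_argument("--apply_augmentations", action="store_true",
                        help="Apply pitch-shift and time-stretch augmentations")
    parser.add_argument("--model_class_name", type=str, required=True,
                        help="Class name of the model to be used")
    parser.add_argument("--learning_rate", type=float, default=0.001)
    parser.add_argument("--epochs", type=int, default=20)
    parser.add_argument("--model_path", type=str, default="models/",
                        help="Directory to save the trained model")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```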
- Log in to the HPC.
- Git setup on HPC:
  - Run `cat .ssh/id_rsa.pub` from inside your home directory and copy the output (your public key) to the clipboard.
  - Go to your GitHub settings and create a new SSH key from the copied public key.
  - Clone our repo: `git clone [email protected]:syeon0928/Tagging-Music-Sequences.git`
  - Update the repo as usual (see below under Git commands).
- Run `sbatch setup_conda_env` to set up the environment (install packages etc.).
- Run `sbatch run_jupyter_notebook.sh` to start the Jupyter notebook.
- Run `cat jupyter-notebook-{your job number}.log` to show the output of the running script.
- Copy the SSH command from the log file and run it in another terminal, e.g. `ssh -N -L 8248:desktop2:8248 [[email protected]]`
- Open the URL from the log file (the last link).
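Once the notebook is open, it can be worth checking that the job actually sees a GPU before starting a long training run. The snippet below assumes PyTorch is among the packages installed by the conda setup script; adapt it to whichever framework the environment provides.

```python
# Quick sanity check from inside the Jupyter notebook
# (assumes PyTorch is installed by the conda environment setup).
import torch

print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
```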
Music plays an important role in our lives, and the landscape of contemporary music is vast. In order to understand music taste and build recommender systems for music, we need to learn to tag music first. In this project, we want to build a classifier that can tag music pieces with a genre or category after listening to an arbitrarily long example. For this, we want to consider the following datasets:
- GTZAN
- The MagnaTagATune Dataset (MTAT), and
- for advanced studies: the Free Music Archive (FMA).
- Research literature about sound and music pre-processing, transformation, and representation. What type of pre-processing is best for music pieces, i.e. what is the state-of-the-art of spectrograms vs. raw waveform? (A minimal spectrogram sketch follows this list.)
- Train an encoding model (deep recurrent and/or CNN network) with appropriate representation to classify sequences of music pieces. Your options are vast, as you can consider all the tools that we covered in class: GRUs? CNNs? Variational Encoders? Combinations thereof? Make use of recent examples from the literature! Can you identify an architecture (and meta-parameter settings) that can be trained to tag/classify reasonably well?
- Study the performance for edge cases, such as particularly short input sequences or music pieces for rare genres/categories. Can you identify characteristics of such edge cases that make performance particularly high or low?
- Identify differences in quantitative performance and qualitative characteristics (look into how your model decides in edge cases) between different pre-processing options.
- Build your music tagger by training only on one of the datasets and comparing generalisation on the other. Given that you took good care of appropriate representation and pre-processing for both, can you explain the performance differences?
- Look into pre-trained options (e.g. from paperswithcode.com) and finetune your extended models. How is performance (quantitative and qualitative) different?
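As a starting point for the pre-processing question above, here is a minimal sketch of turning a waveform into a log-mel spectrogram with torchaudio; the parameter values are common defaults rather than a recommendation tuned to these datasets. Comparing a model fed this representation against one fed the raw waveform directly is exactly the spectrogram-vs-waveform comparison asked for in the first task.

```python
# Minimal log-mel spectrogram sketch (torchaudio); parameter values are
# common defaults and only meant as a starting point.
import torchaudio


def waveform_to_logmel(path, sample_rate=16000, n_mels=128):
    waveform, sr = torchaudio.load(path)           # (channels, samples)
    waveform = waveform.mean(dim=0, keepdim=True)  # mix down to mono
    if sr != sample_rate:
        waveform = torchaudio.functional.resample(waveform, sr, sample_rate)
    mel = torchaudio.transforms.MelSpectrogram(
        sample_rate=sample_rate, n_fft=1024, hop_length=512, n_mels=n_mels
    )(waveform)
    return torchaudio.transforms.AmplitudeToDB()(mel)  # (1, n_mels, frames)
```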