Exam project in the Deep Learning course at DTU.
A GRU-network model for next-step prediction of notes in the Nottingham folk-melody dataset.
The model is also extended to feed the previous prediction from the output network back in as input to the GRU input network, so that new sequences of notes can be generated from an initial seed melody.
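A minimal sketch of this architecture, assuming PyTorch (not necessarily the project's actual framework; the hidden size is illustrative): the GRU reads the one-hot pitch and duration of each note, and separate output heads predict the next note's pitch and duration, with feature sizes matching the data description below.

```python
import torch
import torch.nn as nn


class MelodyGRU(nn.Module):
    def __init__(self, n_pitch=35, n_duration=14, hidden_size=128):
        super().__init__()
        # The GRU input network consumes the concatenated one-hot pitch and duration.
        self.gru = nn.GRU(n_pitch + n_duration, hidden_size, batch_first=True)
        # Output network: separate heads for next-step pitch and duration.
        self.pitch_head = nn.Linear(hidden_size, n_pitch)
        self.duration_head = nn.Linear(hidden_size, n_duration)

    def forward(self, x_pitch, x_duration, h0=None):
        x = torch.cat([x_pitch, x_duration], dim=-1)   # [M, N, 35 + 14]
        h, h_last = self.gru(x, h0)                    # [M, N, hidden_size]
        return self.pitch_head(h), self.duration_head(h), h_last
```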
The final documentation of the project and results can be found here.
From ABC format to Music21 objects to zero-padded one-hot encoded vectors for each note (pitch and duration) in each melody (list of lists of vectors --> numpy array X = [M, N, F] = [Melodies, Notes, Features]). All notes in all melodies are thus represented by a duration tensor, Xd, with 14 features and a pitch tensor, Xp, with 35 features.
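A rough illustration of this encoding step, assuming music21 and NumPy; `pitch_to_idx`, `dur_to_idx` and `max_notes` are hypothetical placeholders for the project's actual 35-pitch / 14-duration vocabularies and maximum melody length.

```python
import numpy as np
from music21 import converter


def encode_melodies(abc_paths, pitch_to_idx, dur_to_idx, max_notes):
    """Encode ABC files into zero-padded one-hot pitch and duration tensors."""
    n_p, n_d = len(pitch_to_idx), len(dur_to_idx)
    Xp = np.zeros((len(abc_paths), max_notes, n_p), dtype=np.float32)
    Xd = np.zeros((len(abc_paths), max_notes, n_d), dtype=np.float32)
    for m, path in enumerate(abc_paths):
        score = converter.parse(path)                    # ABC file -> music21 stream
        notes = [n for n in score.recurse().notes if n.isNote]
        for t, note in enumerate(notes[:max_notes]):
            Xp[m, t, pitch_to_idx[note.pitch.midi]] = 1.0             # one-hot pitch
            Xd[m, t, dur_to_idx[note.duration.quarterLength]] = 1.0   # one-hot duration
        # Positions beyond the melody length stay all-zero: zero-padding.
    return Xp, Xd
```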
Regularizing the GRU networks reduces overfitting, as seen by a smaller gap between the training and validation accuracy curves.
Dropout: Leaving out notes along the melodies introduces lossy noise and thereby a completion task, so during training the next-step prediction has to rely more on the previous GRU activations h_{t-1}, and the horizontal connections are strengthened to make up for the missing input (see the sketch below).
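A minimal sketch of such note-level input dropout (assumed PyTorch; the drop rate is illustrative): whole notes are zeroed out, so the prediction has to lean on the recurrent state instead of the current input.

```python
import torch


def drop_notes(x, p=0.2, training=True):
    """x: [M, N, F] one-hot inputs; zero out entire notes with probability p."""
    if not training or p == 0.0:
        return x
    # One Bernoulli mask per note, broadcast over the feature dimension.
    keep = (torch.rand(x.shape[0], x.shape[1], 1, device=x.device) > p).float()
    return x * keep   # dropped notes become all-zero vectors
```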
Prediction input: Feeding the previous prediction back in as input sends a stronger loss signal across the horizontal connections, which helps avoid vanishing gradients (see the sketch below).
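A minimal sketch of a training step with prediction input (assumed PyTorch; `model` follows the MelodyGRU sketch above, and zero-padded positions are not masked here for brevity): only the first note comes from the data, every later input is the model's own previous prediction, so the loss can only be reduced through the recurrent state.

```python
import torch
import torch.nn.functional as F


def prediction_input_step(model, Xp, Xd, optimizer):
    """Xp: [M, N, 35], Xd: [M, N, 14] one-hot melodies (torch tensors); one training step."""
    M, N, _ = Xp.shape
    h = None
    xp, xd = Xp[:, :1], Xd[:, :1]        # only the first note is taken from the data
    loss = 0.0
    for t in range(1, N):
        pitch_logits, dur_logits, h = model(xp, xd, h)
        loss = loss + F.cross_entropy(pitch_logits[:, 0], Xp[:, t].argmax(-1))
        loss = loss + F.cross_entropy(dur_logits[:, 0], Xd[:, t].argmax(-1))
        # Feed the model's own prediction back in as the next one-hot input.
        xp = torch.zeros_like(xp).scatter_(-1, pitch_logits.argmax(-1, keepdim=True), 1.0)
        xd = torch.zeros_like(xd).scatter_(-1, dur_logits.argmax(-1, keepdim=True), 1.0)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The same feedback loop, run without targets and started from a seed melody, is what generates new note sequences.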