A Comparative Analysis of Deep Learning Methods for the Generation of Music Based on Traditional Filipino Music Features

Date of Award

12-1-2023

Document Type

Thesis

Degree Name

Master of Science in Data Science

First Advisor

Andrei D. Coronel, PhD

Abstract

The field of applied deep learning has grown rapidly over the past few years, driven by increases in the computational power of modern graphics processing units and by the development of new architectures that produce state-of-the-art results across a range of tasks. One common application of deep learning is music generation, in which a model is trained on a dataset comprising features of different types or genres of music and then used to generate new music.

The objective of this study is to generate music that displays characteristics of traditional Filipino music using sequence models. To that end, the most relevant models are chosen, and specific quantitative and qualitative features of traditional Filipino music are studied and tested. Three sub-objectives support the main objective: first, to determine the features that characterize traditional Filipino music; second, to determine how machine learning algorithms such as RNN, LSTM, and GRU generate melodies from a training dataset; and third, to determine which attributes of RNN, LSTM, and GRU facilitate acceptable results when generating melodies similar to traditional Filipino music.

In accordance with the models most commonly used in the literature, RNN, LSTM, and GRU are used and compared in the generation of melodies based on traditional Filipino music. TensorFlow, a tool widely used in both the literature and industry, is the primary software tool for configuring and training the deep learning models. MIDI is the file format for both the training dataset and the generated tracks. The paper also discusses the inner workings of each model in depth.
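As a rough illustration of how a melody can be framed as training data for a sequence model, the sketch below slides a fixed-length window over a list of MIDI pitch numbers and pairs each window with the note that follows it. This is a minimal sketch under stated assumptions, not the preprocessing used in the thesis: the function name `make_training_pairs`, the window length, and the example melody are all hypothetical.

```python
def make_training_pairs(pitches, window=4):
    """Split a pitch sequence into (context, next_pitch) pairs.

    Assumption: each note is represented only by its MIDI pitch number
    (0-127); durations and velocities are ignored in this sketch.
    """
    pairs = []
    # Stop early enough that every window has a following note to predict.
    for start in range(len(pitches) - window):
        context = pitches[start:start + window]
        target = pitches[start + window]
        pairs.append((context, target))
    return pairs


if __name__ == "__main__":
    # Hypothetical melody, written as MIDI pitch numbers (60 = middle C).
    melody = [60, 62, 64, 65, 67, 65, 64, 62]
    for context, target in make_training_pairs(melody, window=4):
        print(context, "->", target)
```

A recurrent model such as an RNN, LSTM, or GRU would then be trained to predict each target from its context, and new melodies could be produced by repeatedly feeding the model's own predictions back in as context.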

First, a Kolmogorov-Smirnov (KS) test showed that the traditional Filipino music in the training dataset is statistically significantly different from Irish, Japanese, and American traditional music. Second, sequence models were found to model music by learning statistical distributions of music features from the training dataset. Third, LSTM and GRU were found to behave similarly when generating music based on traditional Filipino music features. The RNN behaved differently in terms of the music feature distributions it produced; however, a KS test against the training dataset showed that the tracks produced by all three models are statistically similar to the training dataset. Lastly, the loss curves of all models during training were shown to approach 0 asymptotically.
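The KS comparisons mentioned above can be sketched in a few lines. The example below computes the two-sample KS statistic (the largest gap between two empirical cumulative distributions) and an approximate p-value from the asymptotic Kolmogorov series. It is a minimal sketch under stated assumptions: the feature being compared (pitch), the synthetic samples, and the function names are hypothetical, and the thesis may have used a library routine and different features.

```python
import math
import random


def ks_statistic(a, b):
    """Two-sample KS statistic: maximum gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both samples past x so tied values are handled together.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ks_pvalue(d, n, m, terms=100):
    """Approximate p-value from the asymptotic Kolmogorov distribution.

    Assumption: a large-sample approximation with a common small-sample
    correction; it is not exact for small n and m.
    """
    ne = n * m / (n + m)
    lam = (math.sqrt(ne) + 0.12 + 0.11 / math.sqrt(ne)) * d
    if lam < 0.2:
        return 1.0  # the series converges slowly here; the p-value is ~1
    total = sum(
        2 * (-1) ** (k - 1) * math.exp(-2 * k * k * lam * lam)
        for k in range(1, terms + 1)
    )
    return min(max(total, 0.0), 1.0)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical pitch samples standing in for two music collections.
    training = [rng.gauss(64, 5) for _ in range(200)]
    generated = [rng.gauss(64, 5) for _ in range(200)]
    other = [rng.gauss(72, 5) for _ in range(200)]
    for name, sample in [("generated", generated), ("other", other)]:
        d = ks_statistic(training, sample)
        print(name, "D =", round(d, 3), "p =", ks_pvalue(d, 200, 200))
```

A small p-value is evidence that two samples come from different distributions. A large p-value means only that the test found no significant difference, which is weaker than demonstrating that the distributions are the same.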
