Detecting Potential Depressed Users in Twitter Using a Fine-Tuned Distilbert Model
Date of Award
12-1-2022
Document Type
Thesis
Degree Name
Master of Science in Computer Science
First Advisor
Marlene M. De Leon, PhD
Abstract
With the increased prevalence of Major Depressive Disorder, otherwise known simply as depression, around the world, various efforts have been made to combat it and to reach out to those suffering from it. Part of those efforts includes the use of technology such as machine learning models for social media-based assessment of mental illness. However, few studies have examined Transformer models, or Transformer-derived models, for detecting Twitter users who may have depression. Hence, this study aims to determine how well a pre-trained DistilBERT model, fine-tuned on a set of tweets from depressed and non-depressed users, can detect Twitter users who may have depression. After preprocessing both the training and test datasets, the Base Model (i.e., trained solely on the CLPsych 2015 Dataset) and the Mixed Model (i.e., trained on the CLPsych 2015 Dataset and half of a dataset of scraped tweets) were built using the same procedure of splitting, tokenizing, training, fine-tuning, and optimizing. Results showed that the Base Model could identify potentially depressed Twitter users more than half of the time, with an Area Under the Receiver Operating Characteristic Curve (AUC) score of 65% when evaluated on the test dataset. The Mixed Model could also detect potentially depressed Twitter users more than half of the time, as shown by its AUC score of 63% on the same test dataset. The two models performed comparably: a z-test at a 95% confidence level (0.05 level of significance) found no significant difference between their AUC scores (p = 0.21). These results suggest that DistilBERT, when fine-tuned, may be used as an initial screening tool for detecting Twitter users who may have depression.
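The abstract does not state which form of z-test was used to compare the two AUC scores. As a minimal illustration only, the sketch below assumes a two-sided z-test on independent AUC estimates, with standard errors approximated by the Hanley-McNeil formula. The function names and the class counts (`n_pos`, `n_neg`) are hypothetical and are not taken from the thesis, so the output is not expected to reproduce the reported p = 0.21.

```python
import math


def hanley_mcneil_se(auc, n_pos, n_neg):
    """Approximate standard error of an AUC (Hanley-McNeil formula)."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    variance = (
        auc * (1 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return math.sqrt(variance)


def auc_z_test(auc_a, auc_b, n_pos, n_neg):
    """Two-sided z-test for the difference between two AUCs.

    Assumption: the two AUC estimates are treated as independent.
    Models scored on the same test set are correlated, so this is
    only a rough approximation.
    """
    se_a = hanley_mcneil_se(auc_a, n_pos, n_neg)
    se_b = hanley_mcneil_se(auc_b, n_pos, n_neg)
    z = (auc_a - auc_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical class counts; the thesis abstract does not report them.
z, p = auc_z_test(0.65, 0.63, n_pos=100, n_neg=100)
print(f"z = {z:.3f}, p = {p:.3f}, significant at 0.05: {p < 0.05}")
```

A test designed for correlated AUCs, such as DeLong's method, would usually be more appropriate when both models are evaluated on the same test set; the independent-samples form is used here only because it needs nothing beyond the two AUC values and the class counts.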
Recommended Citation
Adarlo, Miguel Antonio S. (2022). Detecting Potential Depressed Users in Twitter Using a Fine-Tuned Distilbert Model. Archīum.ATENEO.
https://archium.ateneo.edu/theses-dissertations/966
