Development of a Multiclass Text Classification Model to Classify Public Trust in the Government: Exploring the Lexical Features of Trust in the Philippine Context

Date of Award

7-1-2022

Document Type

Thesis

Degree Name

Master of Science in Computer Science, Straight

First Advisor

Maria Regina Justina E. Estuar, PhD

Abstract

As most governments in the world continue to face the pandemic, various policies and initiatives have been put in place to help control the spread of COVID-19. Even with these interventions in place, a pandemic still creates a reality of risk and uncertainty. In such situations, public trust is essential to mitigating the health and societal impacts of the pandemic. Given this, the study aims to develop a multiclass text classification model that classifies public trust in the government by exploring the lexical features of social media posts. Several machine learning models were built using both traditional ML methods and transformer-based models. Among the traditional models, an extra trees classifier using TF-IDF with n-grams performed best in terms of accuracy, precision, recall, and F1-score, while RoBERTa-base performed best among the transformer models. In addition, a random forest model using the same techniques as the traditional approach was created to classify the polarity of trust statements. The study also examines the lexical features of the gathered trust sentiments and finds significant differences between the trust categories in terms of parts of speech and named entities.
