Characterizing Bias in Word Embeddings Towards the Development of a Bilingual Analyzer for Detecting Implicit Gender Associations in Philippine Media Texts

Date of Award

5-1-2023

Document Type

Thesis

Degree Name

Master of Science in Data Science

First Advisor

Maria Regina Justina E. Estuar, PhD

Abstract

Gender studies scholars in the Philippines argue that texts from the country’s mass media institutions have been historically complicit in promoting stereotypical gender associations. To mitigate such unfair constructions, natural language processing scholars work on computational models that automate the analysis of gender bias in language. However, not only has such work been absent in the Philippine context, but scholarly efforts have also not yet considered discourses surrounding non-heterosexual persons. This study applied word embedding association analysis methods to characterize gendered associations implicitly expressed in documents derived from Philippine mass media. Results show that both corpus-level and word-level biases exist within these documents. At the corpus level, Filipino texts were found to link verbs and action to the male while linking nouns, objects, and social roles to the female. At the semantic level, analyses showed that implicit biases in Philippine mass media construct the heterosexual male as a hedonistic fool, the heterosexual female in terms of her body and physical appearance, the non-heterosexual male as a delusional fool, and the non-heterosexual female in terms of pornographic fetishes. To help minimize these representations in local media, a tool was also developed to expedite some aspects of the gender and/or sexuality bias review process.
