Finding innovative ICT solutions to enhance the Philippines’ health sector is a core part of the Philippine eHealth Strategic Framework and Plan 2020 program. This study explores the use of collected Twitter data to build a model that processes tweets into a dataset relevant to epidemiology and infodemiology. Future studies may use the resulting collection of relevant tweets for various purposes, such as improving the epidemiological systems of the Department of Health in support of the eHealth strategy. We used a Naive Bayes classifier, an efficient text classification model, to determine whether a tweet is “infodemiological” or not. Of the 18,044 tweets collected, 1,090 (6.04%) were identified as usable in epidemiology. Using this data for training and testing, the model classified 79.91% of tweets correctly. These results indicate that it is feasible to collect and classify enough infodemiological tweets in the Filipino language to support future infodemiological studies.
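To illustrate the kind of classifier described above, the following is a minimal sketch of a multinomial Naive Bayes text classifier with add-one (Laplace) smoothing. It is not the study's implementation: the abstract does not state which Naive Bayes variant, features, or preprocessing were used, so the whitespace tokenizer, the function names (`tokenize`, `train_naive_bayes`, `predict`, `accuracy`), the label names, and the six toy tweets are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real tweet preprocessing)."""
    return text.lower().split()


def train_naive_bayes(examples):
    """Count documents per label and word occurrences per label from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in examples:
        label_counts[label] += 1
        for token in tokenize(text):
            word_counts[label][token] += 1
            vocabulary.add(token)
    return label_counts, word_counts, vocabulary


def predict(model, text):
    """Return the label with the highest log posterior under add-one smoothing."""
    label_counts, word_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, doc_count in label_counts.items():
        score = math.log(doc_count / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocabulary:
                continue  # words never seen in training are ignored
            count = word_counts[label][token]
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, examples):
    """Fraction of (text, label) pairs that the model labels correctly."""
    correct = sum(1 for text, label in examples if predict(model, text) == label)
    return correct / len(examples)


# Hypothetical toy tweets, invented for this sketch; not data from the study.
TRAIN = [
    ("may lagnat at ubo ako ngayon", "infodemiological"),
    ("sobrang sakit ng ulo ko may trangkaso yata", "infodemiological"),
    ("nilalagnat ang anak ko kagabi", "infodemiological"),
    ("ang ganda ng panahon ngayon", "other"),
    ("trapik na naman sa edsa", "other"),
    ("kain tayo mamaya sa labas", "other"),
]

model = train_naive_bayes(TRAIN)
health_prediction = predict(model, "may ubo at lagnat ang kapatid ko")
other_prediction = predict(model, "trapik sa edsa")
print(health_prediction, other_prediction)
```

In practice, accuracy such as the figure reported above would be measured on held-out tweets rather than on the training examples, and a real pipeline would likely need tokenization suited to Filipino-language tweets (hashtags, mentions, code-switching).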