Predicting undernutrition among elementary schoolchildren in the Philippines using machine learning algorithms

Document Type

Article

Publication Date

2022

Abstract

Objectives

This study aimed to compare the accuracy of four machine-learning (ML) algorithms, using two classification schemes, to predict undernutrition based on individual and household risk factors.

Methods

Data on public-school children were collected from a rural province (310 children) and a highly urbanized city (308 children) in the Philippines using 24-h dietary recalls and a household socioeconomic and demographic survey. Children's nutritional risk was classified based on acceptable macronutrient distribution ranges (AMDRs) developed by the National Academy of Medicine (NAM) and Philippine Dietary Reference Intakes (PDRIs). Four algorithms (random forest, support-vector machine, linear discriminant analysis, and logistic regression) predicted undernutrition in the sample, and their accuracy, sensitivity, and specificity were compared. Predictions were also compared with the national school feeding program's anthropometric classifications.
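
The three comparison metrics named above can be computed directly from a binary confusion matrix. The following is a minimal sketch, not the study's code: the function names are hypothetical, the toy labels are invented purely for illustration, and it assumes undernutrition is coded as 1 (positive class) and adequate intake as 0.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), treating label 1 as the positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, sensitivity, and specificity for binary labels."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    nan = float("nan")
    return {
        # Share of all children classified correctly.
        "accuracy": (tp + tn) / total if total else nan,
        # Share of truly undernourished children that the model flags.
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # Share of adequately nourished children that the model clears.
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


# Illustrative labels only (assumed coding: 1 = undernourished, 0 = not).
y_true = [1, 1, 1, 1, 0, 0, 0, 1]
y_pred = [1, 1, 0, 1, 0, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
```

When most children fall in the positive class, as in a sample with high undernutrition prevalence, accuracy alone can look favorable even for a model that rarely identifies the negative class, which is one reason to report sensitivity and specificity alongside it.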

Results

The prevalence of undernutrition was greater under NAM AMDRs (82.67%) than under PDRI AMDRs (78.71%). Random forest was the most accurate ML algorithm (78.55%), able to predict undernutrition based on household expenditures, child and household age, food insecurity, and dietary diversity. AMDRs classified more children as at risk for inadequate dietary intake (477 children) than anthropometric classification did (213 children).

Conclusions

The random forest algorithm performed best in predicting undernutrition among Filipino elementary schoolchildren, although results could be improved with bootstrap aggregation. The AMDR classification shows potential for targeting feeding beneficiaries. However, local dietary culture should be considered in the development of nutrition interventions. Government use of big-data techniques such as ML must also address underrepresentation in health data collected from and accessible to poor populations, or it risks further marginalizing them.
