Oversampling Facial Motion Features Using the Variational Autoencoder to Estimate Oro-facial Dysfunction Severity

Document Type

Conference Proceeding

Publication Date



Class imbalance, which degrades classification model performance, is a common problem in machine learning. Various oversampling methods have been developed to compensate for imbalanced data, with SMOTE among the most widely used. However, deep generative models such as the variational autoencoder are showing promise as alternatives to traditional oversampling methods. This study investigated whether a variational autoencoder can learn the distribution of the minority class and generate new observations of facial motion features extracted from an imbalanced medical dataset, and examined the effect of oversampling before versus after the train-test split. The variational autoencoder was compared with SMOTE on ordinal classification performance across the metrics of accuracy, accuracy±1, inter-rater reliability, specificity, and sensitivity, with no oversampling serving as the baseline. The results show that the variational autoencoder has potential as an oversampling method for facial motion features in the context of oro-facial dysfunction severity estimation. Oversampling prior to the train-test split was also shown to improve classification performance.
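The study's models are not reproduced here, but the baseline oversampler it compares against, SMOTE, is simple to illustrate: each synthetic minority sample is a random interpolation between a minority point and one of its k nearest minority-class neighbors. Below is a minimal NumPy sketch; the function name `smote` and its parameters are illustrative, not taken from the paper.

```python
import numpy as np

def smote(X, n_new, k=5, rng=None):
    """Generate n_new synthetic samples from minority-class points X
    by interpolating toward one of each point's k nearest neighbors."""
    rng = np.random.default_rng(rng)
    n = len(X)
    # pairwise Euclidean distances within the minority class
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)               # exclude each point itself
    nn = np.argsort(d, axis=1)[:, :k]         # k nearest neighbors per point
    base = rng.integers(0, n, n_new)          # base point for each new sample
    nbr = nn[base, rng.integers(0, k, n_new)] # one random neighbor per base
    gap = rng.random((n_new, 1))              # interpolation factor in [0, 1]
    return X[base] + gap * (X[nbr] - X[base])

# toy minority class of four 2-D feature vectors
minority = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
synthetic = smote(minority, n_new=6, k=2, rng=0)
```

Because every synthetic point lies on a segment between two real minority points, SMOTE cannot generate samples outside the convex hull of the minority class; a variational autoencoder, by sampling from a learned latent distribution, is not restricted in this way.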