Exploring Active Learning for Student Behavior Classification

Selecting high-quality ground truth data is a critical step in machine learning. Conventionally, data are labeled through a human-centered strategy. While this approach provides accurate annotations of task-specific behaviors, it is difficult, costly, and error-prone. One method explored to address these problems is active learning, a model-centered approach that minimizes human involvement. In this work, we conduct an experiment comparing active and passive learning strategies for selecting ground truth data in a classification task that detects task-persistent behavior from students' interaction logs. Our findings suggest that active learning tends to reach a given level of performance more effectively and efficiently than passive learning. However, the overall performance comparison shows that passive selection of ground truth data is as effective as active learning for applications with relatively small sample sizes.
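
To make the contrast between the two selection strategies concrete, the following is a minimal sketch, not the procedure used in this work. It assumes that passive learning means choosing samples to label uniformly at random, and that active learning uses uncertainty sampling, one common query strategy; the abstract does not state which strategy was used. The function names, the sample identifiers, and the predicted probabilities are all hypothetical.

```python
import random


def select_passive(pool_ids, batch_size, rng):
    """Passive selection: choose samples to label uniformly at random."""
    return rng.sample(pool_ids, batch_size)


def select_active(pool_ids, predicted_prob, batch_size):
    """Active selection by uncertainty sampling (an assumed strategy):
    choose the samples whose predicted positive-class probability is
    closest to 0.5, where the current model is least certain."""
    ranked = sorted(pool_ids, key=lambda i: abs(predicted_prob[i] - 0.5))
    return ranked[:batch_size]


# Hypothetical model outputs for five unlabeled interaction-log samples:
# the probability that each one shows task-persistent behavior.
predicted_prob = {0: 0.95, 1: 0.52, 2: 0.10, 3: 0.45, 4: 0.80}
pool_ids = list(predicted_prob)

# Samples 1 and 3 sit nearest the decision boundary, so they are queried.
active_batch = select_active(pool_ids, predicted_prob, batch_size=2)

# The passive batch ignores the model and depends only on the random seed.
passive_batch = select_passive(pool_ids, 2, random.Random(0))
```

In a full experiment, the selected samples would be labeled by a human, added to the training set, and the model retrained before the next round of selection; the two strategies differ only in how each batch is chosen.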