| Student attrition and delayed graduation remain significant challenges for higher education institutions, affecting both institutional performance and student outcomes. This study proposes a Student Digital Twin framework for higher education that supports early identification of academic risks through predictive analytics. The framework enables a digital representation of student academic behavior by integrating two predictive models, one targeting early student dropout and the other adherence to graduation requirements. The study evaluates Logistic Regression, Random Forest, LightGBM, and XGBoost to identify the best-performing model for analyzing student behavioral patterns. The scope of this work is first to identify the key features that influence students’ early dropout from university studies and their academic progression within the expected timeframe. The models capture the evolving academic behavior of each student by applying predictive analytics to semester-level institutional data and demographic information. The early attrition prediction model was trained and validated on a real-world dataset comprising 44,147 historical records from 4,520 students, while the academic compliance model used 21,925 samples from 1,897 students. Both datasets were extracted from the central academic registry of the Universities in Greece. The proposed system achieved a predictive accuracy of 91.24\% for early dropout identification and 82.66\% for N+2 academic compliance, where N is the normative duration of the degree program. Together, these models yield a Student Digital Twin that monitors individual performance profiles in real time, enabling the system to issue proactive warnings and actionable insights regarding a student's academic future. |
*** Title, author list and abstract as submitted during Camera-Ready version delivery. Small changes that may have occurred during processing by Springer may not appear in this window.