Predicting Dropout in Online Learning Environments

Sandro Radovanović¹, Boris Delibašić¹ and Milija Suknović¹

  1. University of Belgrade - Faculty of Organizational Sciences
    11000 Belgrade, Serbia
    {sandro.radovanovic, boris.delibasic, milija.suknovic}


Online learning environments have become popular in recent years. Due to high attrition rates, the problem of student dropout has become immensely important for course designers and course makers. In this paper, we used lasso and ridge logistic regression to build a dropout prediction model on the Open University database. We investigated how early dropout can be predicted and why dropouts occur. To answer the first question, we created models for eight different time frames, ranging from the beginning of the course to the mid-term. Because we used two definitions of dropout, each time frame yields two results: at the beginning of the course the AUC of the prediction model is 0.549 and 0.661, rising to 0.681 and 0.869 at mid-term. To answer the second question, we analyzed the logistic regression coefficients, which showed that at the beginning of the course the demographic features of the student and the course description features are the most important variables for dropout prediction, while student activity gains importance later.
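The modeling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not use the Open University data: the data are synthetic, the two columns (`demo`, `activity`) and all function names are hypothetical, and the "early" and "later" feature sets only stand in for the eight time frames. It fits L1-penalized (lasso) and L2-penalized (ridge) logistic regression by plain gradient descent and scores each model with AUC.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    """Proximal operator of the L1 penalty: shrink v toward zero by t."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def fit_logistic(X, y, penalty="l2", lam=0.01, lr=0.5, epochs=300):
    """Penalized logistic regression; the intercept is left unpenalized."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_b += err / n
            for j in range(d):
                grad_w[j] += err * xi[j] / n
        if penalty == "l2":
            # Ridge: the gradient of (lam / 2) * ||w||^2 is lam * w.
            w = [wj - lr * (gj + lam * wj) for wj, gj in zip(w, grad_w)]
        else:
            # Lasso: gradient step on the loss, then soft-thresholding.
            w = [soft_threshold(wj - lr * gj, lr * lam)
                 for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(X, w, b):
    """Predicted dropout probability for each row."""
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def auc(scores, labels):
    """Chance that a random positive outranks a random negative (ties = 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Synthetic students (assumption): dropout depends weakly on a
# demographic-like column and strongly on an activity column.
rng = random.Random(0)
rows, labels = [], []
for _ in range(400):
    demo, activity = rng.gauss(0, 1), rng.gauss(0, 1)
    p_dropout = sigmoid(1.0 * demo - 2.0 * activity)
    rows.append([demo, activity])
    labels.append(1 if rng.random() < p_dropout else 0)

train_X, test_X = rows[:300], rows[300:]
train_y, test_y = labels[:300], labels[300:]

# "Early" frame: only the demographic-like column is known (lasso).
w_early, b_early = fit_logistic([[r[0]] for r in train_X], train_y, penalty="l1")
auc_early = auc(predict([[r[0]] for r in test_X], w_early, b_early), test_y)

# "Later" frame: the activity column is also available (ridge).
w_late, b_late = fit_logistic(train_X, train_y, penalty="l2")
auc_late = auc(predict(test_X, w_late, b_late), test_y)

print(f"early AUC: {auc_early:.3f}, later AUC: {auc_late:.3f}")
```

In this sketch the sign and magnitude of the fitted weights in `w_late` play the role of the coefficients the abstract refers to; a library implementation of penalized logistic regression would normally replace the hand-written optimizer.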

Key words

Education Data Mining, Learning Analytics, Dropout Prediction, Lasso and Ridge Logistic Regression

Publication information

Volume 18, Issue 3 (June 2021)
Year of Publication: 2021
ISSN: 1820-0214 (Print) 2406-1018 (Online)
Publisher: ComSIS Consortium

How to cite

Radovanović, S., Delibašić, B., Suknović, M.: Predicting Dropout in Online Learning Environments. Computer Science and Information Systems, Vol. 18, No. 3, 957–978. (2021),