Missing Data Imputation in Cardiometabolic Risk Assessment: A Solution Based on Artificial Neural Networks

Dunja Vrbaški¹, Aleksandar Kupusinac¹, Rade Doroslovački¹, Edita Stokić² and Dragan Ivetić¹

  1. University of Novi Sad, Faculty of Technical Sciences
    Trg Dositeja Obradovića 6, 21000 Novi Sad, Republic of Serbia
    {dunja.vrbaski, sasak, rade.doroslovacki, ivetic}@uns.ac.rs
  2. University of Novi Sad, Faculty of Medicine
    Hajduk Veljkova 3, 21000 Novi Sad, Republic of Serbia
    edith@sezampro.rs

Abstract

A common problem when working with medical records is that some measurements are missing. The simplest and most common solution, especially in the machine learning domain, is to exclude records with incomplete data. This approach produces datasets with reduced statistical power and can even lead to biased or erroneous final results. There are, however, many proposed imputation methods for missing data. Although some of them, such as multiple imputation, are mature and well researched, they can be prone to misuse and are not always suitable for building complex frameworks. This paper explores neural networks as a potential tool for imputing univariate missing laboratory data during cardiometabolic risk assessment, comparing them to other simple methods that can be easily set up and used further in building predictive models. We have found that neural networks outperform the other algorithms across different fractions of missing data and different mechanisms causing the missingness.
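To make the comparison described in the abstract concrete, the following is a minimal sketch of univariate imputation on synthetic data. It is not the authors' setup. The data generator, the "missing completely at random" mask, the network size, the learning rate and all function names are assumptions chosen for illustration. One variable is hidden in 30% of the records, then filled in either with the mean of the observed values or with the prediction of a small neural network trained on the complete records.

```python
import math
import random


def make_records(n, rng):
    """Synthetic records (assumed): two observed predictors, one target."""
    records = []
    for _ in range(n):
        x1, x2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        y = 2.0 * x1 - x2 + 0.5 * x1 * x2 + rng.gauss(0, 0.1)
        records.append((x1, x2, y))
    return records


def train_mlp(rows, rng, hidden=8, lr=0.02, epochs=100):
    """Fit a one-hidden-layer tanh network with per-sample SGD on squared error."""
    # Each hidden unit has two input weights and a bias.
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(hidden)]
    # Output layer: one weight per hidden unit plus a bias (last entry).
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden + 1)]

    def forward(x1, x2):
        h = [math.tanh(w[0] * x1 + w[1] * x2 + w[2]) for w in w1]
        pred = sum(w2[j] * h[j] for j in range(hidden)) + w2[hidden]
        return h, pred

    rows = list(rows)
    for _ in range(epochs):
        rng.shuffle(rows)
        for x1, x2, y in rows:
            h, pred = forward(x1, x2)
            err = pred - y
            for j in range(hidden):
                # Backpropagate through tanh before touching w2[j].
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                w1[j][0] -= lr * grad_h * x1
                w1[j][1] -= lr * grad_h * x2
                w1[j][2] -= lr * grad_h
            w2[hidden] -= lr * err

    return lambda x1, x2: forward(x1, x2)[1]


def rmse(truth, estimate):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(sum((t - e) ** 2 for t, e in zip(truth, estimate)) / len(truth))


rng = random.Random(0)
records = make_records(300, rng)

# Assumed missingness mechanism: the target is missing completely at random.
missing = sorted(rng.sample(range(len(records)), 90))
missing_set = set(missing)
complete = [r for i, r in enumerate(records) if i not in missing_set]

# Baseline: replace every missing value with the mean of the observed values.
mean_y = sum(r[2] for r in complete) / len(complete)

# Neural network: learn the target from the other variables on complete records.
predict = train_mlp(complete, rng)

truth = [records[i][2] for i in missing]
mean_rmse = rmse(truth, [mean_y] * len(truth))
nn_rmse = rmse(truth, [predict(records[i][0], records[i][1]) for i in missing])

print(f"mean imputation RMSE: {mean_rmse:.3f}")
print(f"neural network RMSE:  {nn_rmse:.3f}")
```

Because the true values of the masked entries are known here, the two imputation errors can be compared directly. With real records, the same evaluation would require artificially masking values that were actually observed.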

Key words

missing data, cardiometabolic risk, artificial neural networks

Digital Object Identifier (DOI)

https://doi.org/10.2298/CSIS190710003V

Publication information

Volume 17, Issue 2 (June 2020)
Year of Publication: 2020
ISSN: 2406-1018 (Online)
Publisher: ComSIS Consortium

How to cite

Vrbaški, D., Kupusinac, A., Doroslovački, R., Stokić, E., Ivetić, D.: Missing Data Imputation in Cardiometabolic Risk Assessment: A Solution Based on Artificial Neural Networks. Computer Science and Information Systems, Vol. 17, No. 2, 379–401. (2020), https://doi.org/10.2298/CSIS190710003V