Imbalanced Data Classification Based on Hybrid Resampling and Twin Support Vector Machine
- School of Data Science and Computer Science
Sun Yat-sen University, Guangzhou, China
caolu20001742@163.com, hongsh01@gmail.com
- School of Computer Science
University of Adelaide, Australia
- School of Information Engineering
Wuyi University, Jiangmen, China
Abstract
Imbalanced datasets are common in real-world applications, and identifying the minority class in such datasets is usually the focus of classification. The twin support vector machine (TWSVM), an enhanced variant of the support vector machine (SVM), provides an effective technique for data classification. However, TWSVM relies on a relatively balanced training set and distribution to improve classification accuracy over the whole dataset, so it is not effective on imbalanced data classification problems. In this paper, we propose combining TWSVM with a resampling technique that uses both over-sampling and under-sampling to balance the training data. Experimental results show that the proposed approach outperforms other state-of-the-art methods.
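The balancing step described above can be illustrated with a minimal sketch. The abstract does not specify how samples are generated or discarded, so plain random over-sampling and random under-sampling are used here as stand-ins; this is not the authors' procedure. The function name `hybrid_resample` and the parameter `target_per_class` are hypothetical, and the TWSVM classifier itself is left out.

```python
import numpy as np

def hybrid_resample(X, y, minority_label, target_per_class, seed=0):
    """Balance a binary dataset by random over- and under-sampling.

    Assumption (not from the paper): target_per_class lies between the
    minority class size and the majority class size.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X)
    y = np.asarray(y)
    min_idx = np.flatnonzero(y == minority_label)
    maj_idx = np.flatnonzero(y != minority_label)
    if not (len(min_idx) <= target_per_class <= len(maj_idx)):
        raise ValueError("target_per_class must lie between the class sizes")

    # Over-sample: keep every minority sample, then add random duplicates
    # (drawn with replacement) until the target size is reached.
    extra = rng.choice(min_idx, size=target_per_class - len(min_idx), replace=True)
    min_pick = np.concatenate([min_idx, extra])

    # Under-sample: keep a random subset of the majority class
    # (drawn without replacement) of the target size.
    maj_pick = rng.choice(maj_idx, size=target_per_class, replace=False)

    picked = np.concatenate([min_pick, maj_pick])
    rng.shuffle(picked)  # mix the two classes before training
    return X[picked], y[picked]
```

With 2 minority and 8 majority samples and `target_per_class=5`, the returned training set would hold 5 samples of each class, which could then be passed to any classifier, a TWSVM implementation included.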
Key words
over-sampling, under-sampling, imbalanced dataset, TWSVM, classification
Digital Object Identifier (DOI)
https://doi.org/10.2298/CSIS161221017L
Publication information
Volume 20, Issue 1 (January 2023)
Year of Publication: 2023
ISSN: 2406-1018 (Online)
Publisher: ComSIS Consortium
How to cite
Cao, L., Shen, H.: Imbalanced Data Classification Based on Hybrid Resampling and Twin Support Vector Machine. Computer Science and Information Systems, Vol. 20, No. 1, 579–595. (2023), https://doi.org/10.2298/CSIS161221017L