HMC-ReliefF: Feature Ranking for Hierarchical Multi-label Classification

Ivica Slavkov (1, 2), Jana Karcheska (3), Dragi Kocev (4, 5) and Sašo Džeroski (4, 5)

  1. Centre for Genomic Regulation (CRG)
    Barcelona, Spain
  2. Universitat Pompeu Fabra (UPF)
    Barcelona, Spain
    ivica.slavkov@crg.eu
  3. University Ss. Cyril and Methodius
    Skopje, Macedonia
    j.karcheska@gmail.com
  4. Department of Knowledge Technologies, Jožef Stefan Institute
    Ljubljana, Slovenia
  5. Jožef Stefan International Postgraduate School
    Ljubljana, Slovenia
    {dragi.kocev,saso.dzeroski}@ijs.si

Abstract

In machine learning, the growing complexity of available data makes its analysis increasingly challenging. The data are becoming both more high-dimensional and more intricately structured, which emphasizes the need for machine learning algorithms that can handle both the high dimensionality and the complex structure. This paper focuses on the development and analysis of HMC-ReliefF, a feature relevance (ranking) algorithm for the task of Hierarchical Multi-label Classification (HMC). The algorithm is based on the RReliefF algorithm for regression, adapted to hierarchical multi-label target variables. We perform an extensive experimental investigation of HMC-ReliefF on several datasets from the domains of image annotation and functional genomics. We analyse the algorithm’s performance in terms of accuracy in a filter-like setting and in terms of ranking stability for various parameter values. The results show that HMC-ReliefF can successfully detect relevant features, which can then be used to construct accurate predictive models. Additionally, the stability analysis helps to determine the parameter values that yield output that is not only accurate but also stable.

Key words

feature selection, feature ranking, structured data, hierarchical multi-label classification, ReliefF

Digital Object Identifier (DOI)

https://doi.org/10.2298/CSIS170115043S

Publication information

Volume 15, Issue 1 (January 2018)
Year of Publication: 2018
ISSN: 2406-1018 (Online)
Publisher: ComSIS Consortium

How to cite

Slavkov, I., Karcheska, J., Kocev, D., Džeroski, S.: HMC-ReliefF: Feature Ranking for Hierarchical Multi-label Classification. Computer Science and Information Systems, Vol. 15, No. 1, 187–209. (2018), https://doi.org/10.2298/CSIS170115043S