Integrating Instance-level and Attribute-level Knowledge into Document Clustering

Jinlong Wang1, 3, Shunyao Wu1, Gang Li2 and Zhe Wei4, 5

  1. School of Computer Engineering, Qingdao Technological University
    266033 Qingdao, China
  2. School of Information Technology, Deakin University
    3125, Victoria, Australia
  3. Medical College of Qingdao University
    266021 Qingdao, China
  4. State Key Laboratory of CAD&CG, Zhejiang University
    310027 Hangzhou, China
  5. SANYHE International Holding Co., Ltd,
    110027 Shenyang, China


In this paper, we present a document clustering framework that incorporates instance-level knowledge in the form of pairwise constraints and attribute-level knowledge in the form of keyphrases. First, we initialize weights based on metric learning with pairwise constraints. We then learn the two kinds of knowledge simultaneously by combining the distance-based and the constraint-based approaches. Finally, we evaluate and select the clustering result based on the degree of users’ satisfaction. The experimental results demonstrate the effectiveness and potential of the proposed method.
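To make the two ingredients named in the abstract concrete, the following is a minimal sketch and not the paper's algorithm. It assumes documents are rows of a term matrix `X`, and that pairwise constraints are given as must-link and cannot-link index pairs. The weighting heuristic and the function names `init_weights`, `weighted_distance` and `violations` are hypothetical, chosen only to illustrate how constraints can shape attribute weights (distance-based view) and score a clustering (constraint-based view).

```python
import numpy as np

def init_weights(X, must_link, cannot_link, eps=1e-9):
    """Illustrative heuristic, NOT the paper's formula: an attribute gets a
    large weight when cannot-link pairs differ on it and must-link pairs agree."""
    d = X.shape[1]
    ml = np.full(d, eps)  # squared differences accumulated over must-link pairs
    cl = np.full(d, eps)  # squared differences accumulated over cannot-link pairs
    for i, j in must_link:
        ml += (X[i] - X[j]) ** 2
    for i, j in cannot_link:
        cl += (X[i] - X[j]) ** 2
    w = cl / ml
    return w / w.sum()  # normalize so the weights sum to 1

def weighted_distance(x, y, w):
    """Weighted Euclidean distance between two document vectors."""
    return float(np.sqrt(np.sum(w * (x - y) ** 2)))

def violations(labels, must_link, cannot_link):
    """Count constraints broken by a cluster assignment."""
    v = sum(labels[i] != labels[j] for i, j in must_link)
    v += sum(labels[i] == labels[j] for i, j in cannot_link)
    return int(v)

# Toy data: four documents, two term attributes (assumed for illustration).
X = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 0.0], [0.0, 1.0]])
must_link = [(0, 1), (2, 3)]
cannot_link = [(0, 2)]
w = init_weights(X, must_link, cannot_link)
```

In this toy setting the first attribute separates the cannot-link pair while the second varies only within must-link pairs, so the heuristic places nearly all weight on the first attribute. A clustering can then be scored by how many constraints it violates.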

Key words

document clustering, pairwise constraints, keyphrases

Publication information

Volume 8, Issue 3 (June 2011)
Year of Publication: 2011
ISSN: 1820-0214 (Print) 2406-1018 (Online)
Publisher: ComSIS Consortium

How to cite

Wang, J., Wu, S., Li, G., Wei, Z.: Integrating Instance-level and Attribute-level Knowledge into Document Clustering. Computer Science and Information Systems, Vol. 8, No. 3, 635-651. (2011)