DOI: 10.2298/CSIS101011023K

Data Extraction and Annotation Based on Domain-specific Ontology Evolution for Deep Web

Chen Kerui1,2, Zuo Wanli1, He Fengling1, Chen Yongheng1 and Wang Ying1

  1. College of Computer Science and Technology, Jilin University
    Changchun 130012, China
  2. School of Computer Science and Technology, Changchun University of Science and Technology
    Changchun 130022, China
    Wanli Zuo wanli@jlu.edu.cn

Abstract

The deep web responds to a user query with result records encoded in HTML pages. Data extraction and data annotation, which are important for many applications, extract and annotate these records from the HTML pages. We propose a data extraction and annotation technique based on a domain-specific ontology. We first construct a mini-ontology for a specific domain from the information in the query interface and the query result pages. We then use the constructed mini-ontology to identify data areas and to map data annotations during data extraction. To adapt to new sample sets, the mini-ontology evolves dynamically based on the results of data extraction and data annotation. Experimental results demonstrate that this method has higher precision and recall in data extraction and data annotation.
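
As a rough illustration of the annotate-then-evolve loop described above, the following is a minimal sketch and not the authors' implementation. It assumes a mini-ontology can be modelled as a mapping from concept names to known attribute labels and instance values; the names `Concept`, `annotate`, and `evolve` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    # Hypothetical structure: labels from query interfaces,
    # instance values observed in query result pages.
    labels: set = field(default_factory=set)
    values: set = field(default_factory=set)


def annotate(record, ontology):
    """Map each extracted value to the first concept that already knows it.

    Values with no matching concept are annotated with None.
    """
    annotated = {}
    for value in record:
        key = value.strip().lower()
        match = next(
            (name for name, concept in ontology.items() if key in concept.values),
            None,
        )
        annotated[value] = match
    return annotated


def evolve(ontology, confirmed):
    """Add confirmed (value -> concept name) pairs so later pages can reuse them."""
    for value, name in confirmed.items():
        ontology.setdefault(name, Concept()).values.add(value.strip().lower())
```

In this sketch, a value that cannot be annotated on one result page becomes recognizable on later pages once it has been confirmed and fed back through `evolve`, which is the simplest form of the dynamic adaptation the abstract mentions.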

Key words

Deep Web, Data Extraction, Data Annotation, Domain Ontology, Ontology Evolution

Digital Object Identifier (DOI)

https://doi.org/10.2298/CSIS101011023K

Publication information

Volume 8, Issue 3 (June 2011)
Year of Publication: 2011
ISSN: 1820-0214 (Print) 2406-1018 (Online)
Publisher: ComSIS Consortium

How to cite

Kerui, C., Wanli, Z., Fengling, H., Yongheng, C., Ying, W.: Data Extraction and Annotation Based on Domain-specific Ontology Evolution for Deep Web. Computer Science and Information Systems, Vol. 8, No. 3, 673-692. (2011), https://doi.org/10.2298/CSIS101011023K