An evaluation of keyword, string similarity and very shallow syntactic matching for a university admissions processing infobot

Peter Hancox¹ and Nikolaos Polatidis²

  1. School of Computer Science, University of Birmingham
    Edgbaston, Birmingham, B15 2TT, United Kingdom
    pjh@cs.bham.ac.uk
  2. Department of Applied Informatics, University of Macedonia
    156 Egnatia Street, 54006, Thessaloniki, Greece
    npolatidis@uom.edu.gr

Abstract

“Infobots” are small-scale natural language question answering systems drawing inspiration from ELIZA-type systems. Their key distinguishing feature is the extraction of meaning from users’ queries without the use of syntactic or semantic representations. Three approaches to identifying the users’ intended meanings were investigated: keyword-based systems, Jaro-based string similarity algorithms and matching based on very shallow syntactic analysis. These were measured against a corpus of queries contributed by users of a WWW-hosted infobot for responding to questions about applications to MSc courses. The most effective system was Jaro with stemmed input (78.57%). It was also able to process ungrammatical input and to offer scalability.
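
As an illustration of the string similarity approach named in the abstract, the following is a minimal sketch of the standard Jaro similarity measure, used to pick the closest stored question for a user query. It is not the authors' implementation: the function names `jaro` and `best_match`, the sample questions and the lower-casing step are assumptions made for this example, and the stemming step mentioned in the abstract is omitted.

```python
def jaro(s1: str, s2: str) -> float:
    """Standard Jaro similarity between two strings, in the range [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters count as matching only if they lie within this distance.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0

    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break

    if matches == 0:
        return 0.0

    # Transpositions: matched characters that appear in a different order,
    # counted in pairs.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2

    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def best_match(query: str, known_questions: list[str]) -> str:
    """Return the stored question most similar to the query (hypothetical helper)."""
    q = query.lower().strip()
    return max(known_questions, key=lambda k: jaro(q, k.lower()))


# Hypothetical stored questions, for illustration only.
questions = ["how do i apply", "what are the fees"]
print(best_match("How do I aply?", questions))
```

Because the comparison works on characters rather than on a parse of the sentence, a misspelt or ungrammatical query can still score highly against the nearest stored question.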

Key words

chatbot, infobot, question-answering, Jaro string similarity, Jaro-Winkler string similarity, shallow syntactic processing

Digital Object Identifier (DOI)

https://doi.org/10.2298/CSIS121202065H

Publication information

Volume 10, Issue 4 (October 2013)
Special Issue on Advances in Model Driven Engineering, Languages and Agents
Year of Publication: 2013
ISSN: 2406-1018 (Online)
Publisher: ComSIS Consortium

How to cite

Hancox, P., Polatidis, N.: An evaluation of keyword, string similarity and very shallow syntactic matching for a university admissions processing infobot. Computer Science and Information Systems, Vol. 10, No. 4, 1703-1726. (2013), https://doi.org/10.2298/CSIS121202065H