Establishing the reliability of natural language processing evaluation through linear regression modelling

Eiselen, Ernst Roald

Files

Eiselen_ER.pdf (1.96 MB)

Date

2013

Authors

Eiselen, Ernst Roald

Researcher ID

10215484 - Van Huyssteen, Gerhardus Beukes (Supervisor)

Supervisors

Van Huyssteen, Gerhard B
Hickl, L.T.

Publisher

North-West University

Abstract

Determining the quality of natural language applications is one of the most important aspects of technology development. There has, however, been very little work on establishing how well evaluation methods and measures represent the quality of the technology, or how reliable the evaluation results presented in most research are. This study presents a new stepwise evaluation reliability methodology: a step-by-step framework for creating predictive models of evaluation metric reliability that take inherent evaluation variables into account. Based on the variables present in the evaluation data, these models can be used to predict how reliable a particular evaluation will be before it is carried out, so that evaluators can adjust the evaluation data to ensure reliable results. Furthermore, this permits researchers to compare results when the same evaluation data is not available. The new methodology is first applied to a well-defined technology, namely spelling checkers, with a detailed discussion of the evaluation techniques and statistical procedures required to accurately model an evaluation. The spelling checker evaluations are investigated in more detail to show how individual variables affect the evaluation results. Finally, a predictive regression model for each of the spelling checker evaluations is created and validated to verify the accuracy of its predictive capability. After the in-depth analysis and application of the stepwise evaluation reliability methodology on spelling checkers, the methodology is applied to two more technologies, namely part of speech tagging and named entity recognition. These validation procedures are applied across multiple languages, specifically Dutch, English, Spanish and Iberian Portuguese. Performing these additional evaluations shows that the methodology is applicable to a broader set of technologies across multiple languages.
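
The core idea of predicting evaluation reliability from a property of the evaluation data can be illustrated with a minimal sketch. It is not taken from the thesis: it assumes a single hypothetical evaluation variable (the log of the test-set size) and a single hypothetical reliability measure (the spread of a metric score across repeated samples), and it fits an ordinary least-squares line to synthetic values. The thesis itself describes stepwise models over several evaluation variables; the names `fit_line`, `predict`, `log10_test_size` and `score_spread` are illustrative assumptions.

```python
from statistics import fmean


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    mean_x, mean_y = fmean(xs), fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("predictor values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x):
    """Predicted reliability measure for a new value of the evaluation variable."""
    return intercept + slope * x


# Synthetic, illustrative values only (NOT results from the thesis):
# a hypothetical evaluation variable and a hypothetical reliability measure
# generated from an exact straight line so the fit is easy to check.
log10_test_size = [2.0, 2.5, 3.0, 3.5, 4.0]
score_spread = [0.10 - 0.02 * x for x in log10_test_size]

intercept, slope = fit_line(log10_test_size, score_spread)

# Estimate the spread for a planned evaluation before running it.
predicted_spread = predict(intercept, slope, 3.2)
print(f"intercept={intercept:.3f} slope={slope:.3f} predicted={predicted_spread:.3f}")
```

In this simplified setting, an evaluator could compare the predicted spread against an acceptable threshold and enlarge the test set before running the evaluation if the prediction is too high. A model with several predictors would need a multiple-regression fit and a variable-selection step, which this sketch does not attempt.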

Description

Thesis (PhD (Linguistics and Literary Theory))--North-West University, Potchefstroom Campus, 2013.

Keywords

Evaluation, methodology, natural language processing, reliability, regression modelling

URI

http://hdl.handle.net/10394/9650

Collections

Humanities
