NWU Institutional Repository

Introducing XGL: a lexicalised probabilistic graphical lemmatiser for isiXhosa

dc.contributor.author: Mzamo, Lulamile
dc.contributor.author: Helberg, Albert
dc.contributor.author: Bosch, Sonja
dc.contributor.researchID: 12363626 - Helberg, Albertus Stephanus Jacobus
dc.date.accessioned: 2017-02-06T06:43:43Z
dc.date.available: 2017-02-06T06:43:43Z
dc.date.issued: 2015
dc.description.abstract: In this paper, a lexicalised probabilistic graphical lemmatiser for isiXhosa, XGL, is presented. An overview of isiXhosa lemmatisation issues is given, followed by a discussion of previous work in automated lemmatisation for isiXhosa. The paper then motivates the need for a machine learning lemmatiser for isiXhosa. The isiXhosa data used to train the lemmatiser is analysed, and the best features are identified from the analysis. The inner workings of XGL are detailed and evaluation results are presented. XGL is shown to achieve an accuracy of 83.19% on a gold standard of word-lemma pairs, thereby outperforming similar lemmatisers such as LemmaGen (80.6%) and the CST lemmatiser (73.13%) when trained with 35,000 word-lemma pairs. [en_US]
dc.identifier.citation: Mzamo, L. et al. 2015. Introducing XGL: a lexicalised probabilistic graphical lemmatiser for isiXhosa. Pattern Recognition Association of South Africa and Robotics and Mechatronics International Conference (PRASA-RobMech), 26-27 Nov. [https://doi.org/10.1109/RoboMech.2015.7359513] [en_US]
dc.identifier.isbn: 978-1-4673-7450-7 (Online)
dc.identifier.uri: http://hdl.handle.net/10394/19969
dc.identifier.uri: https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=7359513
dc.identifier.uri: https://doi.org/10.1109/RoboMech.2015.7359513
dc.language.iso: en [en_US]
dc.publisher: IEEE [en_US]
dc.subject: IsiXhosa [en_US]
dc.subject: Natural language processing [en_US]
dc.subject: Machine learning [en_US]
dc.subject: Lemmatisation [en_US]
dc.title: Introducing XGL: a lexicalised probabilistic graphical lemmatiser for isiXhosa [en_US]
dc.type: Presentation [en_US]

Files

License bundle

Name: license.txt
Size: 1.61 KB
Format: Item-specific license agreed upon to submission