Towards an unsupervised morphological segmenter for isiXhosa
| dc.contributor.author | Mzamo, Lulamile | |
| dc.contributor.author | Helberg, Albert | |
| dc.contributor.author | Bosch, Sonja | |
| dc.contributor.researchID | 12363626 - Helberg, Albertus Stephanus Jacobus | |
| dc.contributor.researchID | 24827304 - Mzamo, Lulamile | |
| dc.date.accessioned | 2019-06-07T07:01:52Z | |
| dc.date.available | 2019-06-07T07:01:52Z | |
| dc.date.issued | 2019 | |
| dc.description.abstract | In this paper, branching entropy techniques and isiXhosa language heuristics are adapted to develop unsupervised morphological segmenters for isiXhosa. An overview of isiXhosa segmentation issues is given, followed by a discussion of previous work in automated segmentation, and segmentation of isiXhosa in particular. Two unsupervised isiXhosa segmenters are presented and compared to a random minimum baseline and Morfessor-Baseline, a standard in unsupervised word segmentation. Morfessor-Baseline outperforms both isiXhosa segmenters at 79.10% boundary identification accuracy. The IsiXhosa Branching Entropy Segmenter (XBES) performance varies depending on the segmentation mode used, with a maximum of 73.39%. The IsiXhosa Heuristic Maximum Likelihood Segmenter (XHMLS) achieves 72.42%. The study suggests that unsupervised isiXhosa morphological segmentation is feasible with better optimization of the current attempt. | en_US |
| dc.identifier.citation | Mzamo, L. et al. 2019. Towards an unsupervised morphological segmenter for isiXhosa. Proceedings, 2019 Southern African Universities Power Engineering Conference/Robotics and Mechatronics/Pattern Recognition Association of South Africa (SAUPEC/RobMech/PRASA), Bloemfontein, South Africa, 28-30 Jan. Article no 8704816:166-170. [https://doi.org/10.1109/RoboMech.2019.8704816] | en_US |
| dc.identifier.issn | 978-1-7281-0369-3 (Online) | |
| dc.identifier.uri | http://hdl.handle.net/10394/32603 | |
| dc.identifier.uri | https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8704816 | |
| dc.identifier.uri | https://doi.org/10.1109/RoboMech.2019.8704816 | |
| dc.language.iso | en | en_US |
| dc.publisher | IEEE | en_US |
| dc.subject | Natural language processing | en_US |
| dc.subject | Unsupervised machine learning | en_US |
| dc.subject | Morphological segmentation | en_US |
| dc.subject | isiXhosa | en_US |
| dc.title | Towards an unsupervised morphological segmenter for isiXhosa | en_US |
| dc.type | Presentation | en_US |
