NWU Institutional Repository

Towards an unsupervised morphological segmenter for isiXhosa

Authors

Mzamo, Lulamile
Helberg, Albert
Bosch, Sonja

Publisher

IEEE

Abstract

In this paper, branching entropy techniques and isiXhosa language heuristics are adapted to develop unsupervised morphological segmenters for isiXhosa. An overview of isiXhosa segmentation issues is given, followed by a discussion of previous work in automated segmentation, and in the segmentation of isiXhosa in particular. Two unsupervised isiXhosa segmenters are presented and compared to a random minimum baseline and to Morfessor-Baseline, a standard in unsupervised word segmentation. Morfessor-Baseline outperforms both isiXhosa segmenters, reaching 79.10% boundary identification accuracy. The performance of the IsiXhosa Branching Entropy Segmenter (XBES) varies with the segmentation mode used, peaking at 73.39%. The IsiXhosa Heuristic Maximum Likelihood Segmenter (XHMLS) achieves 72.42%. The study suggests that unsupervised morphological segmentation of isiXhosa is feasible, given better optimization of the current approach.

Citation

Mzamo, L. et al. 2019. Towards an unsupervised morphological segmenter for isiXhosa. Proceedings, 2019 Southern African Universities Power Engineering Conference/Robotics and Mechatronics/Pattern Recognition Association of South Africa (SAUPEC/RobMech/PRASA), Bloemfontein, South Africa, 28-30 Jan. Article no 8704816:166-170. [https://doi.org/10.1109/RoboMech.2019.8704816]
