Towards an unsupervised morphological segmenter for isiXhosa

Date
2019
Author
Mzamo, Lulamile
Helberg, Albert
Bosch, Sonja
Abstract
In this paper, branching entropy techniques and
isiXhosa language heuristics are adapted to develop unsupervised
morphological segmenters for isiXhosa. An overview of isiXhosa
segmentation issues is given, followed by a discussion of previous
work on automated segmentation, and on segmentation of isiXhosa
in particular. Two unsupervised isiXhosa segmenters are
presented and compared to a random minimum baseline and
Morfessor-Baseline, a standard in unsupervised word
segmentation. Morfessor-Baseline outperforms both isiXhosa
segmenters at 79.10% boundary identification accuracy. The
IsiXhosa Branching Entropy Segmenter (XBES) performance
varies depending on the segmentation mode used, with a
maximum of 73.39%. The IsiXhosa Heuristic Maximum
Likelihood Segmenter (XHMLS) achieves 72.42%. The study
suggests that unsupervised isiXhosa morphological segmentation
is feasible, given better optimization of the current approach.
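
As a rough illustration of the branching entropy idea named in the abstract, the sketch below cuts a word wherever the next character becomes hard to predict from the preceding prefix. It is not the XBES implementation: the toy word list, the threshold of 1.0 bit, the forward-only direction, and the per-position definition of boundary accuracy are all assumptions made here for illustration, and every function name is hypothetical.

```python
import math
from collections import Counter, defaultdict


def successor_entropy(words):
    """Map each word prefix to the entropy (in bits) of the character that follows it."""
    followers = defaultdict(Counter)
    for word in words:
        for i in range(1, len(word)):
            followers[word[:i]][word[i]] += 1
    entropy = {}
    for prefix, counts in followers.items():
        total = sum(counts.values())
        entropy[prefix] = -sum(
            (c / total) * math.log2(c / total) for c in counts.values()
        )
    return entropy


def segment(word, entropy, threshold=1.0):
    """Cut after every prefix whose successor entropy reaches the threshold (assumed rule)."""
    pieces, start = [], 0
    for i in range(1, len(word)):
        if entropy.get(word[:i], 0.0) >= threshold:
            pieces.append(word[start:i])
            start = i
    pieces.append(word[start:])
    return pieces


def boundary_positions(pieces):
    """Return the character offsets at which a segmentation places a boundary."""
    positions, offset = set(), 0
    for piece in pieces[:-1]:
        offset += len(piece)
        positions.add(offset)
    return positions


def boundary_accuracy(predicted, reference):
    """Share of internal positions where both segmentations agree (assumed metric)."""
    length = sum(len(piece) for piece in reference)
    if length < 2:
        return 1.0  # a one-character word has no internal positions to judge
    pred, ref = boundary_positions(predicted), boundary_positions(reference)
    slots = range(1, length)
    return sum((i in pred) == (i in ref) for i in slots) / len(slots)


# Toy word list for illustration only; not the corpus used in the paper.
corpus = ["ndiyahamba", "ndiyabona", "uyahamba", "uyabona", "siyahamba", "siyabona"]
entropy = successor_entropy(corpus)
print(segment("ndiyahamba", entropy))  # ['ndiya', 'hamba']
```

With this toy list, the only point of uncertainty after "ndiya" is whether "h" or "b" follows, so a single cut is placed there. A segmenter of this kind would normally also need a backward pass and a tuned threshold to recover shorter morphemes.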
URI
http://hdl.handle.net/10394/32603
https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8704816
https://doi.org/10.1109/RoboMech.2019.8704816