NWU Institutional Repository

Towards an unsupervised morphological segmenter for isiXhosa

dc.contributor.author: Mzamo, Lulamile
dc.contributor.author: Helberg, Albert
dc.contributor.author: Bosch, Sonja
dc.contributor.researchID: 12363626 - Helberg, Albertus Stephanus Jacobus
dc.contributor.researchID: 24827304 - Mzamo, Lulamile
dc.date.accessioned: 2019-06-07T07:01:52Z
dc.date.available: 2019-06-07T07:01:52Z
dc.date.issued: 2019
dc.description.abstract: In this paper, branching entropy techniques and isiXhosa language heuristics are adapted to develop unsupervised morphological segmenters for isiXhosa. An overview of isiXhosa segmentation issues is given, followed by a discussion on previous work in automated segmentation, and segmentation of isiXhosa in particular. Two unsupervised isiXhosa segmenters are presented and compared to a random minimum baseline and Morfessor-Baseline, a standard in unsupervised word segmentation. Morfessor-Baseline outperforms both isiXhosa segmenters at 79.10% boundary identification accuracy. The IsiXhosa Branching Entropy Segmenter (XBES) performance varies depending on the segmentation mode used, with a maximum of 73.39%. The IsiXhosa Heuristic Maximum Likelihood Segmenter (XHMLS) achieves 72.42%. The study suggests that unsupervised isiXhosa morphological segmentation is feasible with better optimization of the current attempt. [en_US]
dc.identifier.citation: Mzamo, L. et al. 2019. Towards an unsupervised morphological segmenter for isiXhosa. Proceedings, 2019 Southern African Universities Power Engineering Conference/Robotics and Mechatronics/Pattern Recognition Association of South Africa (SAUPEC/RobMech/PRASA), Bloemfontein, South Africa, 28-30 Jan. Article no 8704816:166-170. [https://doi.org/10.1109/RoboMech.2019.8704816] [en_US]
dc.identifier.issn: 978-1-7281-0369-3 (Online)
dc.identifier.uri: http://hdl.handle.net/10394/32603
dc.identifier.uri: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8704816
dc.identifier.uri: https://doi.org/10.1109/RoboMech.2019.8704816
dc.language.iso: en [en_US]
dc.publisher: IEEE [en_US]
dc.subject: Natural language processing [en_US]
dc.subject: Unsupervised machine learning [en_US]
dc.subject: Morphological segmentation [en_US]
dc.subject: isiXhosa [en_US]
dc.title: Towards an unsupervised morphological segmenter for isiXhosa [en_US]
dc.type: Presentation [en_US]

Files

Original bundle

Name: Towards_an_unsupervised.pdf
Size: 440.78 KB
Format: Adobe Portable Document Format
Description:

License bundle

Name: license.txt
Size: 1.61 KB
Format: Item-specific license agreed upon to submission
Description: