Unsupervised acoustic model training: comparing South African English and isiZulu

Kleynhans, Neil; De Wet, Febe; Barnard, Etienne

dc.contributor.author	Kleynhans, Neil
dc.contributor.author	De Wet, Febe
dc.contributor.author	Barnard, Etienne
dc.contributor.researchID	21021287 - Barnard, Etienne
dc.date.accessioned	2018-03-02T13:10:06Z
dc.date.available	2018-03-02T13:10:06Z
dc.date.issued	2015
dc.description.abstract	Large amounts of untranscribed audio data are generated every day. These audio resources can be used to develop robust acoustic models that can be used in a variety of speech-based systems. Manually transcribing this data is resource intensive and requires funding, time and expertise. Lightly-supervised training techniques, however, provide a means to rapidly transcribe audio, thus reducing the initial resource investment to begin the modelling process. Our findings suggest that the lightly-supervised training technique works well for English but when moving to an agglutinative language, such as isiZulu, the process fails to achieve the performance seen for English. Additionally, phone-based performances are significantly worse when compared to an approach using word-based language models. These results indicate a strong dependence on large or well-matched text resources for lightly-supervised training techniques.	en_US
dc.description.sponsorship	Multilingual Speech Technologies, North-West University, Vanderbijlpark, South Africa; Human Language Technologies Research Group, Meraka Institute, CSIR, South Africa; Department of Electrical and Electronic Engineering, Stellenbosch University, South Africa	en_US
dc.identifier.citation	Neil Kleynhans, Febe de Wet and Etienne Barnard, “Unsupervised acoustic model training: comparing South African English and isiZulu”, in Proc. Annual Symp. Pattern Recognition Association of South Africa (PRASA), pp. 136-141, Port Elizabeth, South Africa, 2015. [http://engineering.nwu.ac.za/multilingual-speech-technologies-must/publications]	en_US
dc.identifier.uri	http://ieeexplore.ieee.org/document/7359512/
dc.identifier.uri	https://researchspace.csir.co.za/dspace/handle/10204/8629
dc.identifier.uri	http://hdl.handle.net/10394/26490
dc.language.iso	en	en_US
dc.publisher	IEEE	en_US
dc.subject	Lightly-supervised training	en_US
dc.subject	Unsupervised training	en_US
dc.subject	Automatic transcription generation	en_US
dc.subject	Audio harvesting	en_US
dc.subject	English, isiZulu	en_US
dc.title	Unsupervised acoustic model training: comparing South African English and isiZulu	en_US
dc.type	Presentation	en_US

Files

Original bundle

Name: kleynhans-2015-model-training.pdf
Size: 111.76 KB
Format: Adobe Portable Document Format
Description: kleynhans-2015-model-training

Collections

Faculty of Engineering