A distributed approach to speech resource collection

Molapo, Raymond; Barnard, Etienne; De Wet, Febe

Date

2013

Abstract

We describe the integration of several tools to enable the end-to-end development of an Automatic Speech Recognition (ASR) system in a typical under-resourced language. Google App Engine is employed as the core environment for data verification, storage and distribution, and is used in conjunction with existing tools for gathering text and for recording speech data. We analyse the data acquired by each of the tools and develop an ASR system in Shona, an important under-resourced language of Southern Africa. Although unexpected logistical problems complicated the process, we were able to collect a usable Shona speech corpus for the development of the first ASR system in that language.

URI

http://hdl.handle.net/10394/12117

Collections

Conference Papers - Vaal Triangle Campus
Faculty of Natural and Agricultural Sciences