A distributed approach to speech resource collection
Authors
Molapo, Raymond
Barnard, Etienne
De Wet, Febe
Publisher
Pattern Recognition Association of South Africa (PRASA)
Abstract
We describe the integration of several tools to enable the end-to-end development of an Automatic Speech Recognition (ASR) system in a typical under-resourced language. Google App Engine is employed as the core environment for data verification, storage and distribution, and is used in conjunction with existing tools for gathering text and recording speech data. We analyse the data acquired by each of the tools and develop an ASR system in Shona, an important under-resourced language of Southern Africa. Although unexpected logistical problems complicated the process, we were able to collect a usable Shona speech corpus for the development of the first ASR system in that language.
Citation
Molapo, R., Barnard, E. & De Wet, F. 2013. A distributed approach to speech resource collection. In: Conference Proceedings of the 24th Annual Symposium of the Pattern Recognition Association of South Africa. Pretoria, South Africa, pp. 70-75. [http://www.prasa.org/]