The NCHLT Speech Corpus of the South African languages
(Workshop Spoken Language Technologies for Under-resourced Languages (SLTU), 2014)
The NCHLT speech corpus contains wide-band speech from approximately
200 speakers per language, in each of the eleven
official languages of South Africa. We describe the design and
development processes that were ...
G2P variant prediction techniques for ASR and STD
(Interspeech 2013, 2013)
Introducing pronunciation variants into a lexicon is a balancing
act: incorporating necessary variants can improve automatic
speech recognition (ASR) and spoken term detection (STD)
performance by capturing some of the ...
Efficient harvesting of Internet audio for resource-scarce ASR
(Interspeech 2011, 2011)
Spoken recordings that have been transcribed for human reading
(e.g. as captions for audiovisual material, or to provide alternative
modes of access to recordings) are widely available in many
languages. Such recordings ...
Performance analysis of a multilingual directory enquiries application
(Pattern Recognition Association of South Africa and Mechatronics International Conference, 2014)
In a multilingual society such as South Africa, a
practical directory enquiries (DE) application should be able to
serve users from various language backgrounds with information
relating to names in various languages: ...