Search

Now showing items 1-10 of 15

The NCHLT Speech Corpus of the South African languages

Barnard, Etienne; Davel, Marelie H.; van Heerden, Charl; De Wet, Febe; Badenhorst, Jaco (Workshop Spoken Language Technologies for Under-resourced Languages (SLTU), 2014)

The NCHLT speech corpus contains wide-band speech from approximately 200 speakers per language, in each of the eleven official languages of South Africa. We describe the design and development processes that were ...

Wolof Speech Recognition Model of Digits and Limited-Vocabulary Based on HMM and ToolKit

Tamgno, James K.; Barnard, Etienne; Lishou, Claude; Richomme, Morgan (Computer Modelling and Simulation (UKSim), 2012 UKSim 14th International Conference on, 2012)

This paper is concerned with Automatic Speech Recognition (ASR) using trainable systems. The aim of this work is to build acoustic models for spoken language Wolof. This is done by employing Hidden Markov Models (HMM) and ...

G2P variant prediction techniques for ASR and STD

Davel, Marelie H.; van Heerden, Charl; Barnard, Etienne (Interspeech 2013, 2013)

Introducing pronunciation variants into a lexicon is a balancing act: incorporating necessary variants can improve automatic speech recognition (ASR) and spoken term detection (STD) performance by capturing some of the ...

The South African directory enquiries (SADE) name corpus

Thirion, Jan Willem Frederick; Van Heerden, Charl Johannes; Giwa, Oluwapelumi; Davel, Marelie Hattingh (Springer, 2020)

We present the design and development of a South African directory enquiries (DE) corpus. It contains audio and orthographic transcriptions of a wide range of South African names produced by first language speakers of four ...

The South African directory enquiries (SADE) name corpus

Thirion, Jan W.F.; Van Heerden, Charl; Giwa, Oluwapelumi; Davel, Marelie H. (Springer, 2019)

We present the design and development of a South African directory enquiries corpus. It contains audio and orthographic transcriptions of a wide range of South African names produced by first-language speakers of four ...

Category-based phoneme-to-grapheme transliteration

Basson, Willem D.; Davel, Marelie H. (International Speech Communication Association (ISCA), 2013)

Grapheme-based speech recognition systems are faster to develop but typically do not reach the same level of performance as phoneme-based systems. In this paper we introduce a technique for improving the performance of ...

Collecting and evaluating speech recognition corpora for 11 South African languages

Badenhorst, Jaco; Van Heerden, Charl; Barnard, Etienne; Davel, Marelie H. (Springer, 2011)

We describe the Lwazi corpus for automatic speech recognition (ASR), a new telephone speech corpus which contains data from the eleven official languages of South Africa. Because of practical constraints, the amount of ...

Implications of Sepedi/English code switching for ASR systems

Modipa, Thipe I.; De Wet, Febe; Davel, Marelie H. (Pattern Recognition Association of South Africa (PRASA), 2013)

Code switching (the process of switching from one language to another during a conversation) is a common phenomenon in multilingual environments. Where a minority and dominant language coincide, code switching from the ...

Comparing grapheme-based and phoneme-based speech recognition for Afrikaans

Basson, Willem D.; Davel, Marelie H. (PRASA, 2012)

This paper compares the recognition accuracy of a phoneme-based automatic speech recognition system with that of a grapheme-based system, using Afrikaans as case study. The first system is developed using a conventional ...

Medium-vocabulary speech recognition for under-resourced languages

Van Heerden, Charl J.; Barnard, Etienne; Davel, Marelie H. (SLTU, 2012)

We report on the development of speech-recognition systems that are able to perform accurate recognition on medium-vocabulary tasks (i.e. tasks that require distinctions between approximately 200 different terms). We are ...
