The Spoken Web Search task at MediaEval 2012

Metze, Florian; Anguera, Xavier; Barnard, Etienne; Gravier, Guillaume; Davel, Marelie H. (Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on, 2013)

In this paper, we describe the “Spoken Web Search” Task, which was held as part of the 2012 MediaEval benchmark evaluation campaign. The purpose of this task was to perform audio search with audio input in four languages, ...

The NCHLT Speech Corpus of the South African languages

Barnard, Etienne; Davel, Marelie H.; van Heerden, Charl; De Wet, Febe; Badenhorst, Jaco (Workshop Spoken Language Technologies for Under-resourced Languages (SLTU), 2014)

The NCHLT speech corpus contains wide-band speech from approximately 200 speakers per language, in each of the eleven official languages of South Africa. We describe the design and development processes that were ...

Stride and translation invariance in CNNs

Mouton, Coenraad; Myburgh, Johannes C.; Davel, Marelie H. (Southern African Conference for Artificial Intelligence Research, 2020)

Convolutional Neural Networks have become the standard for image classification tasks; however, these architectures are not invariant to translations of the input image. This lack of invariance is attributed to the use of ...

Tracking translation invariance in CNNs

Myburgh, Johannes C.; Mouton, Coenraad; Davel, Marelie H. (Southern African Conference for Artificial Intelligence Research, 2020)

Although Convolutional Neural Networks (CNNs) are widely used, their translation invariance (ability to deal with translated inputs) is still subject to some controversy. We explore this question using translation-sensitivity ...

Language Independent Search in MediaEval's Spoken Web Search Task

Metze, Florian; Anguera, Xavier; Barnard, Etienne; Gravier, Guillaume; Davel, Marelie H. (Elsevier Ltd., 2014)

In this paper, we describe several approaches to language-independent spoken term detection and compare their performance on a common task, namely “Spoken Web Search”. The goal of this part of the MediaEval initiative is ...

G2P variant prediction techniques for ASR and STD

Davel, Marelie H.; van Heerden, Charl; Barnard, Etienne (Interspeech 2013, 2013)

Introducing pronunciation variants into a lexicon is a balancing act: incorporating necessary variants can improve automatic speech recognition (ASR) and spoken term detection (STD) performance by capturing some of the ...

The semi-automated creation of stratified speech corpora

Van Heerden, Carel; Barnard, Etienne; Davel, Marelie H. (Pattern Recognition Association of South Africa (PRASA), 2013)

Smartphones provide an efficient means for the collection of speech data; however, the quality of the corpora created in this fashion is not predictable. We describe an approach that allows us to post-process and rank ...

The effect of language identification accuracy on speech recognition accuracy of proper names

Giwa, Oluwapelumi; Davel, Marelie H. (Pattern Recognition Association of South Africa and Mechatronics International Conference, 2017)

Utilising the known language of origin of a name can be useful when predicting the pronunciation of the name. When this language is not known, automatic language identification (LID) can be used to influence which ...

Exploring neural network training dynamics through binary node activations

Haasbroek, Daniël G.; Davel, Marelie H. (Southern African Conference for Artificial Intelligence Research, 2020)

Each node in a neural network is trained to activate for a specific region in the input domain. Any training samples that fall within this domain are therefore implicitly clustered together. Recent work has highlighted ...

Solar flare prediction with temporal convolutional networks

Krynauw, Dewald D.; Davel, Marelie H.; Lotz, Stefan (In Proc. South African Forum for Artificial Intelligence Research (FAIR2019), 2019-12)

Sequences are typically modelled with recurrent architectures, but growing research is finding convolutional architectures to also work well for sequence modelling [1]. We explore the performance of Temporal Convolutional ...