Tone realisation for speech synthesis of Yorùbá

Van Niekerk, Daniel Rudolph

dc.contributor.advisor	Barnard, Etienne
dc.contributor.author	Van Niekerk, Daniel Rudolph
dc.date.accessioned	2015-01-28T08:01:27Z
dc.date.available	2015-01-28T08:01:27Z
dc.date.issued	2014
dc.identifier.uri	http://hdl.handle.net/10394/13054
dc.description	PhD (Information Technology), North-West University, Vaal Triangle Campus, 2014	en_US
dc.description.abstract	Speech technologies such as text-to-speech synthesis (TTS) and automatic speech recognition (ASR) have recently generated much interest in the developed world as a user-interface medium to smartphones [1, 2]. However, it is also recognised that these technologies may potentially have a positive impact on the lives of those in the developing world, especially in Africa, by presenting an important medium for access to information where illiteracy and a lack of infrastructure play a limiting role [3, 4, 5, 6]. While these technologies continually experience important advances that keep extending their applicability to new and under-resourced languages, one particular area in need of further development is speech synthesis of African tone languages [7, 8]. The main objective of this work is acoustic modelling and synthesis of tone for an African tone language: Yorùbá. We present an empirical investigation to establish the acoustic properties of tone in Yorùbá, and to evaluate resulting models integrated into a Hidden Markov model-based (HMM-based) TTS system. We show that in Yorùbá, which is considered a register tone language, the realisation of tone is not solely determined by pitch levels, but also by inter-syllable and intra-syllable pitch dynamics. Furthermore, our experimental results indicate that utterance-wide pitch patterns are not only a result of cumulative local pitch changes (terracing), but also contain a significant gradual declination component. Lastly, models based on inter- and intra-syllable pitch dynamics using underlying linear pitch targets are shown to be relatively efficient and perceptually preferable to the current standard approach in statistical parametric speech synthesis employing HMM pitch models based on context-dependent phones. These findings support the applicability of the proposed models in under-resourced conditions.	en_US
dc.language.iso	en	en_US
dc.publisher	North-West University	en_US
dc.subject	Speech synthesis	en_US
dc.subject	Text-to-speech	en_US
dc.subject	Intonation model	en_US
dc.subject	Target approximation	en_US
dc.subject	Tone language	en_US
dc.subject	Yorùbá	en_US
dc.subject	Under-resourced languages	en_US
dc.title	Tone realisation for speech synthesis of Yorùbá	en
dc.type	Thesis	en_US
dc.description.thesistype	Doctoral	en_US
dc.contributor.researchID	21021287 - Barnard, Etienne (Supervisor)

Files in this item

Name: Van Niekerk_DR.pdf
Size: 3.480Mb
Format: PDF

This item appears in the following Collection(s)

Natural and Agricultural Sciences [2777]