G2P variant prediction techniques for ASR and STD

Davel, Marelie H.; van Heerden, Charl; Barnard, Etienne

dc.contributor.author	Davel, Marelie H.
dc.contributor.author	van Heerden, Charl
dc.contributor.author	Barnard, Etienne
dc.date.accessioned	2018-03-05T12:03:07Z
dc.date.available	2018-03-05T12:03:07Z
dc.date.issued	2013
dc.description.abstract	Introducing pronunciation variants into a lexicon is a balancing act: incorporating necessary variants can improve automatic speech recognition (ASR) and spoken term detection (STD) performance by capturing some of the variability that occurs naturally; introducing superfluous variants can lead to increased confusability and a decrease in performance. We experiment with two very different grapheme-to-phoneme variant prediction techniques and analyze the variants generated, as well as their effect when used within fairly standard ASR and STD systems with unweighted lexicons. Specifically, we compare the variants generated by joint sequence models, which use probabilistic information to generate as many or as few variants as required, with a more discrete approach: the use of pseudophonemes within the default-and-refine algorithm. We evaluate results using three of the 2013 Babel evaluation languages with quite different variant characteristics - Tagalog, Pashto and Turkish - and find that there are clear trends in how the number and type of variants influence performance, and that the implications for lexicon creation for ASR and STD are different. Index Terms: pronunciation variants, speech recognition, spoken term detection, grapheme-to-phoneme	en_US
dc.description.sponsorship	This work was supported by the Intelligence Advanced Research Projects Activity (IARPA) via Department of Defense U.S. Army Research Laboratory contract number W911NF-12-C-0013. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon. Disclaimer: The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either express or implied, of IARPA, DoD/ARL, or the U.S. Government.	en_US
dc.identifier.citation	Marelie Davel, Charl van Heerden, and Etienne Barnard. “G2P variant prediction techniques for ASR and STD,” in Proc. Interspeech, pp. 1831-1835, Lyon, France, 2013. [http://engineering.nwu.ac.za/multilingual-speech-technologies-must/publications]	en_US
dc.identifier.uri	http://www.isca-speech.org/archive/archive_papers/interspeech_2013/i13_1831.pdf
dc.identifier.uri	http://hdl.handle.net/10394/26503
dc.language.iso	en	en_US
dc.publisher	Interspeech 2013	en_US
dc.subject	Standard ASR and STD systems	en_US
dc.subject	Default-and-refine algorithm	en_US
dc.subject	Babel evaluation languages	en_US
dc.subject	Speech recognition	en_US
dc.subject	G2P	en_US
dc.title	G2P variant prediction techniques for ASR and STD	en_US
dc.type	Presentation	en_US

Files

Original bundle

Name: davel-2013-variants.pdf
Size: 65.2 KB
Format: Adobe Portable Document Format
Description: davel-2013-variants

License bundle

Name: license.txt
Size: 1.61 KB
Format: Item-specific license agreed upon to submission
Description:

Collections

Faculty of Engineering