dc.contributor.author | Davel, Marelie H. | |
dc.contributor.author | van Heerden, Charl | |
dc.contributor.author | Barnard, Etienne | |
dc.date.accessioned | 2018-03-05T12:03:07Z | |
dc.date.available | 2018-03-05T12:03:07Z | |
dc.date.issued | 2013 | |
dc.identifier.citation | Marelie Davel, Charl van Heerden, and Etienne Barnard. “G2P variant prediction techniques for ASR and STD”, in Proc. Interspeech, pp. 1831-1835, Lyon, France, 2013. [http://engineering.nwu.ac.za/multilingual-speech-technologies-must/publications] | en_US |
dc.identifier.uri | http://www.isca-speech.org/archive/archive_papers/interspeech_2013/i13_1831.pdf | |
dc.identifier.uri | http://hdl.handle.net/10394/26503 | |
dc.description.abstract | Introducing pronunciation variants into a lexicon is a balancing
act: incorporating necessary variants can improve automatic
speech recognition (ASR) and spoken term detection (STD)
performance by capturing some of the variability that occurs
naturally; introducing superfluous variants can lead to increased
confusability and a decrease in performance. We experiment
with two very different grapheme-to-phoneme variant prediction
techniques and analyze the variants generated, as well as
their effect when used within fairly standard ASR and STD systems
with unweighted lexicons. Specifically, we compare the
variants generated by joint sequence models, which use probabilistic
information to generate as many or as few variants as
required, with a more discrete approach: the use of pseudophonemes
within the default-and-refine algorithm. We evaluate
results using three of the 2013 Babel evaluation languages
with quite different variant characteristics – Tagalog, Pashto and
Turkish – and find that there are clear trends in how the number
and type of variants influence performance, and that the implications
for lexicon creation for ASR and STD are different.
Index Terms: pronunciation variants, speech recognition, spoken
term detection, grapheme-to-phoneme | en_US |
dc.description.sponsorship | This work was supported by the Intelligence Advanced Research
Projects Activity (IARPA) via Department of Defense
U.S. Army Research Laboratory contract number W911NF-12-C-0013.
The U.S. Government is authorized to reproduce and
distribute reprints for Governmental purposes notwithstanding
any copyright annotation thereon. Disclaimer: The views and
conclusions contained herein are those of the authors and should
not be interpreted as necessarily representing the official policies
or endorsements, either express or implied, of IARPA,
DoD/ARL, or the U.S. Government. | en_US |
dc.language.iso | en | en_US |
dc.publisher | Interspeech 2013 | en_US |
dc.subject | Standard ASR and STD systems | en_US |
dc.subject | Default-and-refine algorithm | en_US |
dc.subject | Babel evaluation languages | en_US |
dc.subject | Speech recognition | en_US |
dc.subject | G2P | en_US |
dc.title | G2P variant prediction techniques for ASR and STD | en_US |
dc.type | Presentation | en_US |