dc.contributor.author | Barnard, Etienne | |
dc.contributor.author | Van Heerden, Charl | |
dc.contributor.author | Hartmann, William | |
dc.contributor.author | Karakos, Damianos | |
dc.contributor.author | Schwartz, Richard | |
dc.contributor.author | Tsakalidis, Stavros | |
dc.contributor.author | Davel, Marelie H. | |
dc.date.accessioned | 2018-03-02T12:50:32Z | |
dc.date.available | 2018-03-02T12:50:32Z | |
dc.date.issued | 2015 | |
dc.identifier.citation | Marelie Davel, Damianos Karakos, Etienne Barnard, Charl van Heerden, Richard Schwartz, Stavros Tsakalidis and William Hartmann, “Exploring minimal pronunciation modeling for low resource languages”, in Proc. Interspeech, pp. 538-542, Dresden, Germany, 2015. [http://engineering.nwu.ac.za/multilingual-speech-technologies-must/publications] | en_US |
dc.identifier.isbn | 978-1-61499-700-9 | |
dc.identifier.uri | https://books.google.co.za/books?id=-RGhDQAAQBAJ&pg=PA44&lpg=PA44&dq=Exploring+minimal+pronunciation+modeling+for+low+resource+languages&source=bl&ots=wAYDYAm_Ju&sig=ha5BMCtwoEBjHQTAkyauz2wSSEc&hl=en&sa=X&ved=0ahUKEwjFwPDv1M3ZAhUlKsAKHXrICPkQ6AEIODAC#v=onepage&q=Exploring%20minimal%20pronunciation%20modeling%20for%20low%20resource%20languages&f=false | |
dc.identifier.uri | https://www.lti.cs.cmu.edu/sites/default/files/sitaram%2C%20sunayana.pdf | |
dc.identifier.uri | http://hdl.handle.net/10394/26488 | |
dc.description.abstract | Pronunciation lexicons can range from fully graphemic (modeling each word using the orthography directly) to fully phonemic (first mapping each word to a phoneme string). Between these two options lies a continuum of modeling options. We analyze techniques that can improve the accuracy of a graphemic system without requiring significant effort to design or implement. The analysis is performed in the context of the IARPA Babel project, which aims to develop spoken term detection systems for previously unseen languages rapidly, and with minimal human effort. We consider techniques related to letter-to-sound mapping and language-independent syllabification of primarily graphemic systems, and discuss results obtained for six languages: Cebuano, Kazakh, Kurmanji Kurdish, Lithuanian, Telugu and Tok Pisin. | en_US |
dc.description.sponsorship | This work was supported by the Intelligence Advanced Research Projects Activity (IARPA) via Department of Defense U.S. Army Research Laboratory contract number W911NF-12-C-0013. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon. Disclaimer: The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either express or implied, of IARPA, DoD/ARL, or the U.S. Government. | en_US |
dc.language.iso | en | en_US |
dc.publisher | IOS Press Inc | en_US |
dc.subject | Spoken term detection | en_US |
dc.subject | Graphemic systems | en_US |
dc.subject | Pronunciation lexicons | en_US |
dc.title | Exploring minimal pronunciation modeling for low resource languages | en_US |
dc.type | Presentation | en_US |
dc.contributor.researchID | 23607955 - Davel, Marelie Hattingh | |
dc.contributor.researchID | 21021287 - Barnard, Etienne | |