
    Speech data collection in an under-resourced language within a multilingual context

    Date
    2014
    Author
    Molapo, Raymond
    Barnard, Etienne
    de Wet, Febe
    Abstract
    In this paper, we present an end-to-end solution to the development of an automatic speech recognition (ASR) system in typical under-resourced languages, where the target language is likely to be influenced by one or more embedded foreign languages. We first describe the collection and processing of the text corpus crawled from the World Wide Web using the Rapid Language Adaptation Toolkit. In particular, we highlight the challenges faced when foreign languages are embedded within the matrix language. Thereafter, we discuss our speech data collection efforts in under-resourced environments. Finally, we report on a strategy called transliteration that helps to improve the recognition results of our grapheme-based ASR system in the presence of embedded-language words.
    URI
    http://hdl.handle.net/10394/17362
    Collections
    • Faculty of Engineering [1122]
    • Faculty of Natural and Agricultural Sciences [4818]

    Copyright © North-West University