
    Medium-vocabulary speech recognition for under-resourced languages

    Date
    2012
    Author
    Van Heerden, Charl J.
    Barnard, Etienne
    Davel, Marelie H.
    Abstract
    We report on the development of speech-recognition systems that are able to perform accurate recognition on medium-vocabulary tasks (i.e. tasks that require distinctions between approximately 200 different terms). We are able to achieve error rates of less than 5% (our design goal) on four under-resourced languages as well as English, by using training corpora that contain 70–100 hours of speech per language. The majority of the errors stem from words such as abbreviations, foreign words or names, which do not adhere to the standard orthography of the target language. We also find that recognition accuracy does not depend strongly on the number of occurrences of a term in the training set or the length of the term to be recognized, and that a few problematic speakers are responsible for a disproportionate number of errors.
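
The two quantities the abstract refers to, an overall error rate and the concentration of errors among a few speakers, could be computed along the following lines. This is only an illustrative sketch, not the authors' evaluation code: the record layout `(speaker_id, reference_term, recognized_term)`, the function names, and the toy data are all assumptions made for the example.

```python
from collections import Counter


def error_rate(records):
    """Fraction of utterances whose recognized term differs from the reference."""
    if not records:
        return 0.0
    errors = sum(1 for _, ref, hyp in records if ref != hyp)
    return errors / len(records)


def error_share_of_worst_speakers(records, n):
    """Share of all errors produced by the n speakers with the most errors."""
    per_speaker = Counter(spk for spk, ref, hyp in records if ref != hyp)
    total = sum(per_speaker.values())
    if total == 0:
        return 0.0
    worst = per_speaker.most_common(n)
    return sum(count for _, count in worst) / total


# Hypothetical toy data: (speaker_id, reference_term, recognized_term).
records = [
    ("s1", "bank", "bank"),
    ("s1", "clinic", "clinic"),
    ("s1", "school", "pool"),      # error
    ("s2", "bank", "bank"),
    ("s2", "clinic", "clinic"),
    ("s2", "school", "school"),
    ("s3", "bank", "tank"),        # error
    ("s3", "clinic", "picnic"),    # error
    ("s3", "school", "school"),
    ("s3", "market", "market"),
]

overall = error_rate(records)
top1_share = error_share_of_worst_speakers(records, 1)
```

In this toy set, 3 of 10 utterances are misrecognized, and the single worst speaker accounts for 2 of those 3 errors. A design goal such as the 5% threshold mentioned above would be checked by comparing `error_rate(records)` against `0.05` on the real evaluation data.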
    URI
    http://hdl.handle.net/10394/13632
    http://www.mica.edu.vn/sltu2012/files/proceedings/26.pdf
    Collections
    • Conference Papers - Vaal Triangle Campus [84]
    • Faculty of Engineering [1107]
    • Faculty of Natural and Agricultural Sciences [4782]

    Copyright © North-West University