• Login
    View Item 
    •   NWU-IR Home
    • Electronic Theses and Dissertations (ETDs)
    • Humanities
    • View Item
    •   NWU-IR Home
    • Electronic Theses and Dissertations (ETDs)
    • Humanities
    • View Item
    JavaScript is disabled for your browser. Some features of this site may not work without it.

    Benoemde–entiteitherkenning vir Afrikaans

    Thumbnail
    View/Open
    Matthew_GD.pdf (929.4Kb)
    Date
    2013
    Author
    Matthew, Gordon Derrac
    Metadata
    Show full item record
    Abstract
    According to the Constitution of South Africa, the government is required to make all the infor-mation in the ten indigenous languages of South Africa (excluding English), available to the public. For this reason, the government made the information, that already existed for these ten languages, available to the public and an effort is also been made to increase the amount of resources available in these languages (Groenewald & Du Plooy, 2010). This release of infor-mation further helps to implement Krauwer‟s (2003) idea that there is an inventory for the mini-mal number of language-related resources required for a language to be competitive at the level of research and teaching. This inventory is known as the "Basic Language Resource Kit" (BLARK). Since most of the languages in South Africa are resource scarce, it is of the best in-terest for the cultural growth of the country, that each of the indigenous South African languages develops their own BLARK. In Chapter 1, the need for the development of an implementable named entity recogniser (NER) for Afrikaans is discussed by first referring to the Constitution of South Africa’s (Republic of South Africa, 2003) language policy. Secondly, the guidelines of BLARK (Krauwer, 2003) are discussed, which is followed by a discussion of an audit that focuses on the number of re-sources and the distribution of human language technology for all eleven South African languages (Sharma Grover, Van Huyssteen & Pretorius, 2010). In respect of an audit conducted by Sharma Grover et al. (2010), it was established that there is a shortage of text-based tools for Afrikaans. This study focuses on this need for text-based tools, by focusing on the develop-ment of a NER for Afrikaans. In Chapter 2 a description is given on what an entity and a named entity is. Later in the chapter the process of technology recycling is explained, by referring to other studies where the idea of technology recycling has been applied successfully (Rayner et al., 1997). Lastly, an analysis is done on the differences that may occur between Afrikaans and Dutch named entities. These differences are divided into three categories, namely: identical cognates, non-identical cognates and unrelated entities. Chapter 3 begins with a description of Frog (van den Bosch et al, 2007), the Dutch NER used in this study, and the functions and operation of its NER-component. This is followed by a description of the Afrikaans-to-Dutch-converter (A2DC) (Van Huyssteen & Pilon, 2009) and finally the various experiments that were completed, are explained. The study consists of six experiments, the first of which was to determine the results of Frog on Dutch data. The second experiment evaluated the effectiveness of Frog on unchanged (raw) Afrikaans data. The following two experiments evaluated the results of Frog on “Dutched” Afrikaans data. The last two experiments evaluated the effectiveness of Frog on raw and “Dutched” Afrikaans data with the addition of gazetteers as part of the pre-processing step. In conclusion, a summary is given with regards to the comparisons between the NER for Afri-kaans that was developed in this study, and the NER-component that Puttkammer (2006) used in his tokeniser. Finally a few suggestions for future research are proposed.
    URI
    http://hdl.handle.net/10394/10170
    Collections
    • Humanities [2697]

    Copyright © North-West University
    Contact Us | Send Feedback
    Theme by 
    Atmire NV
     

     

    Browse

    All of NWU-IR Communities & CollectionsBy Issue DateAuthorsTitlesSubjectsAdvisor/SupervisorThesis TypeThis CollectionBy Issue DateAuthorsTitlesSubjectsAdvisor/SupervisorThesis Type

    My Account

    LoginRegister

    Copyright © North-West University
    Contact Us | Send Feedback
    Theme by 
    Atmire NV