
dc.contributor.advisor: Montshiwa, T.V.
dc.contributor.author: Motitswane, Olorato Glendah
dc.date.accessioned: 2023-11-23T07:33:24Z
dc.date.available: 2023-11-23T07:33:24Z
dc.date.issued: 2023
dc.identifier.uri: https://orcid.org/0000-0003-3905-1633
dc.identifier.uri: http://hdl.handle.net/10394/42346
dc.description: MCur (Statistics), North-West University, Mahikeng Campus [en_US]
dc.description.abstract: Many debt collection companies need research on data analysis methods that can help them analyse their unstructured data, which holds information that could help them better assign collection agents to accounts with a high probability of repayment. Such accounts are characterised by the debtor’s ability to repay, which depends on employment status among many other driving factors. Unfortunately, analysing unstructured data is extremely challenging because it comes in natural forms such as audio recordings, videos and images, to mention a few. The aim of this study was to identify data analysis methods that can accurately predict the employment status of the debtor from audio call recordings. The recordings were transcribed to text using Automatic Speech Recognition (ASR), the transcripts were cleaned, and the cleaned text was represented in numerical form using the Term Frequency-Inverse Document Frequency (TF-IDF) vectorizer and the Count Vectorizer. The study then compared the accuracy of Artificial Neural Network (ANN) and Naïve Bayes classifiers in predicting the employment status of the debtor. Word error rate (WER) was used to evaluate the performance of the ASR transcription method, and accuracy, recall and F1-score were used to compare ANN and Naïve Bayes. The ASR method achieved an overall WER of 106.93. ANN with TF-IDF was identified as the best model for predicting employment status from transcribed audio recordings. [en_US]
dc.language.iso: en [en_US]
dc.publisher: North-West University (South Africa) [en_US]
dc.subject: Natural Language Processing [en_US]
dc.subject: Automatic Speech Recognition [en_US]
dc.subject: Term Frequency-Inverse Document Frequency Vectorizer [en_US]
dc.subject: Count Vectorizer [en_US]
dc.subject: Data Augmentation [en_US]
dc.subject: Naïve Bayes [en_US]
dc.subject: Artificial Neural Network [en_US]
dc.title: Machine learning and deep learning techniques for natural language processing with application to audio recordings [en_US]
dc.type: Thesis [en_US]
dc.description.thesistype: Masters [en_US]
dc.contributor.researchID: 22297812 - Montshiwa, Volition Tlhalitshi (Supervisor)

