Machine learning and deep learning techniques for natural language processing with application to audio recordings

Motitswane, Olorato Glendah

Date

2023

Abstract

Many debt collection companies need to rely on research into data analysis methods that can help them analyse their unstructured data, which holds information that could help them assign their collection agents to accounts with a high probability of repayment. Such accounts are characterised by the debtor’s ability to repay, which depends on employment status among many other driving factors. Unfortunately, analysing unstructured data is extremely challenging because it comes in natural forms such as audio recordings, videos and images, to mention a few. The aim of this study was to identify data analysis methods that can accurately predict the employment status of the debtor from audio call recordings. The recordings were transcribed to text using Automatic Speech Recognition (ASR), the transcripts were cleaned, and the cleaned text was represented in numerical form using Term Frequency-Inverse Document Frequency (TF-IDF) and the Count Vectorizer. The study then compared the accuracy of Artificial Neural Network (ANN) and Naïve Bayes classifiers in predicting the employment status of the debtor. The performance of the ASR transcription was evaluated using the word error rate (WER), and the ANN and Naïve Bayes classifiers were compared using accuracy, recall and F1-score. The ASR method achieved an overall WER of 106.93. ANN with TF-IDF was identified as the best model for predicting employment status from transcribed audio recordings.
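
The WER figure can be read against the standard definition of the metric: the number of word-level substitutions, deletions and insertions needed to turn the ASR output into the reference transcript, divided by the number of words in the reference. The Python sketch below is illustrative only and is not the code used in the study; the function name `word_error_rate` and the example sentences are hypothetical. Because insertions are counted, the ratio can exceed 1, which is how a WER above 100 can arise if the value is expressed as a percentage (an assumption; the abstract gives no unit).

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts, not taken from the study's data.
perfect = word_error_rate("i am employed", "i am employed")
one_substitution = word_error_rate("i am employed", "i am unemployed")
many_insertions = word_error_rate("yes", "yes i am working now")
print(perfect, one_substitution, many_insertions)
```

In the last call the reference has one word and the output adds four, so the ratio is 4 insertions over 1 reference word. A value above 1 therefore signals that the transcription contains more word-level errors than the reference has words.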

URI

https://orcid.org/0000.0003.3905.1633
http://hdl.handle.net/10394/42346

Collections

Economic and Management Sciences [4593]