NWU Institutional Repository

Domain adaptation for speaker diarisation in low-resource environments

dc.contributor.advisorDavel, M.H.
dc.contributor.authorVan Wyk, Lucas
dc.contributor.researchID23607955 - Davel, Marelie Hattingh (Supervisor)
dc.date.accessioned2022-07-20T07:53:57Z
dc.date.available2022-07-20T07:53:57Z
dc.date.issued2022
dc.descriptionMEng (Computer Engineering), North-West University, Potchefstroom Campusen_US
dc.description.abstractSpeaker diarisation systems aim to answer the question "who spoke when?" and are useful in providing valuable metadata to downstream applications, such as automatic speech recognition systems. However, speaker diarisation systems, like most applications in the field of speech recognition, are especially challenged by domain-mismatch conditions. In this study, we investigate methods with which to adapt a pre-trained diarisation system to a new target domain when only a small in-domain corpus is available and retraining is therefore not an option. We also develop a method for fine-tuning the adaptation process of a pre-trained speaker diarisation system using cluster analysis. Our domain adaptation process focuses on retraining and adapting the statistical components in a speaker diarisation pipeline, which are inherently domain specific, to the target domain. Lastly, we demonstrate this domain adaptation process in a real-world scenario by adapting a pre-trained diarisation system using a small in-domain dataset consisting of telephonic speech from South African call centres. We show that the adapted system can be used to provide metadata which aids the performance of automatic speech recognition systems through speaker-specific adaptations.en_US
dc.description.thesistypeMastersen_US
dc.identifier.urihttps://orcid.org/0000-0001-8254-4850
dc.identifier.urihttp://hdl.handle.net/10394/39389
dc.language.isoenen_US
dc.publisherNorth-West University (South Africa).en_US
dc.subjectSpeaker diarisationen_US
dc.subjectAutomatic speech recognitionen_US
dc.subjectTime-delay neural networksen_US
dc.subjectCluster analysisen_US
dc.subjectDomain adaptationen_US
dc.subjectStatistical speaker modellingen_US
dc.subjectSpeaker embeddingen_US
dc.titleDomain adaptation for speaker diarisation in low-resource environmentsen_US
dc.typeThesisen_US

Files

Original bundle

Name:
Van Wyk L Final.pdf
Size:
2.5 MB
Format:
Adobe Portable Document Format

License bundle

Name:
license.txt
Size:
1.61 KB
Format:
Item-specific license agreed upon to submission