NWU Institutional Repository

Domain adaptation for speaker diarisation in low-resource environments

Publisher

North-West University (South Africa).

Abstract

Speaker diarisation systems aim to answer the question "who spoke when?" and provide valuable metadata to downstream applications, such as automatic speech recognition systems. However, speaker diarisation systems, like most applications in the field of speech recognition, are especially challenged by domain-mismatch conditions. In this study, we investigate methods for adapting a pre-trained diarisation system to a new target domain when only a small in-domain corpus is available and full retraining is therefore not an option. We also develop a method for fine-tuning the adaptation process of a pre-trained speaker diarisation system using cluster analysis. Our domain adaptation process focuses on retraining and adapting to the target domain the statistical components of a speaker diarisation pipeline, which are inherently domain specific. Lastly, we demonstrate this domain adaptation process in a real-world scenario by adapting a pre-trained diarisation system using a small in-domain dataset consisting of telephonic speech from South African call centres. We show that the adapted system can provide metadata that improves the performance of automatic speech recognition systems through speaker-specific adaptations.

Description

MEng (Computer Engineering), North-West University, Potchefstroom Campus
