Towards lecture transcription in resource-scarce environments
dc.contributor.author | De Villiers, Pieter | |
dc.contributor.author | Jooste, Petri | |
dc.contributor.author | Van Heerden, Carel J. | |
dc.contributor.author | Barnard, Etienne | |
dc.contributor.researchID | 21281858 - De Villiers, Pieter Theunis | |
dc.contributor.researchID | 10080694 - Jooste, Josef Petrus | |
dc.contributor.researchID | 11539151 - Van Heerden, Carel Jacobus | |
dc.contributor.researchID | 21021287 - Barnard, Etienne | |
dc.date.accessioned | 2014-11-04T05:43:19Z | |
dc.date.available | 2014-11-04T05:43:19Z | |
dc.date.issued | 2012 | |
dc.description.abstract | We present progress towards automated Lecture Transcription (LT) in resource-scarce environments. Our development has focused on the transcription of lectures in Afrikaans from two faculties at North-West University. A bootstrapping procedure is followed to filter and select well-aligned segments of speech. These segments are then used to train acoustic models. Initial work towards language modeling for LT in a resource-scarce environment is also presented; manual lecture transcriptions are combined with text mined from other sources such as study guides to train language models. Interpolation results indicate that study guides are a useful resource for language modeling, whereas general text (obtained from a publisher of Afrikaans books) is less useful in this context. Our findings are confirmed by the reduced word error rates (WERs) obtained from our off-line speech-recognition system for Lecture Transcription. | en_US |
dc.description.uri | http://www.prasa.org/index.php/2012-03-07-10-55-15 | |
dc.identifier.citation | De Villiers, P.T. et al. 2012. Towards lecture transcription in resource-scarce environments. Proceedings of the Twenty-Third Annual Symposium of the Pattern Recognition Association of South Africa. Pretoria. p.138-143. [http://www.prasa.org/] | en_US |
dc.identifier.isbn | 978-0-620-54601-0 | |
dc.identifier.uri | http://hdl.handle.net/10394/12123 | |
dc.language.iso | en | en_US |
dc.publisher | Pattern Recognition Association of South Africa (PRASA) | en_US |
dc.subject | Lecture transcription | en_US |
dc.subject | Afrikaans | en_US |
dc.subject | Kaldi | en_US |
dc.subject | Dynamic programming | en_US |
dc.subject | Language model | en_US |
dc.subject | Resource-scarce | en_US |
dc.title | Towards lecture transcription in resource-scarce environments | en_US |
dc.type | Article | en_US |