Optimized dynamic programming search for automatic speech recognition on a Graphics Processing Unit (GPU) platform using Compute Unified Device Architecture (CUDA)
Abstract
In a typical recognition process, there are substantial parallelization challenges in concurrently
assessing thousands of alternative interpretations of a speech utterance to find the most probable
one. During this process, uttered words are converted into fragments. Decoding these fragments
to produce relevant output is a computationally expensive task. Optimizing the Viterbi search
requires a certain level of parallelism, since the search is itself a parallel process. We find that a better way
to optimize speech recognition search is to use parallel architectures such as graphics
processing units (GPUs). GPUs provide large computational power at very low cost, which
positions them as viable global accelerators. We implemented the speech recognition Viterbi
search algorithm on CPU-based and GeForce 8800 GTX GPU-based systems in three implementations.
The first was the original Viterbi search algorithm on a CPU-based system. The second applied
loop unrolling to the original algorithm on the CPU-based system, and the third applied loop
unrolling on the GPU-based system. The GPU-optimized implementation achieved a 30x
speedup over the original CPU implementation and an 8x speedup over the CPU implementation
with loop unrolling, whereas the CPU implementation with
loop unrolling achieved a 4x speedup over the original CPU implementation of the Viterbi search
algorithm. The results from our GPU-optimized implementation have had a positive impact on
overall speech recognition accuracy, thereby contributing to the field of automatic speech
recognition.
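For readers unfamiliar with the search being accelerated, the following is a minimal sketch of a textbook Viterbi search over a small hidden Markov model, written in Python for readability. It is not the implementation from this thesis: the function name `viterbi`, its parameters, and the log-probability table layout are assumptions made for illustration only. The comment on the loop over destination states marks where the per-frame work is independent, which is the kind of parallelism a GPU could exploit.

```python
import math

NEG_INF = -math.inf


def viterbi(log_init, log_trans, log_emit, observations):
    """Return (best_log_prob, best_state_path) for a small HMM.

    Illustrative layout (an assumption, not taken from the thesis):
      log_init[s]     : log P(start in state s)
      log_trans[p][s] : log P(next state s | previous state p)
      log_emit[s][o]  : log P(observation o | state s)
    """
    n_states = len(log_init)

    # Scores for the first frame.
    prev = [log_init[s] + log_emit[s][observations[0]] for s in range(n_states)]
    backptrs = []

    for obs in observations[1:]:
        cur = [NEG_INF] * n_states
        ptr = [0] * n_states
        # Each destination state s only reads `prev`, so the iterations of
        # this loop do not depend on one another within a frame.
        for s in range(n_states):
            best_p, best_score = 0, NEG_INF
            for p in range(n_states):
                score = prev[p] + log_trans[p][s]
                if score > best_score:
                    best_p, best_score = p, score
            cur[s] = best_score + log_emit[s][obs]
            ptr[s] = best_p
        backptrs.append(ptr)
        prev = cur

    # Trace the best path backwards from the best final state.
    last = max(range(n_states), key=lambda s: prev[s])
    path = [last]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return prev[last], path
```

In this sketch the frames must still be processed in order, because each frame depends on the scores of the previous one; only the work inside a frame is independent.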
Description
MSc (Computer Science), North-West University, Mafikeng Campus, 2014