Optimized dynamic programming search for automatic speech recognition on a Graphics Processing Unit (GPU) platform using Compute Unified Device Architecture (CUDA)
Letswamotse, Babedi Betty
A typical recognition process must concurrently assess thousands of alternative interpretations of a speech utterance to find the most probable one, which poses substantial parallelization challenges. During this process, uttered words are converted into fragments. Decoding these fragments to produce relevant output is a computationally expensive task. Optimizing the Viterbi search requires a certain level of parallelism, since the search is itself a parallel process. We find that a better way to optimize speech recognition search is to use parallel architectures such as graphics processing units (GPUs). GPUs provide large computational power at very low cost, which positions them as viable global accelerators. We implemented the speech recognition Viterbi search algorithm in three implementations on CPU-based and GeForce 8800 GTX GPU-based systems. The first is the original Viterbi search algorithm on a CPU-based system. The original algorithm with loop unrolling applied was then implemented both on the CPU-based system and on the GPU-based system. The GPU-optimized implementation achieved a 30x speedup over the original CPU implementation and an 8x speedup over the loop-unrolled CPU implementation, while the loop-unrolled CPU implementation achieved a 4x speedup over the original CPU implementation of the Viterbi search algorithm. The results of our GPU-optimized implementation have had a positive impact on overall speech recognition accuracy, thereby contributing to the field of automatic speech recognition.
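To make the parallel structure of the search concrete, the following is a minimal sketch of the textbook Viterbi recursion in Python. It is not the implementation described in this work: the function name `viterbi`, the two-state model, and all probabilities are illustrative assumptions. The point it shows is that, within one time step, the update for each destination state depends only on the previous step's scores, so those updates are independent of one another and could in principle be assigned to separate GPU threads.

```python
import math


def viterbi(log_init, log_trans, log_emit, observations):
    """Return (best_log_prob, best_state_path) for a discrete HMM.

    Hypothetical helper for illustration only. All inputs are
    log-probabilities, indexed as log_trans[src][dst] and
    log_emit[state][symbol].
    """
    n_states = len(log_init)
    # scores[s] = best log-probability of any path that ends in state s.
    scores = [log_init[s] + log_emit[s][observations[0]] for s in range(n_states)]
    backpointers = []

    for obs in observations[1:]:
        new_scores = [0.0] * n_states
        pointers = [0] * n_states
        # Each destination state reads only the previous step's scores,
        # so the iterations of this loop are independent of each other.
        for dst in range(n_states):
            best_src = max(
                range(n_states),
                key=lambda src: scores[src] + log_trans[src][dst],
            )
            new_scores[dst] = (
                scores[best_src] + log_trans[best_src][dst] + log_emit[dst][obs]
            )
            pointers[dst] = best_src
        scores = new_scores
        backpointers.append(pointers)

    # Trace the best path backwards from the best final state.
    last = max(range(n_states), key=lambda s: scores[s])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return scores[last], path


def to_log(rows):
    """Convert a nested list of probabilities to log-probabilities."""
    return [[math.log(p) for p in row] for row in rows]


# Toy two-state model with three observation symbols (assumed values).
log_init = [math.log(0.6), math.log(0.4)]
log_trans = to_log([[0.7, 0.3], [0.4, 0.6]])
log_emit = to_log([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])

log_prob, path = viterbi(log_init, log_trans, log_emit, [0, 1, 2])
print(path, math.exp(log_prob))
```

For this toy model the best path is `[0, 0, 1]` with probability 0.01512. The sequential dependency lies only between time steps; the inner loop over destination states is the part that exposes parallelism.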
- Engineering