Synthetic triphones from trajectory-based feature distributions
Abstract
We experiment with a new method to create
synthetic models of rare and unseen triphones in order to supplement
limited automatic speech recognition (ASR) training
data. A trajectory model is used to characterise seen transitions
at the spectral level, and these models are then used to create
features for unseen or rare triphones. We find that a fairly
restricted model (piece-wise linear with three line segments per
channel of a diphone transition) is able to represent training
data quite accurately. We report initial results from creating
additional triphones for a single-speaker data set, finding small
but significant gains, especially when adding further samples
of rare (rather than unseen) triphones.
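
As an illustration of the kind of restricted model described above, the following minimal sketch fits a continuous piece-wise linear curve with three line segments to a single feature channel over the duration of a transition. It is not the authors' implementation: the function names, the use of evenly spaced breakpoints, and the least-squares fitting criterion are all assumptions made for the example, since the abstract does not specify how the segments are placed or estimated.

```python
import numpy as np


def fit_piecewise_linear(t, y, n_segments=3):
    """Least-squares fit of a continuous piece-wise linear curve to (t, y).

    ASSUMPTION: knots are evenly spaced over the observed time span;
    the abstract does not say how breakpoints are chosen.
    Returns the knot positions and the fitted value at each knot.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    knots = np.linspace(t.min(), t.max(), n_segments + 1)
    # Column k holds the "hat" basis function that equals 1 at knot k,
    # 0 at every other knot, and is linear in between.
    basis = np.column_stack(
        [np.interp(t, knots, unit) for unit in np.eye(n_segments + 1)]
    )
    values, *_ = np.linalg.lstsq(basis, y, rcond=None)
    return knots, values


def evaluate_piecewise_linear(knots, values, t):
    """Evaluate the fitted trajectory at times t by linear interpolation."""
    return np.interp(np.asarray(t, dtype=float), knots, values)
```

In this sketch each channel of a transition would be fitted independently, giving four knot values per channel. Synthetic feature frames for a rare or unseen triphone could then be produced by evaluating such fitted trajectories at the desired frame times, although how the trajectories are combined into triphones is not detailed in the abstract.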
URI
http://ieeexplore.ieee.org/document/7359509/
https://researchspace.csir.co.za/dspace/handle/10204/8737
http://hdl.handle.net/10394/26487