Trajectory behaviour at different phonemic context sizes
Abstract
We propose a piecewise-linear model for the temporal trajectories
of Mel Frequency Cepstral Coefficients during phone transitions.
As with conventional Hidden Markov Models, the parameters of the
model can be estimated for different phonemic context sizes, but our
model allows for an intuitive understanding of the impact of context size.
We find that the most detailed models, predictably, match the coefficient
tracks best – but when data scarcity forces us to use less detailed models,
different styles of context modelling (clustered triphones versus biphones)
have complementary behaviours. We discuss how this complementarity
may be useful for data-efficient automatic speech recognition (ASR).
URI
https://researchspace.csir.co.za/dspace/bitstream/handle/10204/5600/Badenhorst_2011.pdf?sequence=1&isAllowed=y
http://www.prasa.org/proceedings/2011/prasa2011-01.pdf
http://hdl.handle.net/10394/26536
Collections
- Faculty of Engineering