Defining a hierarchy of models: we can compile HMMs at different levels (e.g. phone models into word models, word models into sentence models) to create models of whole utterances
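As an illustration only, here is a minimal sketch of what "compiling" might look like: concatenating left-to-right phone HMMs into a single word HMM by stacking their transition matrices block-diagonally and linking each model's exit into the next model's entry. The phone names, topologies, and numbers are hypothetical, not from the module.

```python
import numpy as np

def concat_hmms(hmms):
    """Compile a list of left-to-right phone HMMs (A, means) into one
    word HMM: transition matrices go on a block diagonal, and each
    model's exit probability is routed into the next model's entry state."""
    n = sum(A.shape[0] for A, _ in hmms)
    A = np.zeros((n, n))
    means = []
    offset = 0
    for i, (A_i, mu_i) in enumerate(hmms):
        k = A_i.shape[0]
        A[offset:offset + k, offset:offset + k] = A_i
        means.extend(mu_i)
        if i + 1 < len(hmms):
            # exit mass of the last state (1 - self-loop) enters the next phone
            A[offset + k - 1, offset + k] = 1.0 - A_i[-1, -1]
        offset += k
    return A, means

# hypothetical 2-state phone models for /k/ and /ae/ with 1-D output means
kk = (np.array([[0.6, 0.4], [0.0, 0.7]]), [0.1, 0.3])
ae = (np.array([[0.5, 0.5], [0.0, 0.8]]), [0.9, 1.2])
word_A, word_means = concat_hmms([kk, ae])   # 4-state word model
```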
We can prune: discard unlikely tokens as decoding proceeds to reduce computational cost (a heuristic such as a beam width can help decide what to drop)
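A sketch of beam pruning in a token-passing pass, under the assumption that tokens are kept as a state-to-log-score map at each frame (the scores below are made up):

```python
def beam_prune(tokens, beam=10.0):
    """Drop tokens whose log score falls more than `beam` below the
    best token at this frame; `tokens` maps state -> log probability."""
    best = max(tokens.values())
    return {s: lp for s, lp in tokens.items() if lp >= best - beam}

# one decoding frame: hopeless tokens are removed before the next frame
tokens = {0: -3.2, 1: -5.1, 2: -20.7, 3: -4.0}
print(beam_prune(tokens))   # state 2 is pruned
```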
We use the Markov property of HMMs (i.e. their conditional independence assumptions) to make computing the probability of an observation sequence tractable: each step depends only on the previous state, not on the whole history
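A sketch of how the Markov assumption pays off: the forward algorithm computes P(O | model) in O(N²T) time by reusing per-state partial sums instead of summing over all Nᵀ state sequences. The toy A, B, π values are invented for illustration.

```python
import numpy as np

def forward(A, B, pi, obs):
    """Forward algorithm: alpha[t, j] = P(o_1..o_t, q_t = j).
    The Markov property lets alpha[t] depend only on alpha[t-1]."""
    T, N = len(obs), A.shape[0]
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    return alpha[-1].sum()   # P(O | model)

# toy 2-state model with 2 discrete observation symbols
A  = np.array([[0.7, 0.3], [0.4, 0.6]])
B  = np.array([[0.9, 0.1], [0.2, 0.8]])
pi = np.array([0.6, 0.4])
print(forward(A, B, pi, [0, 1, 0]))
```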
HMM training using the Baum-Welch algorithm: this gives a very high-level overview of forward and backward probability calculation on HMMs, and of Expectation-Maximization as a way to optimise model parameters. The maths is in the readings (but not examinable).
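A compact sketch of one Baum-Welch EM iteration for a discrete-output HMM, following the standard textbook recursions rather than any specific implementation from the module: the E-step uses forward and backward probabilities to get state occupancies (γ) and transition counts (ξ), and the M-step re-estimates the parameters from these expected counts.

```python
import numpy as np

def baum_welch_step(A, B, pi, obs):
    """One EM iteration: E-step via forward/backward probabilities,
    M-step via expected-count re-estimation (discrete outputs)."""
    T, N = len(obs), A.shape[0]
    alpha = np.zeros((T, N)); beta = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):                       # forward pass
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):              # backward pass
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    p_obs = alpha[-1].sum()                     # P(O | model)
    gamma = alpha * beta / p_obs                # P(q_t = i | O)
    xi = np.zeros((T - 1, N, N))                # P(q_t = i, q_{t+1} = j | O)
    for t in range(T - 1):
        xi[t] = alpha[t][:, None] * A * B[:, obs[t + 1]] * beta[t + 1] / p_obs
    # M-step: re-estimate parameters from expected counts
    new_pi = gamma[0]
    new_A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    new_B = np.zeros_like(B)
    for k in range(B.shape[1]):
        new_B[:, k] = gamma[np.array(obs) == k].sum(axis=0) / gamma.sum(axis=0)
    return new_A, new_B, new_pi
```

Iterating this step is guaranteed not to decrease P(O | model), which is the EM property the notes allude to.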
Origin: Module 10 – Speech Recognition – Connected speech & HMM training
Translated and edited by: YangSier (Homepage)