The terminology used by TransitionModel is as follows:
phone (one-based): this type of identifier is used throughout the toolkit; it can be converted to a phone name via an OpenFst symbol table. Not necessarily contiguous (the toolkit allows "skips" in the phone indices).
hmm-state (zero-based): this is an index into something of type HmmTopology::TopologyEntry. In the normal case, it is one of {0, 1, 2}.
pdf, or pdf-id (zero-based): this is the index of the p.d.f., as originally allocated by the decision-tree clustering (see PDF identifiers). There would normally be several thousand pdf-ids in a system.
PDF stands for probability density function, for example one represented by a GMM. In a monophone system, pdfs correspond one-to-one to the states of each phone. In a triphone system, different states can share the same pdf, and the decision tree performs its clustering at the level of pdfs.
transition-state, or trans_state (one-based): this is an index that is defined by the TransitionModel itself. Each possible triple of (phone, hmm-state, pdf) maps to a unique transition-state. Think of it as the finest granularity of HMM-state for which transitions are separately estimated.
transition-index, or trans_index (zero-based): this is an index into the "transitions" array of type HmmTopology::HmmState. It numbers the transitions out of a particular transition-state.
transition-id, or trans_id (one-based): each of these corresponds to a unique transition probability in the transition model. There is a mapping from (transition-state, transition-index) to transition-id, and vice versa.
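The relationship between these index spaces can be illustrated with a small sketch. This is a toy model, not Kaldi's code: the topology, the list of triples, and all names (NUM_TRANSITIONS, triples, state2id, id2state, pair_to_transition_id, transition_id_to_pair) are assumptions made up for illustration. It only shows how a one-based transition-state and a zero-based transition-index can be combined into a one-based transition-id, and mapped back.

```python
# Toy sketch of the TransitionModel index spaces (not Kaldi's actual code).

# Hypothetical topology: every hmm-state (zero-based) of a 3-state
# left-to-right HMM has 2 outgoing transitions (self-loop and forward).
NUM_TRANSITIONS = {0: 2, 1: 2, 2: 2}

# Hypothetical (phone, hmm-state, pdf) triples. Phones are one-based;
# hmm-states and pdf-ids are zero-based.
triples = sorted([
    (1, 0, 0), (1, 1, 1), (1, 2, 2),
    (2, 0, 3), (2, 1, 1), (2, 2, 4),  # phone 2 shares pdf 1 with phone 1
])

# transition-state is one-based: position in the sorted list plus one.
triple_to_trans_state = {t: i + 1 for i, t in enumerate(triples)}

# state2id[trans_state] is the first transition-id of that transition-state.
# Index 0 is unused in both lists because both indices are one-based.
state2id = [0]
id2state = [0]
next_id = 1
for trans_state, (phone, hmm_state, pdf) in enumerate(triples, start=1):
    state2id.append(next_id)
    for _ in range(NUM_TRANSITIONS[hmm_state]):
        id2state.append(trans_state)
        next_id += 1


def pair_to_transition_id(trans_state, trans_index):
    """Map (transition-state, transition-index) to a transition-id."""
    return state2id[trans_state] + trans_index


def transition_id_to_pair(trans_id):
    """Map a transition-id back to (transition-state, transition-index)."""
    trans_state = id2state[trans_id]
    return trans_state, trans_id - state2id[trans_state]
```

In this sketch the two mappings are inverses of each other, which is the property the definition of transition-id relies on. Note that the two triples sharing pdf 1 still get distinct transition-states, because the phone differs.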
Alignments in Kaldi: By "alignment", we generally mean something of type vector<int32>, which contains a sequence of transition-ids. Because transition-ids encode the phone information, it is possible to work out the phonetic sequence from an alignment.
References:
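As a rough illustration of why the phone sequence is recoverable, the sketch below decodes a toy alignment. Everything here is assumed for the example: the lookup table trans_id_info, the two-phone inventory, and a topology with three emitting states per phone in which transition-index 0 is the self-loop and 1 moves forward. Under those assumptions, a phone ends on the frame that takes the forward transition out of its last hmm-state, so repeated phones are still counted separately.

```python
# Toy sketch: recover the phone sequence from an alignment, i.e. from a
# list of transition-ids. This is not Kaldi's implementation.

# Hypothetical table: transition-id -> (phone, hmm-state, transition-index).
# Assumed topology: 3 emitting states per phone; transition-index 0 is the
# self-loop and transition-index 1 moves to the next state.
PHONES = [1, 2]
LAST_HMM_STATE = 2

trans_id_info = {}
next_id = 1  # transition-ids are one-based
for phone in PHONES:
    for hmm_state in range(3):
        for trans_index in range(2):
            trans_id_info[next_id] = (phone, hmm_state, trans_index)
            next_id += 1


def alignment_to_phones(alignment):
    """Emit a phone each time the forward transition leaves its last state."""
    phones = []
    for trans_id in alignment:
        phone, hmm_state, trans_index = trans_id_info[trans_id]
        if hmm_state == LAST_HMM_STATE and trans_index == 1:
            phones.append(phone)
    return phones


# One transition-id per frame.
alignment = [1, 2, 3, 4, 5, 5, 6,  # phone 1, with some self-loops
             8, 10, 12,            # phone 2, one frame per state
             2, 4, 6]              # phone 1 again
phone_sequence = alignment_to_phones(alignment)
```

Simply collapsing runs of identical phones would merge two consecutive occurrences of the same phone, which is why the sketch looks at the transition out of the last state instead.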
【1】 http://kaldi-asr.org/doc/hmm.html