
Simple List - The HTK Standard

In HTK a pronunciation dictionary contains one entry per line. The first column contains the word ID (usually a standard orthographic spelling), followed by an optional number giving the a-posteriori probability that this pronunciation variant occurs, given that the word has occurred. The remainder of the line lists the linguistic units (separated by white space) that code the corresponding pronunciation of the word entry. As the optional a-posteriori probability implies, an HTK dictionary may contain more than one entry per word ID. If no a-posteriori probabilities are given, these variants are considered equally probable; otherwise the given probabilities should sum to 1 over all entries belonging to the same word ID.

Note that this standard is based on an ASR system and defines neither the orthographic nor the phonemic coding scheme. It is therefore not sufficient to state in the documentation that the dictionary is in the HTK format; you have to document your coding schemes as well.

going   0.856   g @ U I N
going   0.144   g @ U I n
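
The line layout described above can be read with a few lines of code. The following Python sketch is only an illustration and is not part of HTK: the function names `parse_dictionary` and `variant_probabilities` and the tolerance value are assumptions made for this example. It follows the description in this section (word ID, optional probability, linguistic units) and assumes that no linguistic unit can itself be read as a number.

```python
from collections import defaultdict


def parse_dictionary(lines):
    """Parse lines of the form: WORD [PROBABILITY] UNIT UNIT ...

    Returns a dict mapping each word ID to a list of
    (probability_or_None, [units]) tuples, one per variant.
    """
    entries = defaultdict(list)
    for line in lines:
        fields = line.split()  # columns are separated by white space
        if not fields:
            continue  # skip blank lines
        word, rest = fields[0], fields[1:]
        prob = None
        if rest:
            try:
                # Assumption: a second column that parses as a number
                # is the optional a-posteriori probability.
                prob = float(rest[0])
                rest = rest[1:]
            except ValueError:
                pass  # second column is already a linguistic unit
        entries[word].append((prob, rest))
    return dict(entries)


def variant_probabilities(variants, tolerance=1e-3):
    """Return one probability per variant of a single word ID.

    Variants without probabilities are treated as equally probable;
    given probabilities must sum to 1 (within the assumed tolerance).
    """
    probs = [p for p, _ in variants]
    if all(p is None for p in probs):
        return [1.0 / len(variants)] * len(variants)
    if any(p is None for p in probs):
        raise ValueError("some variants lack a probability")
    if abs(sum(probs) - 1.0) > tolerance:
        raise ValueError("probabilities do not sum to 1")
    return probs


sample = [
    "going   0.856   g @ U I N",
    "going   0.144   g @ U I n",
]
lexicon = parse_dictionary(sample)
going_probs = variant_probabilities(lexicon["going"])
```

Because the format fixes neither the orthographic nor the phonemic coding scheme, the sketch treats the units as opaque strings; any check against a concrete symbol inventory would have to be added on the basis of the documented coding scheme.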


BITS Projekt-Account 2004-06-01