``Automatic segmentation refers to the process whereby segment boundaries are assigned automatically by a program. This will probably be an HMM-based speech recognizer that has been given the correct symbol string as input. The output boundaries may not be entirely accurate, especially if the training data was sparse. Semi-automatic segmentation refers to the process whereby this automatic segmentation is followed by manual checking and editing of the segment boundaries. At the time of writing there are only a few fully automatic methods known that yield usable results. These are
This form of segmenting is motivated by the need to segment very large databases for the purpose of training ever more comprehensive recognizers. Manual segmentation is extremely costly in time and effort, and automatic segmentation, if sufficiently accurate, could provide a shortcut. However, it is still necessary for the researcher to derive the correct symbol string to input to the autosegmenter. This may be derived automatically from an orthographic transcription, in which case it will not always correspond to the particular utterance unless manually checked and edited. The amount of inaccuracy that is acceptable will depend on the uses to which the database is to be put, and its overall size.'' (From [2], p. 153.)
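The idea of giving a recognizer the correct symbol string and letting it place the boundaries can be illustrated with a minimal sketch of Viterbi forced alignment. This is not the method of any particular recognizer: it assumes that per-frame log-likelihoods for each symbol are already available (in a real system they would come from trained HMM states), that each symbol occupies at least one frame, and that the function name `forced_align` and its input layout are hypothetical choices made for illustration only.

```python
import math


def forced_align(log_likes, symbols):
    """Align a known symbol string to frames (illustrative sketch).

    log_likes: list with one dict per frame, mapping symbol -> log-likelihood
               (assumed input; a real system would compute this from HMMs).
    symbols:   the symbol string known to be correct for the utterance.
    Returns a list of (symbol, start_frame, end_frame_exclusive).
    """
    n_frames, n_syms = len(log_likes), len(symbols)
    if n_syms == 0 or n_frames < n_syms:
        raise ValueError("need at least one frame per symbol")

    neg_inf = -math.inf
    # score[t][j]: best log score for frames 0..t with frame t in symbol j
    score = [[neg_inf] * n_syms for _ in range(n_frames)]
    # advanced[t][j]: True if the best path entered symbol j at frame t
    advanced = [[False] * n_syms for _ in range(n_frames)]
    score[0][0] = log_likes[0][symbols[0]]

    for t in range(1, n_frames):
        for j in range(n_syms):
            stay = score[t - 1][j]
            move = score[t - 1][j - 1] if j > 0 else neg_inf
            best = max(stay, move)
            if best == neg_inf:
                continue  # symbol j cannot be reached by frame t
            advanced[t][j] = move > stay
            score[t][j] = best + log_likes[t][symbols[j]]

    # Trace back from the last frame of the last symbol to find boundaries.
    starts = [0] * n_syms
    j = n_syms - 1
    for t in range(n_frames - 1, 0, -1):
        if advanced[t][j]:
            starts[j] = t
            j -= 1

    ends = starts[1:] + [n_frames]
    return list(zip(symbols, starts, ends))


# Toy input: three frames that favour "a", then two that favour "b".
frames = [{"a": -0.1, "b": -3.0}] * 3 + [{"a": -3.0, "b": -0.1}] * 2
print(forced_align(frames, ["a", "b"]))
```

With this toy input the boundary between the two symbols falls at frame 3. If the symbol string passed in does not match what was actually said, the procedure still returns boundaries, which is one reason manual checking of the input string and of the output remains useful.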