

Tagging

Tagging refers to the markup of categorical classes on words or larger chunks of the speech signal. It does not require a direct relation to the physical time scale; instead, its labels (tags) usually refer to the transcript.

Examples:

The relation of the tags to the words or larger chunks may be expressed either by repeating the transcript in the tagging or by giving pointers (usually word numbers) to the transcript. The latter method has the advantage that typos or other errors in the transcript can be corrected without affecting the tagging, provided that the order of words in the transcript remains the same.
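
The difference between the two methods can be sketched in a few lines of Python. This is only an illustration: the variable names, the tag GREETING and the misspelled word are made up for this sketch and are not taken from any corpus or tool.

```python
# Transcript as an ordered word list; the list index serves as the word number.
# "mornign" is a deliberate typo (hypothetical example data).
transcript = ["good", "mornign", "have", "we", "met", "before"]

# Method 1 (assumed layout): the tagging repeats the transcript words.
tags_by_text = [(["good", "mornign"], "GREETING")]

# Method 2 (assumed layout): the tagging points to word numbers.
tags_by_pointer = [([0, 1], "GREETING")]

# Correct the typo in the transcript only; the taggings are left untouched.
transcript[1] = "morning"

# The repeated words no longer match the start of the corrected transcript ...
text_matches = [words == transcript[:len(words)] for words, _ in tags_by_text]

# ... while the pointers still resolve, now to the corrected words.
resolved = [([transcript[i] for i in numbers], tag)
            for numbers, tag in tags_by_pointer]
```

After the correction, the text-based tagging still carries the old spelling and would have to be edited as well, whereas the pointer-based tagging needs no change as long as the word order is kept.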

For example, in the BAS Partitur Format (BPF; footnote 8.10), the transcript and dialog act labeling of a dialog turn could be represented as follows:

ORT:  0  good
ORT:  1  morning
ORT:  2  have 
ORT:  3  we
ORT:  4  met
ORT:  5  before
DIA:  0,1  GREETING_AB
DIA:  2,3,4,5   QUERY_AB

As you can see, the transcript assigns a unique number to every word, which may then be used in different taggings (and segmentations) as a pointer to that word.
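
As a rough sketch of how such pointers might be resolved by a program, the following Python fragment reads the ORT and DIA lines of the example above and replaces each list of word numbers with the words it refers to. It is not a complete BPF reader: the function names parse_bpf and resolve are hypothetical, and the code assumes that every line has the shape "tier, reference, label" shown in the example.

```python
BPF_TEXT = """\
ORT:  0  good
ORT:  1  morning
ORT:  2  have
ORT:  3  we
ORT:  4  met
ORT:  5  before
DIA:  0,1  GREETING_AB
DIA:  2,3,4,5   QUERY_AB
"""


def parse_bpf(text):
    """Split the ORT and DIA lines of the example into words and tags."""
    words = {}  # word number -> orthographic form
    tags = []   # list of (word numbers, dialog act label)
    for line in text.splitlines():
        if not line.strip():
            continue  # skip empty lines
        # Assumed line shape: tier name, reference field, label.
        tier, ref, label = line.split(None, 2)
        if tier == "ORT:":
            words[int(ref)] = label.strip()
        elif tier == "DIA:":
            numbers = [int(n) for n in ref.split(",")]
            tags.append((numbers, label.strip()))
    return words, tags


def resolve(words, tags):
    """Replace each list of word numbers with the words it points to."""
    return [(" ".join(words[n] for n in numbers), label)
            for numbers, label in tags]


words, tags = parse_bpf(BPF_TEXT)
resolved = resolve(words, tags)
```

Because the DIA lines store only word numbers, the ORT entries can be corrected independently; resolve then picks up the corrected forms without any change to the tagging.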

Taggings are produced manually or automatically. In the case of manual tagging, the same measures have to be taken as for complex transcripts to ensure consistent and reproducible results (see section [*]).


BITS Projekt-Account 2004-06-01