
The Core Documentation

The documentation of the Recruitment describes the common profiles of the speakers as well as the recruiting method that was used. For example, it may be of interest whether the speakers were paid for their job or not. Were they paid more for a successful job? Were there any other sources of motivation? Legal aspects may also be listed here, e.g. the rights of usage of the data.

The documentation of the Recording and the Post-processing essentially repeats the corresponding part of the corpus specifications, with the slight but important difference that here the actual recording conditions should be described. If a Log File of the production exists, it should be included here. If possible, include pictures of the recording setup and recording sites. Draw diagrams to illustrate the exact positions of speakers and microphones.

The Annotation should be documented in great detail for each annotation layer used. Give not only the contents and file formats but also the exact procedures by which the annotations were produced. For manual annotations, a copy of the annotation guidelines must be included here. Indicate the education and training of the labelers, and describe the tools and their usage.

If you use any automatic procedures, insert a copy of the source code of your scripts or programs here, or give a proper reference to public domain software, and describe exactly how it was used. Describe the methods of quality control that were applied to the annotations; define the character set used in the annotation files as well as tag sets, phonetic alphabets, etc.
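
As an illustration of what a documented character set and tag set make possible, the following Python sketch checks a sequence of annotation tokens against both. The tag names, the allowed characters and the function name find_violations are hypothetical placeholders, not part of any particular corpus; they stand in for whatever the documentation actually defines.

```python
# Hypothetical tag set and character set; replace with the documented ones.
ALLOWED_TAGS = {"<noise>", "<laugh>", "<breath>"}
ALLOWED_CHARS = set("abcdefghijklmnopqrstuvwxyzäöüß' ")


def find_violations(tokens):
    """Return (index, token, reason) for each token that breaks the conventions."""
    violations = []
    for i, token in enumerate(tokens):
        if token.startswith("<"):
            # Tokens in angle brackets are treated as tags in this sketch.
            if token not in ALLOWED_TAGS:
                violations.append((i, token, "unknown tag"))
        elif not set(token) <= ALLOWED_CHARS:
            violations.append((i, token, "character outside documented set"))
    return violations


if __name__ == "__main__":
    for violation in find_violations(["guten", "<laugh>", "tag", "<cough>", "na!"]):
        print(violation)
```

A check of this kind can only be written by a third party if the character set and tag set are stated explicitly, which is one reason to define them in the documentation.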

If you are using XML in the annotation files, give pointers to the corresponding DTDs.
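
A pointer to a DTD is usually carried by the DOCTYPE declaration of the XML file itself. The following minimal Python sketch reads that declaration with the standard library module xml.dom.minidom. The sample document, its element names and the DTD location annotation.dtd are invented for illustration, and dtd_pointer is a hypothetical helper name.

```python
from xml.dom import minidom

# Hypothetical annotation file; element names and DTD location are made up.
SAMPLE = """<?xml version="1.0"?>
<!DOCTYPE annotation SYSTEM "annotation.dtd">
<annotation><segment start="0.00" end="1.25">guten tag</segment></annotation>
"""


def dtd_pointer(xml_text):
    """Return (root name, public id, system id) of the DOCTYPE, or None if absent."""
    doctype = minidom.parseString(xml_text).doctype
    if doctype is None:
        return None
    return (doctype.name, doctype.publicId, doctype.systemId)


if __name__ == "__main__":
    print(dtd_pointer(SAMPLE))
```

The sketch only reports where the DTD is said to be; it does not fetch the DTD or validate the file against it.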

The documentation of the Meta Data should contain a precise definition of each entry in the meta data files. Give complete lists of the codes you are using and comment on how the data were gathered. For instance, if an entry in the speaker profile files describes the dialectal variety of a language by naming the state or province of a speaker, you should mention here how this information was obtained: from an interview with the speaker (self-assessment), by asking for the place of elementary school, or from the judgment of one expert or a group of experts on the dialects of that language.
If you are using XML in the meta data files, give pointers to the corresponding DTDs.
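
One possible way to keep entry definitions, code lists and provenance together is a small machine-readable code book. The Python sketch below is an assumption-laden example: the entries sex and dialect, their codes, the provenance notes and the function check_profile are all hypothetical and only show the shape such a definition might take.

```python
# Hypothetical code book for speaker profile files: each entry has a
# definition, its permitted codes, and a note on how the value was gathered.
CODE_BOOK = {
    "sex": {
        "definition": "Sex of the speaker",
        "codes": {"f": "female", "m": "male"},
        "source": "questionnaire filled in by the speaker",
    },
    "dialect": {
        "definition": "State or province naming the dialectal variety",
        "codes": {"BY": "Bavaria", "HH": "Hamburg"},
        "source": "place of elementary school, as reported by the speaker",
    },
}


def check_profile(profile):
    """Return a list of problems found in one speaker profile (a dict)."""
    problems = []
    for key, value in profile.items():
        entry = CODE_BOOK.get(key)
        if entry is None:
            problems.append(f"undocumented entry: {key}")
        elif value not in entry["codes"]:
            problems.append(f"undocumented code for {key}: {value}")
    # Entries defined in the code book but absent from the profile.
    for key in CODE_BOOK:
        if key not in profile:
            problems.append(f"missing entry: {key}")
    return problems


if __name__ == "__main__":
    print(check_profile({"sex": "x", "age": "34"}))
```

Keeping the "source" note next to each entry records how the information was obtained alongside the definition it belongs to.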


BITS Projekt-Account 2004-06-01