Last update of this page: 2014-03-27
The recording institutes in Bonn, Kiel, Karlsruhe and Munich have set up a common format for the recording protocol used for the VERBMOBIL 1 corpus, which was disseminated for the first time with VM 3.0. Since then, all Verbmobil 1 recordings have been included in this database.
Note that Verbmobil 2 uses a different format for metadata storage; refer to the Verbmobil 2 documentation.
The protocol file contains an ordered table. The columns are separated by TAB (ASCII 9). Each line contains the information about one dialog recording and is terminated by CR, CR/LF or LF (depending on your OS). If the item in a column is unknown, a '.' is entered to simplify processing with UNIX tools like AWK. German 'Umlaute' are written in LaTeX format. The entries are written in German.
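As an illustration of the layout described above, the following minimal Python sketch splits protocol text into lines and TAB-separated columns and maps the '.' placeholder to None. The function name parse_protocol and the sample values are hypothetical and not part of the corpus distribution; LaTeX-style umlauts are left untouched, and no assumption is made about the number or meaning of the columns.

```python
import re


def parse_protocol(text):
    """Return one list of column values per dialog recording.

    Hypothetical helper, not part of the corpus tools.
    Unknown items (a single '.') are returned as None.
    """
    rows = []
    # Accept CR/LF, CR or LF as line terminator.
    for line in re.split(r"\r\n|\r|\n", text):
        if line == "":
            continue  # e.g. the empty piece after the final terminator
        # Columns are separated by TAB (ASCII 9).
        fields = line.split("\t")
        rows.append([None if field == "." else field for field in fields])
    return rows


if __name__ == "__main__":
    # Made-up sample with mixed line terminators, for demonstration only.
    sample = "a1\tb1\t.\r\na2\t.\tc2\ra3\tb3\tc3\n"
    for row in parse_protocol(sample):
        print(row)
```

Mapping '.' to None is a convenience for programs that distinguish missing values; with AWK-style tools the '.' can simply be kept, since it guarantees that every column is non-empty.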
The lines are sorted by CD version, recording site and dialog number (in this order of precedence). The order of the fields in one line is as follows: