The VERBMOBIL 1 Speaker Database

Last update of this page: 2014-03-27

Current Information about the Verbmobil 1 Speaker Database (AufDat.txt)

The recording institutes in Bonn, Kiel, Karlsruhe and Munich have agreed on a common format for the recording protocol of the VERBMOBIL 1 corpus; it was first distributed with VM 3.0. All Verbmobil 1 recordings have since been included in this database.

Note that Verbmobil 2 uses a different format for metadata storage; refer to the Verbmobil 2 documentation.

The protocol file contains an ordered table. The columns are separated by TAB (ASCII 9). Each line contains the information about one dialog recording and is terminated by CR, CR/LF or LF (depending on your OS). If the item in a column is unknown, a '.' is entered to simplify processing with UNIX tools like AWK. German umlauts ('Umlaute') are written in LaTeX format. The entries are written in German.
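As a minimal sketch of the conventions just described, the following Python splits one protocol line at TABs, strips whichever line terminator is present, and maps the '.' placeholder to `None`. The function names are hypothetical. The umlaut decoding is an assumption: the page does not say which LaTeX spelling is used, so the sketch accepts both `"a` (german.sty shorthand) and `\"a`.

```python
import re

UNKNOWN = "."  # placeholder the protocol file uses for unknown items

# ASSUMPTION: umlauts are spelled '"a' (german.sty shorthand) or '\"a'.
# The exact LaTeX notation is not specified in the description above.
_UMLAUTS = {"a": "ä", "o": "ö", "u": "ü",
            "A": "Ä", "O": "Ö", "U": "Ü", "s": "ß"}
_UMLAUT_RE = re.compile(r'\\?"([aouAOUs])')


def decode_umlauts(text):
    """Replace the assumed LaTeX umlaut notation with Unicode characters."""
    return _UMLAUT_RE.sub(lambda m: _UMLAUTS[m.group(1)], text)


def parse_line(line):
    """Split one protocol line into fields; unknown items become None."""
    # Strip CR, LF or CR/LF so the last column carries no terminator.
    fields = line.rstrip("\r\n").split("\t")
    return [None if f == UNKNOWN else decode_umlauts(f) for f in fields]
```

When the file is read in Python's default text mode, all three line terminators are already normalized, so the explicit `rstrip` mainly matters for lines obtained some other way.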

The lines are sorted by CD version, recording site and dialog number (in this order of precedence). The fields of each line appear in the following order:

  1. Recording site
  2. Date
  3. Number of channels
  4. Recording control (weich:, hart:)
  5. CD volume
  6. Dialog number
  7. Scenario
  8. Period of appointments
  9. Number of appointments
  10. Used calendar
  11. Channel A (this starts specific information about channel A)
  12. Close microphone A
  13. Room microphone A
  14. Environment A
  15. Speaker ID A
  16. Sex A
  17. Date of Birth A
  18. Place of education A
  19. Occupation A
  20. Channel B (this starts specific information about channel B)
  21. Close microphone B
  22. Room microphone B
  23. Environment B
  24. Speaker ID B
  25. Sex B
  26. Date of Birth B
  27. Place of education B
  28. Occupation B
This format allows easy exchange with database software and straightforward processing with UNIX tools.
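The 28-column layout listed above can be sketched in Python as a mapping from column position to a field name. The English identifiers in `FIELD_NAMES` and the helper functions are hypothetical and not part of the database. The sort key assumes that the "CD version" of the sort order corresponds to the "CD volume" column, and it compares values as plain strings, which may not match the original ordering if the values are numeric.

```python
# Hypothetical English names for the 28 columns, in file order.
FIELD_NAMES = [
    "recording_site", "date", "num_channels", "recording_control",
    "cd_volume", "dialog_number", "scenario", "appointment_period",
    "num_appointments", "calendar",
    "channel_a", "close_mic_a", "room_mic_a", "environment_a",
    "speaker_id_a", "sex_a", "birth_date_a", "education_place_a",
    "occupation_a",
    "channel_b", "close_mic_b", "room_mic_b", "environment_b",
    "speaker_id_b", "sex_b", "birth_date_b", "education_place_b",
    "occupation_b",
]


def to_record(line):
    """Turn one TAB-separated protocol line into a dict keyed by field name."""
    values = line.rstrip("\r\n").split("\t")
    if len(values) != len(FIELD_NAMES):
        raise ValueError(
            f"expected {len(FIELD_NAMES)} columns, got {len(values)}")
    # '.' marks an unknown item; represent it as None.
    return {name: (None if value == "." else value)
            for name, value in zip(FIELD_NAMES, values)}


def sort_key(record):
    """Key for CD volume, recording site, dialog number (string comparison)."""
    return (record["cd_volume"] or "",
            record["recording_site"] or "",
            record["dialog_number"] or "")


def speaker_ids(records):
    """Collect the distinct known speaker IDs from channels A and B."""
    ids = set()
    for record in records:
        for key in ("speaker_id_a", "speaker_id_b"):
            if record[key] is not None:
                ids.add(record[key])
    return ids
```

A typical use would be `records = sorted((to_record(l) for l in lines), key=sort_key)`, where `lines` holds the lines read from AufDat.txt.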

AufDat.txt

Online Extract: Speaker Numbers

Online Extract: Appointments


Florian Schiel