BAVARIAN ARCHIVE FOR SPEECH SIGNALS
University of Munich, Institute of Phonetics
Schellingstr. 3/II, 80799 Munich, Germany
bas@bas.uni-muenchen.de

COPYRIGHT

University of Munich 1995. All rights reserved.
This corpus and software may not be disseminated further - not even
partly - without written permission of the copyright holders.

Additional Copyright Holders

(C) Copyright 1994 Universitaet Muenchen, Universitaet Kiel,
Universitaet Bonn, University of Karlsruhe, University of Hildesheim
(Translations), Technical University of Braunschweig (Prosodic
Labeling). All rights reserved.
This corpus and software may not be disseminated further - not even
partly - without written permission of the copyright holders.

The CDROM remains the property of the Bavarian Archive for Speech
Signals (BAS) until the costs for production and shipping are paid to
the BAS.

-----------------------------------------------------------------------
VERBMOBIL Dialog-Datenbasis VM
13.05.94 / 22.06.2016 (BAS CLARIN Repo version 3)
-----------------------------------------------------------------------

Verbmobil - General

The Verbmobil (VM) dialog database I is a collection of German, American
and Japanese dialog recordings in the appointment scheduling task. The
data were collected during the first phase (1993 - 1996) of the German
VM project funded by the German Ministry of Science and Technology
(BMBF). For detailed information about this project, please refer to the
official VM Web Server (http://www.dfki.uni-sb.de/verbmobil/) or the
book:

Wahlster W (ed.): Verbmobil: Foundations of Speech-to-Speech
Translation; Berlin, Heidelberg, Germany: Springer, 2000.
http://www.springer.com/computer/ai/book/978-3-540-67783-3

1242 speakers participated in 2194 recordings.
The total corpus amounts to 9GB of data containing 31054 conversational
turns distributed on 15 CD-R.

Verbmobil - Contents of this volume

README    : this file
README.#  : volume-specific documentation
data/     : signal files of dialogues
doc/      : speaker information, lexical information,
            documentation of file formats, tools
softw/    : BAS standard software package (see doc there)
trl/      : transliteration files (VM1 standard)
tr2/      : transliteration files (VM2 standard; only on VM1 volumes)
par/      : Partitur Format files

Please refer to the file README.# (# = volume number) for specific
documentation regarding this volume.

Verbmobil - Contents of subdir /doc

AufDat.doc       : Speaker database (German/English recordings only,
                   see JapaneseMetadata for Japanese metadata)
AufDat.txt       : Speaker database documentation
partitur.ps      : First draft of the BAS Partitur Format (BPF)
phondat.doc      : Description of the PhonDat formats (out of date)
trl_hist.txt     : history of updates of transliterations
trlfil20.zip     : trl parser and filter
trllex_d.ps      : description of trl format (German)
trllex_e.ps      : description of trl format (English)
vm1.map          : mapping of BPF tiers to each utterance
vm_eng.lex       : English pronunciation list
vm_ger.lex       : German pronunciation list
vm_jap.lex       : Japanese pronunciation list
bpf/             : Description of the BAS Partitur Format
SETS/            : Definition of test, development and training sets
                   (see InfoDatenSets there)
JapaneseMetadata : Speaker and dialog metadata files of Japanese
                   recordings

Verbmobil - Transliterations

There are two different formats for the orthographic/linguistic
annotation ('transliteration') used in the Verbmobil projects:
TRL and TR2

Verbmobil I  : subdirs /trl and /tr2
Verbmobil II : subdir /trl
Extension is always *.trl

(Refer to the file trl_hist.txt in the directory /doc for update and
version information regarding the TRL files until 1997.)
TRL format (Verbmobil I only!):

A Postscript version of the current transliteration lexicon can be found
in the file trllex_e.ps in the subdir /doc. If possible, please use the
latest version of the transliteration lexicon at the following URL:
http://www.phonetik.uni-muenchen.de/Forschung/Verbmobil/VMTraLexeng.html
The Verbmobil transliterations can be transformed into simpler forms by
the tools included in the package /doc/trlfil20.zip, which was developed
by the University of Bielefeld (Prof. Gibbon).

TR2 format (Verbmobil I and II):

The directory TR2 contains the same transliterations in the newer VM2
standard for transliteration. This standard is easier to parse and was
first used in the VM2 corpus (Volumes > 14); later the TRL files of
Verbmobil I were translated into TR2 format. These adapted Verbmobil I
file names have an additional 'x' before the dot to conform to the VMII
naming conventions and to mark them as VMI dialogs (e.g. to clarify the
language used; in VMII this was coded differently, in the first
character of the dialog name).
A good description of TR2 can be found at:
www.phonetik.uni-muenchen.de/Forschung/Verbmobil/VMtrlex2d.html
(German only)
A translation into English provided by the CMU can be found at:
http://www.phonetik.uni-muenchen.de/forschung/Verbmobil/trllex_e_html/

Since only the TR2 format is provided for the whole corpus, we strongly
recommend using TR2 for your work, or referring directly to the BAS
Partitur Format files (BPF) as described in the next section.

Verbmobil - Segmental Information

All segmental information corresponding to a signal file is stored in a
file with the same prefix as the signal file but the extension '.PAR'
(denoting a 'BAS Partitur Format' file). You will find these files in
the subdir /par.
BPF files are almost compatible with the SAM Label Standard, but have an
'open format definition', that is, the format can easily be extended
without any changes to software that was written for a former definition
of the standard. You will find a description of the principles and basic
definitions in the file PARTITUR.PS in the subdir DOC.
An on-line definition of the BAS Partitur Format can be found at the
following URL:
http://www.phonetik.uni-muenchen.de/Bas/BasFormatseng.html
A copy of this on-line documentation (and a German translation) is
stored in the subdir /doc/bpf.

The Verbmobil corpus distribution contains
- the original transliteration segmented in word units (TR2)
- canonic pronunciation ('citation form') of words (KAN)
- lexical access tier (ORT)
- signal based prosodic labeling GTobi (PRB)
- signal based prosodic boundary labelling (LBG)
- signal based prosodic accent labelling (LBG)
- syntactic prosodic labelling (PRO)
- parts of speech (POS)
- lemmata (LMA)
- phonemic segmentation and labeling (SAP)
- MAUS output (MAU)
- dialog act segmentation (DIA)
- noise classification (NOI)
- word segmentation (WOR)
- syntax trees (SYN,FUN,LEX)
- cross-talk (SUP)

Verbmobil - Pronunciation Lists

Lists of all pronunciations contained in the VM corpus are stored in the
files doc/vm_ger.lex, doc/vm_eng.lex and doc/vm_jap.lex, respectively.
Each file contains a TAB-separated two-column list with the orthographic
coding (corresponding to the BPF ORT tier) in the first column and the
corresponding canonical pronunciation coded in SAM-PA (matching the BPF
KAN tier) in the second column.
The German list conforms to the 'Transcription Conventions for Canonical
German' (see www.bas.uni-muenchen.de/Bas/BasGermanPronunciation/).
German coding is done in extended German SAM-PA (see
/doc/bpf/BasSAMPA). English coding is in SAM-PA; Japanese coding is in a
proprietary format (orthography) that has not yet been documented
properly (pending). The pronunciation is coded in X-SAM-PA.
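The two-column layout of the pronunciation lists can be read with a few
lines of Python. The following is only a minimal sketch, not part of the
BAS software package: the function name read_pronunciation_list is
hypothetical, the two sample entries are invented for illustration and
are not copied from vm_ger.lex, and the character encoding of the real
files is not stated in this README (it has to be chosen by the caller
when opening the file).

```python
import io
from typing import Dict, List, TextIO


def read_pronunciation_list(handle: TextIO) -> Dict[str, List[str]]:
    """Map each orthographic form (column 1) to its pronunciation(s) (column 2).

    Assumes one TAB-separated entry per line, as described above.
    Lines without a TAB and empty lines are skipped.
    """
    lexicon: Dict[str, List[str]] = {}
    for raw in handle:
        line = raw.rstrip("\r\n")
        if "\t" not in line:
            continue
        orth, pron = line.split("\t", 1)
        # A word may be listed more than once, so collect all variants.
        lexicon.setdefault(orth.strip(), []).append(pron.strip())
    return lexicon


# Invented sample data standing in for a real *.lex file.
sample = io.StringIO("Termin\ttErmi:n\nMontag\tmo:nta:k\n")
lexicon = read_pronunciation_list(sample)
print(lexicon["Termin"])
```

For a real list the StringIO object would be replaced by an opened file,
e.g. open("doc/vm_ger.lex", encoding=...) with an encoding that matches
the distribution at hand.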
Note that the pronunciation lists cover BOTH Verbmobil corpora.

Verbmobil - Software

You will find software for reading/writing/playing/converting the files
for several OS in the subdir SOFTW.

Verbmobil - Speaker information

You will find speaker information in the file AufDat.txt in the subdir
DOC. The list covers the German and American speakers. Japanese
recordings are described in JapaneseMetadata/###.ENV files.

Verbmobil - File formats

The original edition (Version *.0.#) was distributed in the PhonDat 2
format. A description of this out-of-date format can be found in the
file /doc/phondat.doc.
The current edition of BAS (Version *.1.#) contains NIST SPHERE signal
files compatible with Verbmobil II data (see details there). Also, the
file naming of Verbmobil I data was adapted to the format of Verbmobil 2
(see next section).

Verbmobil - Terms

dialog      = conversation between two persons about one or more
              appointments
appointment = complete negotiation about one appointment (several turns)
turn        = single, non-interrupted utterance of one speaker

Structure and names of signal files:

Each dialog has a defined name as follows: X###O

X   : Dialog type
      K = German, same room, no push button
      L = German, separated room, no push button
      M = German, separated room, push button
      N = German, same room, push button
      G = German, separated room, push button
      Q = same as M but 'Denglisch' (Germans speaking English)
      R = same as N but American English (recording site 'C')
          or 'Denglisch' (recording site 'K')
      Z = German, test recording in the scenario 'travel planning'
          by TP 13 Hamburg
      J = same as G but extended scenario of 1995, 1996
      S = same as M but mixed German-English
      W = same as M but with a Wizard
      Y = Japanese, same room, push button

      'separated room' = acoustically separated, but eye contact through
                         window, acoustical contact only via headphones.
      'push button'    = only the speaker who has pressed the button is
                         recorded; pushing simultaneously is not
                         possible.
### : dialog number within a recording site (starting 001)
O   : ID recording site
      A : Kiel
      C : CMU
      D : Muenchen
      N : Bonn
      K : Karlsruhe

All data of a dialog are stored in a separate subdir named after the
dialog, e.g. Q001N, in the directory /data

Verbmobil - Files of a dialog

A  Transliteration TRL (Extension .TRL), ASCII, e.g. M001D.TRL
   Transliteration TR2 (Extension .TRL), ASCII, e.g. M001DX.TRL
   (the 'X' is only inserted to get names compatible with Verbmobil II)

B  Turns, file naming (original edition): X%%%O***.&16
     X   : see above
     %%% : see above
     O   : see above
     &   : channel (A/B)
     16  : PhonDat signal file, 16 kHz sampling rate
   Header is PhonDat 2 (see doc file DOC/PHONDAT.DOC)
   Intel words, 16 bit, 16000 Hz sampling rate.

   Turns, file naming (BAS edition, compatible with Verbmobil II):
   Turn names consist of the dialog name (char 1-5) and the following:
     6th character         : technical definition of recording
                             c(lose), r(oom), t(elephone)
     7th character         : detailed description of recording means
                             (microphone)
                             telephone: m(obile), p(hone,analog),
                                        w(ireless), d(ect)
                             close:     h(eadset), n(eckband microphone),
                                        c(lip microphone)
                                        Remark: in VM1 headsets were used
                                        exclusively for all recordings:
                                        Sennheiser HMD410
                             room:      r(oom)
     8th character         : channel coding [1..n]
     9th character         : '_'
     10th - 12th character : turn number starting with '000'
     13th character        : '_'
     14th - 16th character : speaker ID
   Header is NIST SPHERE (see Verbmobil II docu)
   The extensions code the contents of the file:
     .nis  NIST file
     .par  symbolic information in BPF

Verbmobil - Evaluations

At the following URL you can retrieve information about the Verbmobil I
ASR Evaluations:
http://sbvsrv.ifn.ing.tu-bs.de/eval.html
You will find texts about the definition of training, cross-validation
and test sets there.

Verbmobil - Remarks

- Clipping within the speech signal can be detected by checking the
  header item abs_max (maximal absolute amplitude).
- Items of the headers are always stored as Intel words or long words.
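The BAS-edition turn names described above can be split into their
components with a regular expression. The following Python sketch is an
illustration only and not part of the BAS software: the names TurnName,
parse_turn_name, SITES and RECORDINGS are hypothetical, and the code
deliberately does not reject unknown codes in characters 6 and 7, since
at least one file name mentioned in the history notes of this README
(y001kxx1_003_AAF.nis) uses codes that are not listed above. The example
file name is also taken from those history notes.

```python
import re
from pathlib import PurePath
from typing import NamedTuple

# Code tables copied from the naming description in this README.
SITES = {"A": "Kiel", "C": "CMU", "D": "Muenchen", "N": "Bonn", "K": "Karlsruhe"}
RECORDINGS = {"c": "close", "r": "room", "t": "telephone"}

TURN_NAME = re.compile(
    r"(?P<dtype>[A-Za-z])"   # char 1: dialog type
    r"(?P<number>\d{3})"     # chars 2-4: dialog number
    r"(?P<site>[A-Za-z])"    # char 5: recording site
    r"(?P<recording>.)"      # char 6: technical definition of recording
    r"(?P<microphone>.)"     # char 7: recording means (microphone)
    r"(?P<channel>.)"        # char 8: channel coding
    r"_(?P<turn>\d{3})"      # chars 9-12: '_' plus turn number
    r"_(?P<speaker>.{3})"    # chars 13-16: '_' plus speaker ID
)


class TurnName(NamedTuple):
    dialog: str
    dialog_type: str
    site: str
    recording: str
    microphone: str
    channel: str
    turn: int
    speaker: str


def parse_turn_name(filename: str) -> TurnName:
    """Split a turn file name such as 'y001kch1_003_AAF.nis' into its parts."""
    stem = PurePath(filename).stem  # drop directory and extension
    match = TURN_NAME.fullmatch(stem)
    if match is None:
        raise ValueError(f"not a BAS-edition turn name: {filename!r}")
    return TurnName(
        dialog=stem[:5],
        dialog_type=match["dtype"].upper(),
        site=match["site"].upper(),
        recording=match["recording"].lower(),
        microphone=match["microphone"].lower(),
        channel=match["channel"],
        turn=int(match["turn"]),
        speaker=match["speaker"],
    )


turn = parse_turn_name("y001kch1_003_AAF.nis")
print(turn.dialog, SITES.get(turn.site), RECORDINGS.get(turn.recording), turn.turn)
```

Letter case is normalised because dialog names appear in upper case in
this document (e.g. Q001N) but in lower case in the file name used here.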
Verbmobil - Main History (only events that concern all VM volumes)

...
12.03.98 : Filtered the following characters out of the German
           dictionary vm_ger.lex and the ORT tier in BAS Partitur
           files: '=%*_'
16.08.00 : Converted VMI signal files from Phondat2 into NIST and placed
           them into directory /DATA. File names were adapted to VMII
           naming conventions.
           Update of all BPF; naming conventions of VMII; old BPFs are
           retained for backward compatibility; tier TRL not contained
           in new BPFs any more (TR2 is now used throughout the whole
           VM corpus!)
30.05.01 : New edition of all BAS Partitur Files (BPF) based on the
           latest error update. This includes a completely new MAUS
           annotation. Furthermore, additional previously unpublished
           tiers were added to the distribution, such as Syntax Trees,
           Dialogact Annotation, Syntactic-prosodic Labeling, Prosodic
           Labeling, Parts-of-Speech-Tagging.
08.06.01 : Edition of the VM Bonus CDROM (VMBONUS) with additional data
           and documentation that does not fit into the regular VM
           volumes; Edition of the VM Lexicon Database of the University
           of Bielefeld.
10.07.01 : Tiers LBP and LBG added to the BAS Partitur Files
30.01.03 : vm_ger.lex completely rebuilt: The German pronunciation
           dictionary of VM I+II now contains only the word items as
           they appear in the ORT tier of the BPF files. Also the
           transcription was unified to a more consistent concept of a
           'canonical form'. For instance:
           - /R/ and /r/ were unified to /r/ because it was not clear
             how these two allophones were used by different
             transcribers
           - /a:6/ was replaced by /ar/
19.08.03 : New edition of all BAS Partitur Files (BPF) of German signal
           data based on the latest error update: Some minor bugs in the
           POS, LMA and SAP tiers fixed.
           Completely redone pronunciation list for German (vm_ger.lex)
           according to the new 'Transliteration Conventions for
           Canonical German'
           (www.bas.uni-muenchen.de/Bas/BasGermanPronunciation/)
           Based on the new pronunciation the following tiers in the BPF
           files have been re-calculated: KAN, MAU
20.08.03 : New tier TLN integrated : the TLN tier contains the
           translation of the recorded utterance. The transliterations
           were produced manually by the University of Tuebingen,
           Prof. Hinrichs. The integrated data are also stored on the
           volume VMBONUS.
           Please note that the orthographic representation of Japanese
           (romaji) in these translations is of the original form as
           used in the original Japanese pronunciation list
           (vm_jap_org.lex). However, it was never checked whether these
           two data sets (lexicon and translations) are in fact
           compatible. Use with caution!
           For details about the TLN tier please refer to the BPF
           documentation
           www.bas.uni-muenchen.de/Bas/BasFormatseng.html
29.08.12 : Dialogue n009k consists of two dialogs: deleted the second
           dialog part, updated AufDat.txt, SETS, vm1.map, trl, tr2,
           par.
           Multiple dialogs had missing *.ags files -> fixed.
           Fixed file name y001kch1_003_AAF.nis to y001kxx1_003_AAF.nis
10.06.16 : Extended the VM1 corpus by an emuDB component to enable emuR
           usage; this includes the addition of *.wav files parallel to
           *.nis, and the addition of the complete emuDB structure
           (/vdata/BAS/VM1_total_emuDB/VM1_emuDB/). The emuDB does not
           contain segments marked as elisions in the SAP tier, because
           the emuDB format does not allow negative durations. This
           emuDB is *not* part of the CD-R/DVD-R distribution but can be
           accessed only via the BAS CLARIN Repository
           (http://hdl.handle.net/11858/00-1779-0000-0006-BF00-E)
           Bug fixes due to this extension in BPF files (*.par) :
           SAP tier : all duration values reduced by 1 (the old values
                      did *not* follow the BPF standard; segments
                      overlapped by 1).
                      Bug fixes in 5 SAP tiers : additional overlaps
                      corrected.
           MAU tier : Minor (syntactic) bug fixes in three MAU tiers.
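
The effect of the SAP duration fix of 10.06.16 can be illustrated with a
small overlap check. This Python sketch is not taken from the BAS tools
and rests on an assumption derived from the note above: a segment given
as (begin, duration) is taken to cover the samples begin up to and
including begin + duration, so the next segment has to start at
begin + duration + 1 or later. The function name overlapping_pairs and
all sample values are invented for illustration; segments with negative
durations (elisions) are skipped.

```python
from typing import List, Tuple

Segment = Tuple[int, int]  # (begin, duration), both in samples


def overlapping_pairs(segments: List[Segment]) -> List[Tuple[Segment, Segment]]:
    """Return neighbouring segments whose sample ranges overlap.

    Assumption: the last sample of a segment is begin + duration.
    """
    ordered = sorted(seg for seg in segments if seg[1] >= 0)
    return [
        (left, right)
        for left, right in zip(ordered, ordered[1:])
        if left[0] + left[1] >= right[0]
    ]


# Invented values: each segment ends on the first sample of the next one.
old_segments = [(0, 1600), (1600, 800), (2400, 400)]
# The same segments with every duration reduced by 1.
new_segments = [(begin, duration - 1) for begin, duration in old_segments]

print(len(overlapping_pairs(old_segments)))
print(len(overlapping_pairs(new_segments)))
```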