BAS Bavarian Archive for Speech Signals
University of Munich, Institute of Phonetics
Schellingstr. 3/II, 80799 Munich, Germany
bas@phonetik.uni-muenchen.de

COPYRIGHT University of Munich 1995. All rights reserved.
This corpus and software may not be disseminated further - not even
partly - without written permission of the copyright holders.

Additional Copyright Holders

==========================================================================
PhonDat Data Format - Description            F. Schiel 03.03.95 / 11.01.96
==========================================================================

FOR THE FAST READER:
Use the ANSI C functions in softw/header.c and data.c to read PhonDat files.


PHONDAT FILE FORMAT

Speech files archived in the PhonDat format have the following structure:

PhonDat Format 1

Bytes      Contents
1-512      Fixed length binary header structure, compatible with ILS format
513-EOF    nspbk blocks of 512 bytes, containing 256 short samples each,
           Intel format.

PhonDat Format 2

Bytes      Contents
1-512      Fixed length binary header structure, compatible with ILS format
513-X      Multiple of 512 bytes, containing the orthography and the
           so-called canonical form of the utterance.
           Structure:
           512 - 515                         : o r t \0
           516 - 516+no                      : (no bytes orthography)
           516+no+1 - 516+no+12              : \0 o e n d : : \0 k a n \0
           516+no+13 - 516+no+13+nc          : (nc bytes canonical form (SAM-PA))
           516+no+13+nc+1 - 516+no+13+nc+8   : \0 k e n d : : \0
           516+no+13+nc+9 - X                : filled up with \0
X+1-EOF    nspbk blocks of 512 bytes, containing 256 short samples each,
           Intel format.


Fixed length part of PhonDat header (version 1 and 2)

Contains information about the file structure, the data type, the
recording conditions and the speaker. Use the functions from softw/header.c
for reading or writing PhonDat headers.
The structure is as follows (C notation):

typedef struct {
    long  not_used1[5],
          nspbk,          /* # of data blocks (512 bytes) */
          anz_header,     /* # of header blocks (512 bytes) */
          not_used2[5];
    char  sprk[2];        /* speaker id */
    short swdh;           /* session repetition */
    long  ifld1[3],       /* ILS */
          not_used3[6];
    char  kenn1[2];       /* ILS text characters 1 - 8 */
    short not_used4;
    char  kenn2[2];
    short not_used5;
    char  kenn3[2];
    short not_used6;
    char  kenn4[2];
    short not_used7;
    long  not_used8[35],
          isf,            /* sampling rate in Hz */
          flagtype,       /* ILS: -32000 if sampling file
                                  -29000 if param. file */
          flaginit;       /* ILS: 32149 if init */
    char  ifl[32],        /* filename (terminated by \0) */
          day,            /* # of day */
          month;          /* # of month */
    short year;           /* # of year */
    char  sex,            /* sex of speaker */
          version;        /* header version:
                             0 = extended ILS
                             1 = phondat version 1
                             2 = phondat version 2 */
    short adc_bits,       /* resolution of adc */
          words;          /* # of words in text */
    long  not_useda[50];
    short wdh,            /* # of repetition */
          abs_ampl;       /* maximum of amplitude */
    char  not_usedb[10];
} Phon_header_2;


Variable length part of PhonDat header (version 2 only)

The orthographic string contains the standard orthography or a
transliteration, with additional markers, of the spoken utterance.
German umlauts are represented either by the LaTeX convention, by 7 bit
ASCII signs, or by the German character set coding used by DEC and Sun:

Umlaut   LaTeX   7 Bit ASCII (dec)   German Char Set (hex)
Ae       "A      [  (91)             C4
Ue       "U      ]  (93)             DC
Oe       "O      \  (92)             D6
ae       "a      {  (123)            E4
ue       "u      }  (125)            FC
oe       "o      |  (124)            F6
ss       "s      ~  (126)            DF

The canonical string contains the expected citation forms of the words in
the utterance. Note that this is NOT a transcription of the signal.
The symbols used are those of the German subcorpus of SAM-PA, with the
following changes to SAM-PA:

    Q    glottal stop
    q    glottalization (not in canonical forms!)
    '    main stress
    "    secondary stress
    #    compositum marker (optional)
    +    function word marker (suffix, optional)

Words are separated by two blanks, phonemic labels are separated by one
blank.
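
The separator rule above (two blanks between words, one blank between
phonemic labels) can be sketched in ANSI C as follows. This is only a
minimal illustration and not part of softw/header.c or data.c: the names
split_canonical, count_labels and MAX_WORD_LEN are hypothetical, and the
sketch assumes that the canonical string has already been extracted from
the header as a \0-terminated C string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_WORD_LEN 128  /* assumed upper bound for one word, incl. '\0' */

/* Hypothetical helper: copy the words of a canonical string into words[].
   Words are separated by two blanks; the single blanks between the
   phonemic labels of a word are kept. Returns the number of words stored. */
size_t split_canonical(const char *canon, char words[][MAX_WORD_LEN],
                       size_t max_words)
{
    size_t n = 0;

    assert(canon != NULL);
    for (;;) {
        const char *end;
        size_t len;

        while (*canon == ' ')            /* skip leftover blanks */
            canon++;
        if (*canon == '\0' || n == max_words)
            break;

        end = strstr(canon, "  ");       /* next word boundary, if any */
        len = end ? (size_t)(end - canon) : strlen(canon);
        if (len >= MAX_WORD_LEN)
            len = MAX_WORD_LEN - 1;      /* truncate overlong words */

        memcpy(words[n], canon, len);
        words[n][len] = '\0';
        n++;

        if (end == NULL)
            break;
        canon = end + 2;                 /* continue after the two blanks */
    }
    return n;
}

/* Hypothetical helper: count the phonemic labels of one word, i.e. the
   runs of non-blank characters separated by single blanks. */
size_t count_labels(const char *word)
{
    size_t n = 0;
    int in_label = 0;

    for (; *word != '\0'; word++) {
        if (*word == ' ') {
            in_label = 0;
        } else if (!in_label) {
            in_label = 1;
            n++;
        }
    }
    return n;
}
```

For an illustrative string such as "g u: t @ n  t a: k" (not taken from
the corpus), split_canonical stores the two words "g u: t @ n" and
"t a: k", and count_labels reports 5 and 3 labels for them.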