Bavarian Archive for Speech Signals

BITS  LOGATOME  CORPUS


Introduction

The BITS Logatome Corpus consists of a set of 11180 logatome recordings covering all German diphone combinations and most of the important German-English/French diphone combinations. The recorded speakers are four professionals (2 male, 2 female). The recorded signals include a close-range and a long-range microphone as well as the laryngographic signal of the speaker. The annotation is done manually and provides phonetic classes as well as the exact boundaries of the phonemes.

Speech synthesis using concatenative techniques is maturing to a point where standard procedures are being implemented in a variety of products.

However, because of the considerable costs, most small and medium-sized companies as well as university labs cannot afford to produce the required speech resources on their own. Although some public domain German diphone voices are available for research purposes (e.g. MBROLA), there is a clear lack of professional, publicly available German synthesis resources.

The BITS synthesis corpus, recorded and produced by BAS, fills this gap. The production of this resource (BITS TP8) was fully funded by the German Ministry of Education and Science (grant no 01 IV B01).