Bavarian Archive for Speech Signals

BITS UNIT SELECTION CORPUS


Introduction

The BITS Unit Selection Corpus consists of a set of 6732 annotated sentence recordings for concatenative 'unit selection' speech synthesis. The four recorded speakers are professionals (two male, two female). The sentence corpus is designed to cover all German diphones as well as the most common German-English and German-French diphone combinations. The recorded signals include a close-range and a long-range microphone as well as the laryngographic signal of the speaker. The annotation provides a manual phonetic labelling and segmentation of the complete corpus as well as a prosodic labelling of accents and boundary tones using a subset of GToBI.

Speech synthesis using concatenative techniques is maturing to a point where standard procedures are being implemented in a variety of products.

However, because of the considerable costs, most small and medium-sized companies as well as university labs cannot afford to produce the required speech resources on their own. Although some public-domain German unit selection voices are available for research purposes, there is a lack of professional, publicly available German synthesis resources.

The BITS synthesis corpus, recorded and produced by BAS, fills this gap. The production of this resource (BITS TP8) was fully funded by the German Ministry of Education and Science (grant no. 01 IV B01).