BAS Infrastructures for Technical Speech Processing
What is BITS and what does it do?
BAS Infrastructures for Technical Speech Processing (BITS) is an initiative to create a scientifically and technologically well-founded platform for spoken German language resources, which will be open to every user on the internet. The Bavarian Archive for Speech Signals (BAS) was founded as a public institution in January 1995 for research in spoken language processing and is hosted by the University of Munich, presently at the Institute of Phonetics and Speech Communication. One major benefit of the BITS project is that it reduces the work involved in the empirical collection of speech data to a minimum. BITS will create a centre that supports scientific and technical efforts in the processing of spoken language. Under the leadership of COLLATE (DFKI Saarbrücken), the project is part of the Competence Network for Speech Technology. The entire project will run for 4 years. During this period the aim is to create a base platform for other projects largely financed by industry.
[Figure: Provisional schedule for the 9 separate projects (duration of each project after the start)]
The scientific and technical aims are divided into 9 thematically defined sub-projects:
Project 1: Methodology for the production of reusable corpora (months 0-6)
[Formulation of design principles that are also suitable for future projects]
Because of their highly specific nature, speech resources often 'age' quickly and become unsuitable for future use. Accordingly, this project will work out design principles that make corpora suitable for other tasks. Slightly expanding the material produced by each speaker will increase the expense only marginally.
Project 2: Validation methodology for form and content (months 0-6)
[Formulation of definitions for checklists/test methods and draft of a seal of approval for evaluating a resource]
These checklists and test methods will help customers identify the right corpora for their specific requirements. Results obtained with these methods will be shown in the BAS-catalogue.
Project 3: Validation of speech data via internet (months 0-18)
[Development of tools for decentralised annotation via internet]
Annotation of speech material is normally the most expensive part of any corpus production. To make this process as time-saving and effective as possible, much of the work can now be performed in a platform-independent and decentralised way (as already realised in WWWTranscribe and SpeechDat). In project 3, tools for producing annotations in a decentralised way via the internet will be developed and provided. The result of each annotation will be sent immediately to the server for further processing.
Project 4: Data collection via internet (months 0-24)
[New methods for wide-area speech data collection with client software on private computers]
A new way of collecting data across a wide area is to install client software on private computers in all relevant regions. The speech data will be sent directly through the internet to the central computer of our institute. Speakers are provided with a suitable microphone and log on to the server. Temporary signal files will then be sent to the server in the form of data packets. This procedure permits high-quality signal processing and yields speech signals recorded under realistic environmental conditions. The prompts will be sent over the network, which saves the cost of expensive mailings. Speaker data will be administered centrally, giving a better overview of the whole project.
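The transfer of temporary signal files as data packets could look roughly like the following sketch. It is an illustration only, not the BITS client software: the names Packet, split_signal and reassemble, the packet size, and the use of a SHA-256 checksum per packet are all assumptions made for this example, and the actual network transport is left out.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Packet:
    """One numbered piece of a recorded signal file (hypothetical layout)."""
    session_id: str   # identifies the speaker's recording session
    seq: int          # position of this packet within the signal
    total: int        # number of packets that make up the signal
    payload: bytes    # raw signal bytes carried by this packet
    sha256: str       # checksum of the payload, checked on the server


def split_signal(session_id: str, signal: bytes, packet_size: int = 4096) -> List[Packet]:
    """Client side: cut a temporary signal file into numbered packets."""
    if packet_size <= 0:
        raise ValueError("packet_size must be positive")
    chunks = [signal[i:i + packet_size] for i in range(0, len(signal), packet_size)]
    if not chunks:
        chunks = [b""]  # an empty recording still yields one (empty) packet
    total = len(chunks)
    return [
        Packet(session_id, seq, total, chunk, hashlib.sha256(chunk).hexdigest())
        for seq, chunk in enumerate(chunks)
    ]


def reassemble(packets: List[Packet]) -> bytes:
    """Server side: verify the packets and rebuild the signal in order."""
    if not packets:
        raise ValueError("no packets received")
    total = packets[0].total
    by_seq: Dict[int, bytes] = {}
    for packet in packets:
        if hashlib.sha256(packet.payload).hexdigest() != packet.sha256:
            raise ValueError(f"packet {packet.seq} is corrupted")
        by_seq[packet.seq] = packet.payload
    missing = [seq for seq in range(total) if seq not in by_seq]
    if missing:
        raise ValueError(f"missing packets: {missing}")
    # Packets may arrive in any order; join them by sequence number.
    return b"".join(by_seq[seq] for seq in range(total))
```

Numbering the packets lets the server detect lost pieces and restore the original order, and the checksum lets it reject pieces damaged in transit, so that a recording can be re-requested instead of being stored incomplete.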
Project 5: Re-evaluation of the BAS-catalogue (months 18-48)
[Systematic re-processing of the old BAS speech resources based on the principles established in project 2, followed by release in the BAS-catalogue]
Many customers find it difficult to know what kind of speech is best for their own purposes. In project 5, older speech resources in BAS will be validated systematically using the principles worked out in project 2. Results of the re-evaluation will be published in the BAS-catalogue.
Project 6: Automatic analysis of speech corpora (months 18-48)
[Development of new techniques and tools for the automatic/semiautomatic creation of annotations/segmentation]
The value of every corpus is determined by its accessory information (e.g. documentation, speaker profiles, segmental information). As annotation by hand is very expensive, new processes and tools for the automatic/semiautomatic creation of annotations will be developed in project 6. Furthermore, the MAUS principle, automatic transliteration and the verification of read speech signals will be developed further. The results will be generated in the BAS Partitur Format, allowing integration into the data material already existing in BAS.
Project 7: Market research for technical speech applications (months 6-18)
[Interviews with experts to estimate the need for German speech resources over the next 5 to 15 years]
In cooperation with industrial partners of the Competence Centre for Speech Technology, experts will be asked to estimate the need for speech resources over the next 5 to 15 years. A catalogue of recommendations, ranking the speech corpora that are needed, will be created and published by BAS.
Project 8: Corpus for speech synthesis (months 6-48)
[Planning, annotation and segmentation of a universally available corpus with professional speakers for concatenative speech synthesis]
Corpora with professional speaker voices in German are not available without a license. For concatenative speech synthesis, a generally available corpus with professional voices will be created to make competitive work feasible for institutes and SMEs. The first goal is to design a widely usable synthesis corpus. The resulting recordings will be annotated and segmented.
Project 9: Archive for adolescent speech (months 24-48)
[Upgrading of the corpus Regional Variants of German with the addition of child and adolescent speech from all German speech regions]
No speech data for children and adolescents are publicly available at the moment, though these groups are important target groups for industry. The existing corpus Regional Variants of German will be extended with the speech of children and adolescents from all regions and with prompts better suited to special applications. Institutes in each dialect region will be sought as partners to perform the data collection. In return they will receive technical support and a proportion of the license fees.
Contact persons:
1. Competence Centre for Speech Technology (COLLATE): Prof. Wahlster, Prof. Uszkoreit - Univ. Saarbrücken
2. BITS: Dr. Florian Schiel - Univ. Munich
Written by Angela Baumann: 20/03/02