                     _/_/_/_/         _/_/         _/_/_/_/
                    _/      _/       _/ _/        _/      _/
                   _/      _/       _/  _/       _/
                  _/      _/       _/   _/       _/
                 _/_/_/_/         _/_/_/_/        _/_/_/
                _/      _/       _/     _/             _/
               _/      _/       _/      _/             _/
              _/      _/       _/       _/    _/      _/
             _/_/_/_/         _/        _/     _/_/_/_/


                   BAVARIAN ARCHIVE FOR SPEECH SIGNALS 

               University of Munich, Institute of Phonetics
               Schellingstr. 3/II, 80799 Munich, Germany
                      bas@phonetik.uni-muenchen.de


  COPYRIGHT Florian Schiel, University of Munich 1998. All rights reserved.   
    This corpus and software may not be disseminated further - not even
      partly - without a written permission of the copyright holders.  

                      Additional Copyright Holders

----------------------------------------------------------------------

Munich AUtomatic Segmentation (MAUS)

BAS Distribution Package   MAUS

2003-08-19 / 2019-02-19

----------------------------------------------------------------------

This package contains all necessary files to set up and run the
'Munich AUtomatic Segmentation' (MAUS) system for automatic 
segmentation of spoken German developed at the Bavarian Archive
for Speech Signals (BAS) at the University of Munich, Germany.

MAUS automatically segments a recorded speech signal
into a set of phonemic classes. It uses an automatically derived,
statistical rule set about pronunciation to calculate
a vast number of hypotheses about the given utterance and then
aligns the signal to this hypothesis space, finding the phonemic
transcript together with the segmentation that yield the highest combined
probability of acoustical match and statistical prediction.
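
As a rough illustration of this selection principle (not the actual MAUS
implementation, which works on a weighted search graph and an HTK
alignment), the following Python sketch picks the best of a few toy
pronunciation hypotheses. The function name, the example word and all
numbers are hypothetical, and adding the two log scores with equal weight
is an assumption made for this sketch.

```python
import math


def best_hypothesis(hypotheses):
    """Return the hypothesis with the highest combined log-probability.

    Each hypothesis is a tuple:
      (phoneme_sequence, acoustic_logprob, prior_prob)
    acoustic_logprob stands for the acoustical match (log domain),
    prior_prob for the statistical prediction of the pronunciation rules.
    """
    def combined(hypothesis):
        _, acoustic_logprob, prior_prob = hypothesis
        # A product of probabilities becomes a sum in the log domain.
        return acoustic_logprob + math.log(prior_prob)

    return max(hypotheses, key=combined)


# Toy values for three pronunciation variants (all numbers invented).
hypotheses = [
    (("h", "a:", "b", "@", "n"), -42.0, 0.6),  # canonical form
    (("h", "a:", "b", "m"), -40.5, 0.3),       # reduced variant
    (("h", "a:", "m"), -45.0, 0.1),            # strongly reduced variant
]

best = best_hypothesis(hypotheses)
print(" ".join(best[0]))
```

In this toy example the reduced variant is selected: its better acoustic
score outweighs its lower prior probability.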

For more details about the MAUS principle please refer to the 
publications given in hdl.handle.net/11858/00-1779-0000-0028-421B-4

MAUS is no longer distributed with full language support (only German is
included), since the web-based MAUS system
(https://clarin.phonetik.uni-muenchen.de/BASWebServices/)
has been operational for several years, and because we encountered a number
of illegal redistributions of resources contained in the package by third parties.

----------------------------------------------------------------------

To use MAUS you need a PC running Linux (any variant will probably do),
the HTK software package from Cambridge, UK (publicly available at
htk.eng.cam.ac.uk), the standard tools csh, sox, awk, xpath and ffmpeg,
and this package.
There are several technical constraints on the usage of MAUS that are 
listed together with a short user manual in the file USAGE.
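
Before installing, it can help to check that the standard tools named
above are available. The following Python sketch is not part of the
package; it only looks the tool names up on the current PATH. The function
name and its 'which' parameter are hypothetical. HTK is not checked here,
and according to the file list below some helpers (xpath, selected HTK
tools) also come with this package, so a missing entry is not necessarily
a problem.

```python
import shutil

# Tool names taken from the requirements paragraph above.
REQUIRED_TOOLS = ["csh", "sox", "awk", "xpath", "ffmpeg"]


def missing_tools(tools, which=shutil.which):
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if which(tool) is None]


missing = missing_tools(REQUIRED_TOOLS)
if missing:
    print("Missing tools: " + ", ".join(missing))
else:
    print("All required tools were found on PATH.")
```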

Hints for the setup of MAUS are given in the file INSTALL.

----------------------------------------------------------------------

Explanations of the important files in this package. Files not listed
here are probably helper scripts called by the main programs:

Subdir DOCU/    : development history of maus and other tools
                  and other helpful texts
   DOCU/HISTORY    : main version history for maus and helpers
   DOCU/USAGE      : some useful tips for usage/adaptation of maus
   DOCU/CORPUSREQUIREMENTS
                   : the requirements for a speech corpus to be used as a MAUS
                     training corpus (e.g. for a new language in MAUS)
   DOCU/WebMAUSInfo.txt
                   : information about the web application WebMAUS and webservices
   DOCU/SYMBOLSETS : rules about the symbol sets of maus
Subdir EXAMPLES/: data to test MAUS (see CHECK/)

Main tools (those marked with '*' are needed for webservices) :

ShowLattice     : visualization tool for the SLF files (not needed by the 
                  main script but useful to see what MAUS is hypothesizing)
maus(*)         : the main script; carefully read the help output printed
                  when it is started without options
maus.corpus     : wrapper to run maus on a whole corpus of signals
maus.iter       : iterative maus (see HISTORY.ITERATIVE for details)
maus.trn(*)     : wrapper to exploit a chunk segmentation coded in the TRN tier
maus.web        : wrapper to use the webservice runMAUS instead of a local installation;
                  this is useful if you cannot install MAUS but want to use the script
txt2par         : simple tool to create a primitive BPF input file from a table
Chunker(*)      : Nina Poerner's chunker service
Anonymizer(*)   : service to anonymize media files and annotations
Subtitle(*)     : Nina Poerner's subtitling service
AudioEnhance(*) : applies signal processing to media files (for optimal use in services)
Pipeline(*)     : the BAS WebServices pipeline scripts
AnnotConv(*)    : universal converter service for annotation files
EMUMagic(*)     : pipeline scripts for the EMUMagic Service: produces an optimized emuDB 
                  from arbitrary input
Asr(*)          : automatic speech recognition scripts that call third-party services
SpeakDiar(*)    : the speaker diarization service
testEnhance(*)  : normalize *.txt files for optimal input to G2P

Helper tools (usually not used alone) :

HVite,HCopy,HHEd
                : selected HTK tools pre-compiled for Ubuntu
word_var        : helper tool to generate the statistically weighted search graph;
                  C++ sources in word_var.src
wav2trn         : helper program to pre-segment a signal by leading and trailing
                  silence to improve the MAUS result
rec2mau.awk     : helper to convert HTK output to a CSV table in BPF style (MAU tier)
kan2mlf.awk     : helper to convert a BPF tier KAN into an HTK-style MLF
kan2hmm.awk     : helper to convert a BPF tier KAN into HTK-style HMM names
mau2Text2Grid   : helper program to convert a MAU tier in a BPF file to a TextGrid file
mausbpfDB2emuRDB: helper program to convert a collection of MAUS-created BPFs into an emu DB
mausbpf2emuR    : helper for mausbpfDB2emuRDB; converts BPFs *.par to emu DB *_annot.json
mausbpf2csv     : helper to convert BPFs *.par to a spreadsheet table *.csv
mausbpf2eaf     : helper to convert BPFs *.par to ELAN compatible *.eaf
mausbpf2exb     : helper to convert BPFs *.par to EXMARaLDA compatible *.exb
mausbpf2tei     : helper to convert BPFs *.par to ISO TEI compatible XML (*.tei)
par2TextGrid    : simple converter from BPF to Praat TextGrid
par2emu         : simple converter of BPF to legacy Emu *.hlb + *.phonetic files
xpath           : simple XML parser
check_param_sets: checks the language parameter sets for syntax errors

Libraries/parameter files:

ipkclib/        : C library used by word_var
PARAM/          : parameter set for classical MAUS
PARAM.MAN/      : parameter set for phonological rules
PARAM.<lang>    : parameter set for language <lang>
PRECONFIGWAV    : HTK config file for front end processing
HMM/            : tools to build new HMM sets for new languages
                  based on existing HMMs (not a real training!)


----------------------------------------------------------------------

DISCLAIMER

This package is public domain and provided 'as is' with no further support
or warranty. It is a scientific, experimental system and must not be 
used for any other purposes than scientific investigations and education.
If you intend to use this package or parts of it in a commercial enterprise,
please contact the copyright holders.
This package or parts of it may not be given to any third parties
without the explicit consent of the copyright holders.

----------------------------------------------------------------------

The binaries word_var and wav2trn are pre-compiled for use on Linux.
To use them on another OS, see the sources in *.src and the library ipkclib.