Speaker Verification Projects at the BAS

Current Projects for Speaker Verification

Here you will find information about our current baseline speaker verification system, a link to download the first stable version, and more about the ongoing work. Feel free to contact us.

Currently available:

Poster at Eurospeech 2003:

Download: poster_Eurospeech03.pdf (63k)
Download: paper_Eurospeech03.pdf (93k)

MASV -- an experimental speaker verification system
(published under the GNU GPL)

MASV stands for Munich Automatic Speaker Verification. The current experimental system builds on the HTK tools and MATLAB. Many functions of the MATLAB code might also run on the available clones (Octave, ...), but this has not been tested so far.
Comments on the current system are welcome!

This experimental system was primarily designed for my work with the VERIDAT database (unfortunately not freely available). Recently, I redesigned the system to make it more flexible and thus usable for other people working with different databases. Note that some functions need to be modified when a different database is used.

Short documentation is already available:
MASV_description_Rel-1.3.pdf (16.02.2004) (600k)

Some functions of MASV need a patched version of HTK. Have a look at http://www.phonetik.uni-muenchen.de/Mitarbeiter/tuerk/htkTips/htkTips.html. If you want to create multi-mixture models with the standard procedure (generated by the MASV tool run_MASV_experiment.pl), you should at least apply the HHEd patch.

Releases of the MASV system (Munich Automatic Speaker Verification system)

New release from 02.06.2004:
Fixed some bugs (mainly in adding plots to an existing figure).
Download: MASV_1.4.01.tgz (02.06.2004) (684k)
Download: dummy_HTK_Rel-1.4.tgz (27.05.2004) (16k); template directory structure for experiments

Some MASV functions use two MATLAB packages ("voicebox" by Mike Brookes and "matdraw" by Keith Rogers). The matdraw package was slightly extended.
You can download an archive containing both here:
Matlab_Packages.tgz (30.01.2004) (624k)
The installation is quite similar to the procedure described in the MASV documentation. Please make sure that the directories of both packages are in your MATLAB path.

You might also be interested in this tarball, which gives an idea of the VERIDAT database and its file structure:
Download: dummy_VD.tgz (12k); dummy template for a database

To-Do:

add description of Matlab functions
add step-by-step tutorial for a simple example

Sample data sets

Format: binary (due to data size), big endian; each file consists of concatenated entries of 18 bytes each, in the following format:

model_id: uint8 (120 different model ids in total; ids can be between 1 and 150)
spk_id: uint8 (can be any of the speakers with id 1 to 150)
cdf of FA/FR: float32 (error in percent; depending on type_of_test (see next entry), this is FR for a genuine test or FA for an impostor test)
type_of_test: c_string (1 char, "I" (impostor test) or "G" (genuine test), terminated with binary 0)
score: float32 (matching score of model to utterance; higher score -> better match)
session_id: c_string (2 chars, the session number of the recording, terminated with binary 0)
recording_id: c_string (2 chars, item number in the recording session; same item -> same utterance; terminated with binary 0)
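
As a minimal sketch, the entry layout above could be read in Python as follows. It assumes that the fields are stored in the order listed and without padding, which adds up to the stated 18 bytes (1 + 1 + 4 + 2 + 4 + 3 + 3). The names RECORD, Entry and parse_entries are illustrative only and are not part of MASV.

```python
import struct
from typing import Iterator, NamedTuple

# ">" = big endian, no padding. Assumed field order (as listed above):
# uint8, uint8, float32, 1 char + NUL, float32, 2 chars + NUL, 2 chars + NUL
RECORD = struct.Struct(">BBf2sf3s3s")


class Entry(NamedTuple):
    model_id: int
    spk_id: int
    error_cdf: float      # FR (genuine test) or FA (impostor test), in percent
    type_of_test: str     # "G" or "I"
    score: float
    session_id: str
    recording_id: str


def _cstr(raw: bytes) -> str:
    """Decode a NUL-terminated C string field."""
    return raw.split(b"\0", 1)[0].decode("ascii")


def parse_entries(data: bytes) -> Iterator[Entry]:
    """Yield one Entry per 18-byte record in the uncompressed data."""
    if len(data) % RECORD.size != 0:
        raise ValueError("data length is not a multiple of the record size")
    for model, spk, cdf, test, score, sess, rec in RECORD.iter_unpack(data):
        yield Entry(model, spk, cdf, _cstr(test), score, _cstr(sess), _cstr(rec))
```

For a downloaded data set, the input bytes could be obtained with the standard gzip module, for example gzip.open("GMM_4_60_imps_all_imps.dat.gz", "rb").read().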

The data sets (download via the right-click menu):

GMM, 4 mixtures, 30 fixed clients, 60 fixed impostors, all impostors used for each client (GMM_4_60_imps_all_imps.dat.gz; 2.5M)
GMM, 4 mixtures, 30 fixed clients, 60 fixed impostors, same gender impostors (GMM_4_60_imps_same_gender_imps.dat.gz; 1.2M)
GMM, 4 mixtures, 120 fixed clients, cross validation with remaining clients as impostors (GMM_4_120_spks_cross_valid_imps_all_imps.dat.gz; 20M)
GMM, 4 mixtures, 120 fixed clients, cross validation with same gender impostors from remaining clients (GMM_4_120_spks_cross_valid_imps_same_gender_imps.dat.gz; 10M)
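
To illustrate how the scores in these files might be used, the following sketch loads one gzip-compressed data set and computes the false rejection (FR) and false acceptance (FA) rates at a chosen threshold. It assumes the 18-byte entry layout described above, with fields in the listed order, and it assumes that a test is accepted when its score is greater than or equal to the threshold. The function names load_scores and error_rates are hypothetical and not part of MASV.

```python
import gzip
import struct

# Assumed layout: big endian, fields in the order listed in the format description.
RECORD = struct.Struct(">BBf2sf3s3s")


def load_scores(path):
    """Return (genuine_scores, impostor_scores) read from a .dat.gz file."""
    genuine, impostor = [], []
    with gzip.open(path, "rb") as fh:
        data = fh.read()  # reads the whole file; the larger sets need more memory
    for _model, _spk, _cdf, test, score, _sess, _rec in RECORD.iter_unpack(data):
        kind = test[:1]
        if kind == b"G":
            genuine.append(score)
        elif kind == b"I":
            impostor.append(score)
        else:
            raise ValueError(f"unexpected type_of_test: {test!r}")
    return genuine, impostor


def error_rates(genuine, impostor, threshold):
    """Return (FR, FA) in percent; a test is accepted if score >= threshold."""
    fr = 100.0 * sum(s < threshold for s in genuine) / len(genuine)
    fa = 100.0 * sum(s >= threshold for s in impostor) / len(impostor)
    return fr, fa
```

Sweeping the threshold over the observed score range would then trace out the FR/FA trade-off for a given data set.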

Last Change: U. Türk, 27.05.2004