Bavarian Archive for Speech Signals
SmartKom - SKP
Last update 2012-02-29
The SmartKom corpora were produced at BAS between 1999 and 2003 within the
SmartKom project, which was funded by the
German Ministry of Education and Science. The corpus consists of
multi-modal recordings ('sessions') of 224 persons in a Wizard-of-Oz setting.
More details regarding the specification and production can be found
here; an overview of the complete corpus has been published at
Release SKP contains 172 recordings in the technical setup
('scenario') SmartKom Public, which is comparable to a traditional
public phone booth but equipped with additional intelligent communication
devices. Naive users were asked to test a 'prototype' for a market study,
not knowing that the system was in fact controlled by two human operators.
They were asked to solve two tasks within 4.5 min while they
were left alone with the system. Instructions were kept to a minimum:
the user knew only that the system was able to understand
speech, gestures and even facial expressions, and that it should
communicate more or less like a human.
Main technical features of release SKP
- Technical setup: Public (scenario)
- Primary domain 'Cinema'; secondary domain 'Restaurant'
- Primary domain 'Fax'; secondary domain 'Telephone, Email'
- 86 users
- 172 recording sessions; size: 580 GB
- Recorded modalities:
  - Audio in 10 channels
  - Video of face
  - Video of upper body from the left
  - Infrared video of the display area (to capture the 2D gestures) as input to the SIVIT device (Siemens gesture recognizer)
  - Video of the GUI output
  - Coordinates of graphics tablet (when pen was used)
  - Coordinates of SIVIT device (when finger/hands were used)
- Annotations (see READMEs below for details):
  - Transliteration (http://www.phonetik.uni-muenchen.de/forschung/SmartKom/Konengl/engltrans/engltrans.html)
  - 2D Gesture
  - User states in three modalities
  - Turn segmentation
- Documentation, TechDoks and publications
- All annotations are compatible with the 'BAS Partitur Format' (BPF)
Recording sessions: Overview
This table contains an overview of all SmartKom recording sessions
(many of them may not be included in this release!). For each recording,
one line with 35 TAB-separated columns contains the following
data: session ID and DVD number (1-2), recorded modalities (channels, 3-20),
annotations (21-26) and some selected features of the user.
As you can see, not all recordings have the full range of modalities
or annotations (missing parts are marked with a '-'). The above table
is intended to simplify the selection of recording sessions for specific
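The column layout described above can be sketched in Python as follows. This is only an illustration, not an official BAS tool: the function names are hypothetical, and the assumption that the user features occupy the remaining columns 27-35 is inferred from the column count and is not stated explicitly in the description.

```python
# Column layout (1-based) as described in the overview text:
#   1-2   session id and DVD number
#   3-20  recorded modalities (channels)
#   21-26 annotations
#   27-35 selected user features (assumption: the remaining columns)
EXPECTED_COLUMNS = 35
MISSING = "-"  # marker used in the table for missing parts


def parse_overview(text):
    """Split the TAB-separated overview table into one dict per session."""
    sessions = []
    for line in text.splitlines():
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) != EXPECTED_COLUMNS:
            continue  # skip blank or malformed lines
        sessions.append({
            "session_id": fields[0],
            "dvd": fields[1],
            "modalities": fields[2:20],    # columns 3-20
            "annotations": fields[20:26],  # columns 21-26
            "user_features": fields[26:],  # columns 27-35 (assumed)
        })
    return sessions


def is_complete(session):
    """True if no modality and no annotation is marked as missing."""
    return (MISSING not in session["modalities"]
            and MISSING not in session["annotations"])
```

Filtering the parsed list with `is_complete` yields only those sessions that have the full range of modalities and annotations; a different predicate (for example, one that checks a single modality column) could be used to select sessions for a partial order.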
Availability and fees
Since the SmartKom corpus was produced with public funding, there
are no restrictions on the use of the data, except that the corpus as well
as parts of it must not be distributed explicitly to third parties
(the distribution of implicit data such as statistical models, rules derived from
the data, etc. is permitted).
The corpus is structured into so-called 'volumes', each usually containing
one recording session on DVD-5 (4.3 to 4.7 GByte). To select individual
recording sessions for your order, please use the table described above.
The fee for single volumes of this release is set to 1 BAS distribution fee:
SmartKom Single Volume
1 DVD-5 UDF + Postage + Packaging
EUR 255.65 (ELRA members: 50% discount)
The fee for the complete release SKP is:
1 USB HD + Postage + Packaging
EUR 4,500 (ELRA members: EUR 3,000)
It is also possible to order parts of the distribution or selected channels.
For example, you might be interested only in the facial video channel
in combination with user state labeling, or in the audio channels in
combination with the transliteration. The distribution fee is then
calculated as the number of DVD-5 needed for distribution times
the BAS distribution fee.
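As a rough illustration of this rule, the fee for a partial order can be estimated as follows. This is a sketch under stated assumptions, not an official price calculator: the function names are hypothetical, the usable capacity per DVD-5 is assumed to be the lower end of the 4.3 to 4.7 GByte range, and the ELRA discount is not modelled because the text does not say whether it applies to partial orders.

```python
import math

BAS_DISTRIBUTION_FEE_EUR = 255.65  # fee per DVD-5, as listed for a single volume
DVD5_CAPACITY_GB = 4.3             # assumption: conservative end of 4.3-4.7 GByte


def dvds_needed(total_size_gb, capacity_gb=DVD5_CAPACITY_GB):
    """Number of DVD-5 media needed to hold the selected data."""
    return math.ceil(total_size_gb / capacity_gb)


def partial_order_fee(total_size_gb, capacity_gb=DVD5_CAPACITY_GB):
    """Estimated fee in EUR: number of DVD-5 times the BAS distribution fee."""
    return round(dvds_needed(total_size_gb, capacity_gb) * BAS_DISTRIBUTION_FEE_EUR, 2)
```

For instance, a selection of 10 GByte would need three DVD-5 under the assumed capacity, so the estimate is three times the BAS distribution fee. The actual number of media depends on how BAS packs the selected channels, so the binding figure is the one quoted by BAS.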
Please also consider the special edition SKAUDIO with audio channels only.
Questions and orders to