Speaker recruitment turned out to be the single most critical issue, as none of the project partners had experience with a speech database collection of this size. Partners with good geographic and demographic coverage among their own staff, such as national telecom companies, found it relatively easy to motivate employees to participate. Professional market research companies were generally not used, both because of their high cost (in Germany, for example, they quoted more than the budget available for the entire German data collection) and because they could not guarantee delivery of the requested number of speakers.
Most SpeechDat-II databases were ready for validation at about the same time, which imposed a heavy workload on the validation agency; the original plan had been to deliver the databases in sequence so that validation could proceed at a constant level of effort over a longer period. During validation, serious errors were found in some databases. These had to be corrected by recording additional material, re-annotating the data, or re-creating the lexica; in some cases not all errors could be corrected, and the database had to undergo an acceptance vote. The most important lesson learned is that there should be at least three validations: a formal validation of all prompt material before any recordings, an early validation of the first few recordings before the main recording phase, and a final validation. For very large databases, an intermediate validation is also very useful.
SpeechDat has effectively set the standard for many successor projects. It is a showcase for collaboration between academia and industry, and it has proved that direct market competitors can share the effort of creating resources while continuing to compete on the development of devices, applications and services.