Use case: We have a set of recordings, each with an orthographic transcript.

What we have: 50 recordings of Hungarian (*.wav) and 50 orthographic transcripts (*.txt)
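
Before uploading, it may be worth confirming that every recording really has
a transcript. The following is a minimal sketch, not part of the BAS tools; it
assumes that a recording and its transcript share the same base name (e.g.
rec01.wav and rec01.txt, both hypothetical names) and that all files sit in one
folder. The function name check_pairs is made up for this illustration.

```python
from pathlib import Path


def check_pairs(corpus_dir):
    """Compare *.wav and *.txt base names in corpus_dir.

    Returns two sorted lists: base names of recordings that have no
    transcript, and base names of transcripts that have no recording.
    Assumption: a recording and its transcript share the same base name.
    """
    corpus = Path(corpus_dir)
    wav_stems = {p.stem for p in corpus.glob("*.wav")}
    txt_stems = {p.stem for p in corpus.glob("*.txt")}
    missing_txt = sorted(wav_stems - txt_stems)
    missing_wav = sorted(txt_stems - wav_stems)
    return missing_txt, missing_wav
```

Calling check_pairs("Hungarian/Corpus") should return two empty lists if all
50 recordings and 50 transcripts are paired.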

What we want: 50 hierarchical segmentations (words, phones encoded in IPA) in a Praat-compatible format (*.TextGrid)

Solution: Web Interface Pipeline


* go to http://clarin.phonetik.uni-muenchen.de/BASWebServices

* go to the 'Pipeline' service

* upload Hungarian/Corpus/* (drag&drop all files to the upload area and press 'Upload')

* select the following options (leave all other options at their defaults):

    Pipeline name = G2P_MAUS
    Language = Hungarian (HU)

  Expert Options (click on 'Expert Options' to see these):

    Pre segmentation = true
    Output Symbols = ipa

* confirm 'terms of usage' and click 'Run Web Service'
  The log area should turn green; if it does not, check the ERROR/WARNING messages

* download the resulting *.TextGrid by clicking on "Download as ZIP-File", or

* for direct inspection: click on one of the *.TextGrid result links, and then
  click on the EMU webApp symbol. A new window should open in your browser and
  show the signal and the resulting segmentation/labelling.
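
After downloading the ZIP file, a quick local check can confirm that a
segmentation came back for every recording. This is a minimal sketch under two
assumptions that are not confirmed by the steps above: that each result file
keeps the base name of its recording (rec01.wav giving rec01.TextGrid, both
hypothetical names), and that the ZIP was saved locally under a name of your
choosing. The function name missing_textgrids is made up for this illustration.

```python
import zipfile
from pathlib import Path, PurePosixPath


def missing_textgrids(corpus_dir, zip_path):
    """List recordings in corpus_dir that have no *.TextGrid in the ZIP.

    Assumption: a result file carries the same base name as its recording.
    Folder prefixes inside the ZIP are ignored; only base names are compared.
    """
    wav_stems = {p.stem for p in Path(corpus_dir).glob("*.wav")}
    with zipfile.ZipFile(zip_path) as zf:
        grid_stems = {
            PurePosixPath(name).stem
            for name in zf.namelist()
            if name.lower().endswith(".textgrid")
        }
    return sorted(wav_stems - grid_stems)
```

An empty list from missing_textgrids("Hungarian/Corpus", "results.zip") would
suggest that all 50 recordings were segmented; "results.zip" here stands for
whatever name the downloaded file was saved under.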
