1 Objective
2 Preliminaries and starting up R
3 Converting Praat TextGrids
4 Calculating pitch with wrassp
5 Adding the calculated pitch files to the database
6 Displaying the pitch files in the webapp
7 Adding an event tier
8 Labelling some tones
9 Automatically linking event and segment times

1 Objective

The aim is to get from a Praat TextGrid to an Emu database format as exemplified by Fig. 1.1:

Figure 1.1: An utterance fragment in Praat and in Emu

2 Preliminaries and starting up R

The assumption is that you have a project called emu2021 and that it contains the directories testsample and emu_databases, both of which are used below.

If not, please see 1. Preliminaries here

Start up R in the project you are using for this course.

library(tidyverse)

## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──

## ✓ ggplot2 3.3.5     ✓ purrr   0.3.4
## ✓ tibble  3.1.4     ✓ dplyr   1.0.7
## ✓ tidyr   1.1.3     ✓ stringr 1.4.0
## ✓ readr   2.0.1     ✓ forcats 0.5.1

## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

library(emuR)

## 
## Attaching package: 'emuR'

## The following object is masked from 'package:base':
## 
##     norm

library(wrassp)

In R, store the path to the directory testsample as sourceDir in exactly the following way:

sourceDir = "./testsample"

And also store in R the path to emu_databases as targetDir:

targetDir = "./emu_databases"
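Before continuing, it can help to confirm that both directories are visible from the current working directory. The following is a minimal sketch using only base R; the variable dirs_ok is introduced here purely for illustration.

```r
# Paths as defined above, relative to the project directory
sourceDir = "./testsample"
targetDir = "./emu_databases"

# TRUE for each directory that can be found from the working directory
dirs_ok = dir.exists(c(sourceDir, targetDir))
names(dirs_ok) = c(sourceDir, targetDir)
dirs_ok
```

If either value is FALSE, check with getwd() that R was started in the project directory.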

3 Converting Praat TextGrids

The directory /testsample/praat on your computer contains a Praat-style database with .wav files and .TextGrid files.

Define the path to this database in R and check that you can see these files with the list.files() function:

path.praat = file.path(sourceDir, "praat")
list.files(path.praat)

##  [1] "wetter1.TextGrid"  "wetter1.wav"       "wetter10.TextGrid"
##  [4] "wetter10.wav"      "wetter11.TextGrid" "wetter11.wav"     
##  [7] "wetter12.TextGrid" "wetter12.wav"      "wetter13.TextGrid"
## [10] "wetter13.wav"      "wetter14.TextGrid" "wetter14.wav"     
## [13] "wetter15.TextGrid" "wetter15.wav"      "wetter16.TextGrid"
## [16] "wetter16.wav"      "wetter17.TextGrid" "wetter17.wav"     
## [19] "wetter2.TextGrid"  "wetter2.wav"       "wetter3.TextGrid" 
## [22] "wetter3.wav"       "wetter4.TextGrid"  "wetter4.wav"      
## [25] "wetter6.TextGrid"  "wetter6.wav"       "wetter7.TextGrid" 
## [28] "wetter7.wav"
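The conversion relies on each .wav file having a TextGrid with the same base name. A quick check of this pairing can be sketched in base R as follows; unpaired_files() is a hypothetical helper written for this example and is not part of emuR.

```r
# Hypothetical helper (not part of emuR): report base names that lack
# either a .wav or a .TextGrid partner in a directory.
unpaired_files = function(dir) {
  wav = tools::file_path_sans_ext(list.files(dir, pattern = "\\.wav$"))
  tg  = tools::file_path_sans_ext(list.files(dir, pattern = "\\.TextGrid$"))
  list(wav_without_textgrid = setdiff(wav, tg),
       textgrid_without_wav = setdiff(tg, wav))
}

# Same path as path.praat above
path.praat = file.path("./testsample", "praat")
unpaired_files(path.praat)
```

Two empty character vectors indicate that every file has its partner.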

The emuR function for converting the TextGridCollection to an Emu database and then storing the latter in targetDir (defined above) is convert_TextGridCollection(). It works like this:

# only execute once!
convert_TextGridCollection(path.praat, 
                           dbName = "praat",
                           targetDir = targetDir)
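Since the conversion should only be run once, one option is to guard the call so that it is skipped when the database directory already exists. This is a sketch under the assumption that the converted database is stored in a directory named praat_emuDB inside targetDir (as in the load_emuDB() call below); the variables db_path and needs_conversion are introduced for illustration.

```r
sourceDir = "./testsample"
targetDir = "./emu_databases"
path.praat = file.path(sourceDir, "praat")

# Assumed location of the converted database: <targetDir>/<dbName>_emuDB
db_path = file.path(targetDir, "praat_emuDB")

# Convert only when the source exists and the target database does not
needs_conversion = dir.exists(path.praat) && !dir.exists(db_path)
if (needs_conversion) {
  emuR::convert_TextGridCollection(path.praat,
                                   dbName = "praat",
                                   targetDir = targetDir)
}
```

With this guard the script can be re-run without attempting a second conversion.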

The converted Praat database can now be loaded:

praat_DB = load_emuDB(file.path(targetDir, "praat_emuDB"))

## INFO: Checking if cache needs update for 1 sessions and 14 bundles ...
## INFO: Performing precheck and calculating checksums (== MD5 sums) for _annot.json files ...
## INFO: Nothing to update!

and its properties examined as before:

summary(praat_DB)

And it can of course be viewed:

serve(praat_DB, useViewer = F)

4 Calculating pitch with wrassp

The task is to calculate the pitch from each of the utterance’s waveforms for the praat_DB database created above. First, find the full path names of all of the .wav files. They are here:

# match only files ending in ".wav" and return their full paths
praat_wav_paths = list.files(path.praat, pattern = "\\.wav$", recursive = TRUE, full.names = TRUE)
praat_wav_paths

##  [1] "./testsample/praat/wetter1.wav"  "./testsample/praat/wetter10.wav"
##  [3] "./testsample/praat/wetter11.wav" "./testsample/praat/wetter12.wav"
##  [5] "./testsample/praat/wetter13.wav" "./testsample/praat/wetter14.wav"
##  [7] "./testsample/praat/wetter15.wav" "./testsample/praat/wetter16.wav"
##  [9] "./testsample/praat/wetter17.wav" "./testsample/praat/wetter2.wav" 
## [11] "./testsample/praat/wetter3.wav"  "./testsample/praat/wetter4.wav" 
## [13] "./testsample/praat/wetter6.wav"  "./testsample/praat/wetter7.wav"

The signal processing package wrassp will now be used to calculate the pitch for each of these .wav files. To see the full range of signal processing routines available, enter:

?wrassp

Two routines are available for calculating pitch: ksvF0 and mhsF0.

Here’s how to use mhsF0 with the default settings. The output is going to be stored in path.praat (i.e. in /testsample/praat on your computer).

# only execute once!
mhsF0(praat_wav_paths, outputDirectory = path.praat)

The pitch files (with the extension .pit) should now all have been written to path.praat, i.e. to /testsample/praat.
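Whether a pitch file was produced for every .wav file can also be checked from R. The sketch below assumes that each output file has the same base name as its .wav file with the extension pit; missing_pitch_files() is a hypothetical helper and not part of wrassp.

```r
# Hypothetical helper (not part of wrassp): return the .pit files that would
# be expected next to the given .wav files but do not exist on disk.
missing_pitch_files = function(wav_paths) {
  expected = sub("\\.wav$", ".pit", wav_paths)
  expected[!file.exists(expected)]
}

# Same file list as praat_wav_paths above
praat_wav_paths = list.files(file.path("./testsample", "praat"),
                             pattern = "\\.wav$", full.names = TRUE)
missing_pitch_files(praat_wav_paths)
```

An empty character vector means that no expected pitch file is missing.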


5 Adding the calculated pitch files to the database

These calculated pitch files now need to be added to praat_DB. This is done with the add_files() function. The parameter targetSessionName can be omitted in this case, because all of the bundles are stored in the session directory 0000. This can be verified with:

list_bundles(praat_DB)

## # A tibble: 14 × 2
##    session name    
##    <chr>   <chr>   
##  1 0000    wetter1 
##  2 0000    wetter10
##  3 0000    wetter11
##  4 0000    wetter12
##  5 0000    wetter13
##  6 0000    wetter14
##  7 0000    wetter15
##  8 0000    wetter16
##  9 0000    wetter17
## 10 0000    wetter2 
## 11 0000    wetter3 
## 12 0000    wetter4 
## 13 0000    wetter6 
## 14 0000    wetter7

Now add the pitch files to praat_DB:

# only execute once!
add_files(praat_DB, 
          dir = path.praat, 
          fileExtension = "pit", 
          targetSessionName = "0000")

Once the files have been added, the corresponding track needs to be defined. The information required is:

a track name. This can be anything and it is needed when referring to these signal files in R.
the file extension. This is pit as already established above.
the columnName. This is the name of the column in the .pit files in which the fundamental frequency data is stored. This type of information (as well as information about the extension) is given by wrasspOutputInfos. In this case, append $mhsF0 since this was the name of the signal processing routine that has been used to calculate the pitch data:

wrasspOutputInfos$mhsF0

## $ext
## [1] "pit"
## 
## $tracks
## [1] "pitch"
## 
## $outputType
## [1] "SSFF"

The column name is given by $tracks which in this case is pitch. Putting all this together, and using "pitch" for the name of the track gives:

# only execute once!
add_ssffTrackDefinition(praat_DB,
                        name = "pitch",
                        columnName = "pitch",
                        fileExtension = "pit")

summary(praat_DB)
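As an alternative to reading through the summary, the track definition can be checked directly. The following is a sketch that assumes the database was written to ./emu_databases/praat_emuDB as above; it does nothing if that directory is not found. The variables db_path and track_defs are introduced for illustration.

```r
db_path = file.path("./emu_databases", "praat_emuDB")

# Only runs if the converted database is present on disk
if (dir.exists(db_path)) {
  praat_DB = emuR::load_emuDB(db_path, verbose = FALSE)
  track_defs = emuR::list_ssffTrackDefinitions(praat_DB)
  print(track_defs)
  # TRUE if a track called "pitch" with the file extension "pit" is defined
  print(any(track_defs$name == "pitch" & track_defs$fileExtension == "pit"))
}
```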

6 Displaying the pitch files in the webapp

The signals that are currently displayed for this praat_DB database can be seen with the function get_signalCanvasesOrder() as follows:

get_signalCanvasesOrder(praat_DB, perspectiveName = "default")

## [1] "OSCI" "SPEC"

which confirms that what is seen when viewing the database with the serve() function is the waveform (OSCI) and the spectrogram. The pitch data created above now needs to be added using the function set_signalCanvasesOrder. The second argument should always be "default", thus:

set_signalCanvasesOrder(praat_DB, perspectiveName = "default",
                        order = c("OSCI", "SPEC",  "pitch"))
serve(praat_DB, useViewer = F)
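If this step might be repeated, the new order can be built so that the track is appended only when it is not already listed. This is a small sketch in base R; add_canvas() is a hypothetical helper, and the call to set_signalCanvasesOrder() is only made if an object called praat_DB exists in the session.

```r
# Hypothetical helper: append a track name to the canvas order unless
# it is already present (union() keeps the existing order).
add_canvas = function(current_order, track) {
  union(current_order, track)
}

# Order as returned by get_signalCanvasesOrder() above
new_order = add_canvas(c("OSCI", "SPEC"), "pitch")
new_order

if (exists("praat_DB")) {
  emuR::set_signalCanvasesOrder(praat_DB,
                                perspectiveName = "default",
                                order = new_order)
}
```

Calling add_canvas() again on the result leaves the order unchanged, so "pitch" is never listed twice.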

7 Adding an event tier

The next task is to add an event tier that can be used for labelling tones. Here the tier is called “Tone”. So far, the only existing time tier is ORT as confirmed by:

list_levelDefinitions(praat_DB)

In order to add a new tier called Tone as an EVENT tier:

# only execute once!
add_levelDefinition(praat_DB, "Tone", "EVENT")

Display Tone so that it is above the ORT tier and so directly underneath the signals:

get_levelCanvasesOrder(praat_DB, perspectiveName = "default")

set_levelCanvasesOrder(praat_DB, 
                       perspectiveName = "default", 
                       order = c("Tone", "ORT"))

8 Labelling some tones

Add two H* tone labels at the pitch peaks of morgens and ruhig in wetter1, as in Fig. 1.1, and save the result.

serve(praat_DB, useViewer=F)

The tones are to be linked to the words within which they occur in time. To do this, define a hierarchical relationship such that ORT dominates Tone:

list_linkDefinitions(praat_DB)

# only execute once!
add_linkDefinition(praat_DB, 
                   type = "ONE_TO_MANY", 
                   superlevelName = "ORT", 
                   sublevelName = "Tone")

list_linkDefinitions(praat_DB)

## NULL

Inspect the hierarchy:

summary(praat_DB)

# switch to hierarchy view
serve(praat_DB, useViewer = F)

9 Automatically linking event and segment times

This makes use of the autobuild_linkFromTimes function in order to link the tones to the corresponding words:

# only execute once!
autobuild_linkFromTimes(praat_DB,
                        superlevelName = "ORT",
                        sublevelName = "Tone")

# switch to hierarchy view
serve(praat_DB, useViewer = F)
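The links can also be inspected from R rather than in the hierarchy view. The sketch below assumes that the Tone tier, the link definition and the tone labels from the preceding sections are in place, and that the database is stored in ./emu_databases/praat_emuDB; it queries all tones and then finds the words that dominate them. The variables db_path, tones and words are introduced for illustration.

```r
db_path = file.path("./emu_databases", "praat_emuDB")

# Only runs if the converted database is present on disk
if (dir.exists(db_path)) {
  praat_DB = emuR::load_emuDB(db_path, verbose = FALSE)
  has_tone = "Tone" %in% emuR::list_levelDefinitions(praat_DB)$name
  has_link = !is.null(emuR::list_linkDefinitions(praat_DB))
  if (has_tone && has_link) {
    # All labelled events on the Tone tier
    tones = emuR::query(praat_DB, "Tone =~ .*")
    print(tones)
    if (nrow(tones) > 0) {
      # The ORT items (words) linked to those tones
      words = emuR::requery_hier(praat_DB, tones, level = "ORT")
      print(words)
    }
  }
}
```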

Converting a Praat TextGrid collection

Jonathan Harrington

WiSe 2021
