#####################################################################
################### README Subtitle Producer ########################
#####################################################################

This program creates ready-to-use subtitles from alignments of text
to speech.

As input, it requires an original transcription (i.e. a plain text 
file) and a BAS Partitur file with (at least) an orthographic tier 
(ORT) and a MAUS alignment tier (MAU). 

As output, it produces subtitles in SubRip (srt) or SubViewer (sub)
format. Alternatively, it returns a BPF (par) with an additional
tier TRO. This tier contains the whitespace-tokenized transcription,
with each token linked to the ORT tier.

#####################################################################
############################ Parameters #############################
#####################################################################

--bpf <file>

BPF file that contains at least the tiers ORT and MAU as well as the
sampling rate tag SAM. Example:

SAM: 16000
LBD:
ORT: 0 eight
ORT: 1 months
ORT: 2 ago
ORT: 3 mister
ORT: 4 smith
...
MAU:	0	2645	-1	<p:>
...
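
To illustrate how such a file is laid out, the following is a minimal
Python sketch that groups BPF lines by tier. It is not the parser used
by this program; the function name parse_bpf and the variable names
are hypothetical. It assumes the usual BPF layout in which an ORT line
holds a word link and a label, and a MAU line holds start sample,
duration in samples, word link and label.

```python
from collections import defaultdict


def parse_bpf(text):
    """Collect BPF lines by tier key (the part before the first colon)."""
    tiers = defaultdict(list)
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep or not rest.strip():
            continue  # skip elisions ("...") and empty entries such as "LBD:"
        tiers[key.strip()].append(rest.split())
    return tiers


sample = "SAM: 16000\nLBD:\nORT: 0 eight\nORT: 1 months\nMAU:\t0\t2645\t-1\t<p:>\n"
tiers = parse_bpf(sample)

sampling_rate = int(tiers["SAM"][0][0])
words = {int(link): label for link, label in tiers["ORT"]}
# Assumed MAU field order: start sample, duration, word link, label
start, duration, link, label = tiers["MAU"][0]
```

In the example above, the MAU entry carries the link -1, i.e. it is
not attached to any word of the ORT tier.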

#####################################################################

--transcription <file>

Plain text file that contains the original transcription. Unlike the
BPF ORT tier, this transcription may contain punctuation marks,
unnormalized word forms etc.

Example:

8 months ago, Mr. Smith went on holiday
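
The difference between the transcription and the ORT tier can be seen
by tokenizing the example at whitespace and placing the tokens next
to the ORT labels from the --bpf example. This is only a Python
illustration with hypothetical variable names. It pairs the items
one-to-one, which happens to work here; the alignment performed by
the program has to cope with cases where it does not.

```python
transcription = "8 months ago, Mr. Smith went on holiday"
# ORT labels taken from the --bpf example (the tier is elided after "smith")
ort_labels = ["eight", "months", "ago", "mister", "smith"]

# Whitespace tokenization keeps punctuation attached to the token
tokens = transcription.split()

for token, ort in zip(tokens, ort_labels):
    print(f"{token!r} <-> {ort!r}")
```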

#####################################################################

--outfile <file>

Target for the output file (e.g. <my_dir>/<my_stem>.srt)

#####################################################################

--tier [TRO|ORT] (default: TRO)

Tier to use for subtitle generation. Options:
TRO:	Align the transcription with the ORT tier to generate the TRO
	tier, then generate the subtitles from it.
ORT:	Generate the subtitles from the ORT tier and ignore the
	transcription.

--tier interacts with the following options:

--outformat 	If tier==ORT, only srt and sub are supported.
--marker 	If tier==ORT, only 'tag' is supported. This is 
		because newlines and punctuation markers are 
		suppressed in the ORT tier.
--transcription If tier==ORT, the transcription is ignored.
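
The restrictions above can be written down as a small check. The
following Python sketch is hypothetical (the function name
check_tier_options is not part of this program) and only restates the
two rules for --outformat and --marker; how the program itself reacts
to an unsupported combination is not specified here.

```python
def check_tier_options(tier, outformat, marker):
    """Return a list of problems for the given option combination."""
    problems = []
    if tier == "ORT":
        if outformat not in ("srt", "sub"):
            problems.append("--outformat must be srt or sub when --tier is ORT")
        if marker != "tag":
            problems.append("--marker must be tag when --tier is ORT")
    return problems


for message in check_tier_options("ORT", "bpf", "punct"):
    print(message)
```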

#####################################################################

--outformat [srt|sub|bpf|bpf+trn|trn] (default: srt)

Output format. <srt> and <sub> are text-based subtitle formats accepted
by a number of audio and video players. <bpf> returns a BPF file with
an additional TRO tier. Example:

TRO: -1 ...
TRO: 0 a
TRO: 1 few
TRO: 2 months
TRO: 3 ago,
TRO: -1 ...\n

<bpf+trn> returns the same file with an appended TRN tier. Each
segment of the TRN tier corresponds to one subtitle. Example:

TRN: 0,1,2,3 236376 94813 ... a few months ago, ...\n
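
Subtitle formats use clock times, whereas BPF tiers count samples. The
following Python sketch shows one way to convert between the two. It
rests on two assumptions: that the numbers 236376 and 94813 in the TRN
example are start and duration in samples, and that the sampling rate
is the SAM value 16000 from the --bpf example. The function name
srt_timestamp is hypothetical.

```python
def srt_timestamp(samples, sampling_rate):
    """Format a sample offset as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = samples * 1000 // sampling_rate  # floor to whole milliseconds
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


sampling_rate = 16000            # assumed: SAM value from the --bpf example
start, duration = 236376, 94813  # assumed reading of the TRN example

cue = (f"{srt_timestamp(start, sampling_rate)} --> "
       f"{srt_timestamp(start + duration, sampling_rate)}")
print(cue)  # 00:00:14,773 --> 00:00:20,699
```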

#####################################################################

--marker [punct|newline|tag] (default: punct)

Marker used for splitting the transcription into subtitles. This
parameter is ignored if the output format is <bpf>.

<punct>: Transcription is split at periods (.), exclamation marks (!),
question marks (?) and colons (:).

<newline>: Transcription is split at newlines (\n or \r\n)

<tag>: Transcription is split at the tag <SUB_BREAK>. This tag must
be set inside the transcription by the user.
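
The three modes can be sketched in Python as follows. This is a
simplified illustration, not the program's implementation; the
function name split_transcription is hypothetical. In <punct> mode
the sketch assumes that a break occurs wherever one of the four marks
is followed by whitespace, so it also breaks after abbreviations such
as "Mr.".

```python
import re


def split_transcription(text, marker):
    """Split text into subtitle chunks according to the marker mode."""
    if marker == "punct":
        # Break after . ! ? : and keep the mark with the preceding chunk
        parts = re.split(r"(?<=[.!?:])\s+", text)
    elif marker == "newline":
        parts = re.split(r"\r?\n", text)
    elif marker == "tag":
        parts = text.split("<SUB_BREAK>")
    else:
        raise ValueError(f"unknown marker: {marker}")
    # Drop surrounding whitespace and empty chunks
    return [part.strip() for part in parts if part.strip()]


print(split_transcription("Hello there. How are you? Fine!", "punct"))
```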

#####################################################################

--windowsize <INT> (default: 50)

Window size used during alignment. A larger window reduces the risk
of faulty alignments but increases runtime. A value of 100 should be
sufficient in all cases that do NOT have long mismatches between the
transcription and the BPF ORT tier.

#####################################################################

--maxlength <INT> (default: 0)

Maximum subtitle length. If set to zero, subtitle length is 
determined solely by the distance between subtitle split markers 
(see parameter --marker). If set to a value greater than zero, 
subtitles with a length greater than that value are split.
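
The effect of this parameter can be pictured with the following
Python sketch. It assumes that length is counted in tokens and that
an overlong subtitle is cut into consecutive pieces of at most
--maxlength tokens; the program may measure length differently or
choose other split points. The function name split_long is
hypothetical.

```python
def split_long(tokens, maxlength):
    """Cut a token list into pieces of at most maxlength tokens."""
    if maxlength <= 0:
        return [tokens]  # 0 means: no length limit
    return [tokens[i:i + maxlength] for i in range(0, len(tokens), maxlength)]


subtitle = "8 months ago, Mr. Smith went on holiday".split()
for piece in split_long(subtitle, 3):
    print(" ".join(piece))
```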

#####################################################################

--verbose

Set this flag for information on progress.

#####################################################################

--no_clean

Set this flag to keep temporary files after the process is done.

#####################################################################
############################### Test ################################
#####################################################################

Run the test executable in this directory. It will turn the Text-BPF
pairs in the Bench directories into all target formats. Note that
at this point we only check that the process runs without an error,
not that it produces the correct output. Thus, you might want to
check some files manually in Bench/Out.
