Decoding-graph creation recipe (test time)
Here we explain our standard graph-creation approach step by step, together with the data-preparation
stages related to it.
Most of the details of this approach are not hardcoded into our tools; we are simply explaining how it is
currently done. If this section is confusing, the best remedy is probably to read "Speech
Recognition with Weighted Finite-State Transducers" by Mohri et al. Be warned: that paper is quite
long, and reading it will take at least a few hours for those not already familiar with FSTs. Another good
resource is the OpenFst website, which provides more context on topics such as symbol tables.
Preparing the initial symbol tables
We need to prepare the OpenFst symbol tables words.txt and phones.txt. These assign integer IDs to all
the words and phones in our system. Note that OpenFst reserves symbol id zero for epsilon. The symbol
tables for the WSJ task, for example, are plain text files with one symbol and its integer id per line.
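As a sketch of the format (the entries below are made up for illustration; the real WSJ words.txt is many thousands of lines), a words.txt in OpenFst text symbol-table format might begin like this:

```shell
# Illustrative words.txt in OpenFst text symbol-table format:
# one "symbol integer-id" pair per line, with id 0 reserved for epsilon.
# These entries are hypothetical; the real WSJ table is much larger,
# with "#0" as the last-numbered entry.
cat > words.txt <<'EOF'
<eps> 0
A 1
AARON 2
ABANDON 3
#0 4
EOF
head -n 1 words.txt
```

The same two-column format is used for phones.txt, again with `<eps>` as symbol zero.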
The words.txt file contains a single disambiguation symbol, "#0" (used for the epsilon on the input of
G.fst); it is the last-numbered word in our recipe. Be careful with this if your lexicon happens to contain
a word "#0". The phones.txt file does not contain disambiguation symbols, but after creating L.fst we
will create a file phones_disambig.txt that includes the disambiguation symbols (this is mainly useful
for debugging).
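The relationship between the two phone tables can be sketched as follows (the phone set and ids here are hypothetical, not the actual WSJ tables): phones_disambig.txt is simply phones.txt with the #N disambiguation symbols appended, numbered after the real phones.

```shell
# Illustrative phones.txt (made-up phones and ids; id 0 is epsilon):
cat > phones.txt <<'EOF'
<eps> 0
AA 1
AE 2
EOF

# phones_disambig.txt adds the disambiguation symbols #0, #1, ...
# after the real phones, continuing the numbering:
cp phones.txt phones_disambig.txt
cat >> phones_disambig.txt <<'EOF'
#0 3
#1 4
EOF
tail -n 2 phones_disambig.txt
```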
Preparing the lexicon L
First we create a lexicon in text format, initially without disambiguation symbols; each line contains a
word followed by its pronunciation as a sequence of phones. Our C++ tools never interact with this file;
it is used only by a script that creates the lexicon FST. Our WSJ lexicon is in this format.
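For illustration (these entries are made up in the spirit of the WSJ lexicon, not copied from it), the text lexicon might contain lines like the following; note that a word with multiple pronunciations simply appears on multiple lines:

```shell
# Illustrative lexicon.txt: one pronunciation per line,
# "word phone1 phone2 ..." (hypothetical entries).
cat > lexicon.txt <<'EOF'
A AH
A EY
ABANDON AH B AE N D AH N
EOF
grep -c '^A ' lexicon.txt   # prints 2: two pronunciations for "A"
```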