EST provides classes for most of the commonly used speech data types. In addition, there is a set of classes for complex linguistic data structures.
This class is meant to take all the hassles out of waveforms...
Features include:
There are two constructors: EST_Wave() is the default constructor, and EST_Wave(const EST_Wave &a) is the copy constructor.
a(int i, int c)
Returns the amplitude of the waveform at point i, in channel c. If only one argument is given it is assumed that the first channel is the one required. As this function returns a reference, it can be used as an lvalue in an expression, e.g.
sig.a(i) = 42;
Sets value i to be 42.
short operator()(int i, int c) const
This use of the overloaded function operator returns the amplitude of the waveform at point i, in channel c. If only one argument is given it is assumed that the first channel is the one required. This is defined as a const function and thus can't be used as an lvalue.
short *data()
returns a pointer to the actual data.
void set_data(short *d,int num_samples,int sr,int nc)
When waveforms are generated by some other mechanism, this allows data to be taken into a wave class without copying. The data *d will be freed when the wave is destructed.
void set_data_copy(short *d,int num_samples,int sr,int nc)
As with the above, but the data is copied first; thus it is the caller's responsibility to free the data in *d.
The EST_Wave class has a considerable file i/o library and can read and write nist, ulaw, ESPS sd, CSTR vox, Sun/Next snd, Microsoft riff, Apple aiff and unheadered (raw) files.
load(const String &name, int offset=0, int length =0)
Load the file called "name", performing automatic file type identification.
load_file(const String &name,file_format_t filetype, int sample_rate, sample_type_t sample_type, int bo, int nc, int offset, int length)
Load the file called "name" as a file of type filetype.
save(const String &name, const String &filetype = "")
save_file(const String &name,file_format_t filetype, sample_type_t sample_type, int bo, int offset, int length)
resize(int n)
Resize the waveform so that it will hold up to n points. Any existing values in the waveform up to this point will be preserved and the remainder (if any) will be set to 0.
resample(int rate)
Resample (and perform appropriate filtering) to make the waveform have
sample rate "rate".
rescale(float gain, int normalize=0)
Amplify or attenuate the waveform by a factor of gain. A value of 1.0 will leave the waveform the same size. If normalize is 1, the signal is maximized first; in this case the gain should be less than or equal to 1.0.
clear()
Clear the waveform of all values, set size to 0.
num_samples()
Number of sample points (size) in wave.
int num_channels()
Number of channels
int sample_rate()
Sample rate in Hz
float t(int i)
Time position of sample i.
void set_num_samples(const int n)
Set number of samples to n.
void set_num_channels(const int n)
Set number of channels to n.
sample_type_t sample_type()
Returns sample type (usually short). This function returns an enum constant, found in EST_wave_utils.h. This can be converted to a string by calling sample_type_to_str(...).
file_format_t file_type()
Returns type of file - nist, snd etc. This function returns an enum constant, found in EST_wave_io.h. This can be converted to a string by calling file_format_to_str(...).
set_sample_rate(const int n)
Set sample rate to n.
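The behaviour described for rescale() above can be pictured with a small stand-alone sketch. It does not use EST_Wave: rescale_samples is a hypothetical helper working on a plain vector of 16-bit samples, and two details are assumptions rather than documented behaviour, namely that "maximized" means scaling the peak to full 16-bit range and that overflowing samples are clipped.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <vector>

// Hypothetical stand-in for EST_Wave::rescale(); not the library code.
void rescale_samples(std::vector<short> &s, float gain, bool normalize = false)
{
    float factor = gain;
    if (normalize)
    {
        int peak = 0;
        for (short v : s)
            peak = std::max(peak, std::abs(static_cast<int>(v)));
        if (peak == 0)
            return;                        // silent signal: nothing to scale
        // Assumption: "maximized" means the peak is moved to full scale.
        factor = gain * 32767.0f / static_cast<float>(peak);
    }
    for (short &v : s)
    {
        float scaled = static_cast<float>(v) * factor;
        // Assumption: clip to the 16-bit range instead of wrapping around.
        scaled = std::min(32767.0f, std::max(-32768.0f, scaled));
        v = static_cast<short>(scaled);
    }
}
```

With normalize set, a gain above 1.0 would push the peak past full scale, which is why the text recommends a gain of at most 1.0 in that mode.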
Some other useful (non-member) functions for the wave class are described below.
wave_extract_channel(EST_Wave &single, const EST_Wave &multi, int c)
Extracts a single channel, given by c, from waveform multi and puts it in single.
wave_combine_channels(EST_Wave &combined, const EST_Wave &multi)
Combines all the channels in multi into a single channel waveform, combined.
wave_sub_wave(EST_Wave &subsig,EST_Wave &sig,int offset,int length)
Extracts a portion of the original, sig, and puts it in subsig. The new portion starts at sample offset and is length samples in duration.
rateconv(short *in,int insize, short **out, int *outsize, int in_samp_freq, int out_samp_freq)
Basic low level function that performs sample rate conversion. The resample(...) member function should normally be used instead of this.
read_wave(EST_Wave &sig, const String &in_file, Option &al)
This serves as a wrapper function around EST_Wave.load(). In the case of raw data, it checks to see what sampling rate has been specified and returns an error if this has been omitted.
wave_info(EST_Wave &w)
Prints out basic information about the waveform.
The Track class is intended for representing data as a function of time. Possible examples include F0 contours, cepstra and LPC parameters. Features include:
File i/o support for various file formats including ESPS FEA, htk, xmg, ascii, snns (Stuttgart neural net software) and xgraph.
Different algorithms require data to be in different file formats. For instance, some want the points evenly spaced, while others require a set of x/y values. The Track class provides the means to change easily from one format to another.
Breaks can be placed in the Track. A break at a point indicates that there is no value at that particular point. This can be used to indicate unvoiced regions in F0 contours and formants, for instance.
By default, Tracks have a single channel, but any number of channels can be used. The channels all share the same timing and break information.
Some algorithms want their data in milli-second spacing, others in second spacing. By default all times in EST are in seconds, but functions exist which automatically provide a milli-second equivalent.
Consult the header file `include/EST_Track.h' for a full list of member functions.
The data is stored in three arrays: time, amplitude and value. The amplitude array stores the actual values, while the time array stores the times at which those values occur. All times are in seconds. In many applications, the points on the time axis are fixed and thus this is somewhat redundant. However, some applications (such as pitch synchronous marking) require the points to be at uneven intervals and this is the main use of the time array.
The amplitude array is two dimensional, with one axis for frames and the other for channels. By default, it is assumed that the Track has a single channel.
Often a data type has missing values. For instance in an F0 contour, there are no F0 values during unvoiced regions of speech. The value array is used to store this information. A value of 1 indicates that the track has a proper value at that point, a value of 0 indicates that there is no proper value. These 0 values are referred to as breaks.
Different algorithms require the data to be in different formats, and a general function change_type() is used for this. Two major variations are provided. The first converts the time axis to fixed interval spacing and the second allows the option of having as many breaks as there are time points, or only one break between stretches of contour.
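The three parallel arrays can be sketched in a few lines of stand-alone C++. MiniTrack below is a hypothetical miniature, not the EST class; the member names and the choice of returning -1 when no non-break frame exists are assumptions made for the illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical miniature of the three parallel arrays; not the EST class.
struct MiniTrack
{
    std::vector<float> time;                 // time of each frame, in seconds
    std::vector<std::vector<float> > amp;    // amp[frame][channel]
    std::vector<int> value;                  // 1 = proper value, 0 = break

    // Append a single-channel frame.
    void add_frame(float t, float a, bool is_value)
    {
        time.push_back(t);
        amp.push_back(std::vector<float>(1, a));
        value.push_back(is_value ? 1 : 0);
    }

    // Index of the next non-break frame after i, or -1 (assumed convention).
    int next_non_break(int i) const
    {
        for (int j = i + 1; j < static_cast<int>(value.size()); ++j)
            if (value[j] == 1)
                return j;
        return -1;
    }

    // Index of the nearest non-break frame before i, or -1 (assumed convention).
    int prev_non_break(int i) const
    {
        for (int j = i - 1; j >= 0; --j)
            if (value[j] == 1)
                return j;
        return -1;
    }
};
```

An F0 contour with an unvoiced stretch would then hold value flags such as 1 0 0 1, with the time array still recording where each frame sits.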
There are four constructor functions for the Track class:
Track()
default constructor.
Track(const String &name)
read a track from a file.
Track(const Track &a);
Copy constructor.
Track(int n)
Initialise an empty track of size n.
void resize(int num_frames, int num_channels)
Change the size of the track to hold num_frames frames and num_channels channels. If num_frames is greater than the current size, all values are preserved and the extra frames and channels are padded with zeros. If num_frames is smaller, the values up to that point are preserved.
clear()
remove all values from the contour.
empty()
returns 0 if there are no values in the contour at all.
t(int i)
return a float representing the time at position i in seconds. This function returns a reference, which means that it can be used for assignment, e.g. track.t(i) = 2.5;
ms_t(int i)
return a float representing the time at position i in milli-seconds. This function returns a reference, which means that it can be used for assignment, e.g. track.ms_t(i) = 100.0;
a(int i, int c=0)
return a float representing the amplitude at position i, channel c. This function returns a reference as above, implying that it can be used as an lvalue in an expression. This is the recommended way of writing a value in a track. If no second argument is given, values in the first (zeroth) channel are returned.
a(float t, int c=0)
return a float representing the amplitude nearest time t, given in seconds. This function returns a reference as above, implying that it can be used as an lvalue in an expression. If no second argument is given, values in the first (zeroth) channel are returned.
operator()(int i)
return a float representing the amplitude at position i. This function returns a const reference, and cannot be used as an lvalue. This is the recommended way of reading a value in a track.
operator()(float t)
return a float representing the amplitude nearest time t, given in seconds. This function returns a const reference as above, and hence cannot be used as an lvalue.
it(int i)
return an int representing the time at position i.
ia(int i)
return an int representing the amplitude at position i.
set_break(int i)
set a break point at position i.
set_value(int i)
set a non-break point at position i.
val(int i)
is position i a value?
track_break(int i)
is position i a break?
prev_non_break(int i)
find the first value before i which
is not a break.
next_non_break(int i)
find the next value after i which is
not a break.
num_frames()
number of frames in track
set_num_frames(int n)
Set the number of frames in the track.
fill_time(float t, int start =1)
Fill time axis at t intervals
sample(float shift)
A contour can be re-sampled at any time interval. The resultant contour is the same length in time, but with a different number of points, now spaced at interval shift.
change_type(float nshift, const String break_type)
Change the type of the contour. nshift is used to resample the contour, which automatically converts it to fixed time intervals. Set this to 0.0 if no resampling is to be done. break_type can either be "MANY", indicating that lots of break values can occur in a row, or "SINGLE", indicating that only one break value is present between sections of real contour.
shift()
finds current time intervals. Returns an error if called on a type of contour which is not "FIXED".
start()
return time value of first non-break point.
end()
return time value of last non-break point.
void add_trailing_breaks()
Fills up the start and end of the contour with break values.
void rm_trailing_breaks()
Removes breaks from the beginning and end of the contour.
void rm_excess_breaks()
Reduces all breaks between values to a single break.
void set_space_type(const String &t)
If this is set to FIXED, the track is specified as a set of fixed interval frames. If set to VARI, the position of the points is variable and can only be determined from the t() function.
void set_break_type(const String &t)
Set break type to be SINGLE or MANY. In SINGLE mode there is a single break between each set of values; in MANY mode, there is a break for every time slot (frame) between sets of values.
void set_contour_type(const String &t)
Each contour has a name associated with it. This function is used to set
it.
String get_space_type()
Return a string, either "FIXED" or "VARI".
String get_break_type()
Return a string, either "MANY" or "SINGLE".
String get_contour_type()
Return name of contour.
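The break-handling functions listed above (rm_excess_breaks() and rm_trailing_breaks()) can be illustrated with a stand-alone sketch. Frame is a hypothetical record type, the free functions only mirror the names of the member functions, and keeping the first break of each run is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical frame record: time, amplitude and a break flag.
struct Frame
{
    float t;
    float a;
    bool is_break;
};

// Collapse every run of consecutive breaks to one break ("SINGLE" style).
// Assumption: the first break of each run is the one that is kept.
std::vector<Frame> rm_excess_breaks(const std::vector<Frame> &in)
{
    std::vector<Frame> out;
    for (const Frame &f : in)
    {
        if (f.is_break && !out.empty() && out.back().is_break)
            continue;               // this gap already has its break
        out.push_back(f);
    }
    return out;
}

// Drop the breaks at the very start and very end of the contour.
std::vector<Frame> rm_trailing_breaks(const std::vector<Frame> &in)
{
    std::size_t first = 0;
    std::size_t last = in.size();
    while (first < last && in[first].is_break)
        ++first;
    while (last > first && in[last - 1].is_break)
        --last;
    return std::vector<Frame>(in.begin() + first, in.begin() + last);
}
```

Applying both to a contour laid out as break, value, break, break, value, break leaves value, break, value.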
ref
For future use with stream based architecture.
sr
Sampling rate, sometimes it is convenient to keep this.
error
A flag indicating an error has occurred in a member function.
color
Color of contour.
amin
Minimum amplitude of track. Particularly useful for guiding plotting routines.
amax
Maximum amplitude of track.
A Stream_Item object contains information relating to a single label. The principal fields are name, relating to the identity of the unit (e.g. phoneme name), and end which is the end point in seconds.
There are a number of secondary fields which are used to store extra information and to allow items in one stream to be linked to items in another.
Consult `include/EST_Stream.h' for definitions and member functions.
Obviously one often needs to store more information about an item than simply its name and end point. There are various mechanisms within the EST_Stream_Item class which make provision for this.
The simplest method is to store the additional information using the fields mechanism, which allows a single EST_String to be tagged to the EST_Stream_Item. The set_field_names() function can be used to set this and the fields() function is used to read it.
If the literal flag is set when the EST_Stream.load() function is called on an xlabel file, any information following the separator (";" by default) after the label name will be placed in the fields variable.
Because you may wish to associate arbitrary information with a Stream Item, we provide a method to store arbitrary data in a Stream Item.
The member function set_contents() takes a void * pointer to data and a pointer to a garbage collection function that will delete the contents appropriately if called with the contents. Reference counts are used to keep track of users of the contents, and the garbage collection function is called only when the last instantiation of an EST_Stream_Item referencing the contents is deleted.
For example, suppose we wished to associate an EST_Wave with a stream item. We can define a gc function as
static void gc_wave(void *w) { delete (EST_Wave *)w; }
The following code will set the contents to a given wave
EST_Wave *w;
EST_Stream_Item i;
w = new EST_Wave(wave);
i.set_contents(w,gc_wave);
Note that the given contents must not be deleted by any other destructor so here we copy the wave into a new wave before setting the contents.
The member function contents() returns a void * pointer to the data, which should be cast to whatever structure or class is in the contents.
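The contents mechanism (an untyped pointer, a garbage collection function, and a reference count shared between copies of an item) can be mimicked in stand-alone C++. The sketch below is an analogy, not the EST implementation: MiniItem and FakeWave are hypothetical, and std::shared_ptr with a custom deleter is used here to supply the reference counting.

```cpp
#include <cassert>
#include <memory>

static int waves_deleted = 0;          // lets the example observe the gc call

struct FakeWave { int num_samples; };  // hypothetical stand-in for EST_Wave

// Garbage collection function with the void (*)(void *) shape described above.
static void gc_wave(void *w)
{
    delete static_cast<FakeWave *>(w);
    ++waves_deleted;
}

// Hypothetical miniature of a stream item: copies share one contents pointer.
class MiniItem
{
    std::shared_ptr<void> contents_;   // shared_ptr supplies the reference count
public:
    void set_contents(void *c, void (*gc)(void *))
    {
        // gc is called once, when the last item sharing c is destroyed.
        contents_ = std::shared_ptr<void>(c, gc);
    }
    void *contents() const { return contents_.get(); }
};
```

Copying a MiniItem copies only the pointer and bumps the count, so the wave is deleted exactly once, after every copy has gone out of scope. The caller still has to cast the result of contents() back to the stored type.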
Note the reason we offer this, rather than depend on some inherited class or name fields of specific classes in the stream item, is that we wish to allow arbitrary classes and structures to be held as contents without having to recompile the speech tools.
To allow a more general feature mechanism for stream items, a list of names and feature values may be associated with an item. Feature names are strings while values may be integers, floats or strings. The class EST_Val is a general class that will assign and convert between ints, floats and strings as required.
The EST_Stream class is used to represent lists of linguistic objects, such as phones, syllables or words. It is based on a speech synthesis architecture developed by Paul Taylor and Alan Black at ATR (black94). This is a C++ implementation.
An EST_Stream object is essentially a list of EST_Stream_Items, with some additional functions for file i/o etc.
It has two additional fields, stream_name and pos_name. stream_name is used for storing the type of the stream, e.g. "phoneme" or "syllable". pos_name is used for storing the name of a single special label. The reason for this is rather obscure but is useful when dealing with files where the labels represent binary features.
The header file `include/EST_Stream.h' contains the full class definition.
The easiest way to access the information in an EST_Stream is to use a for loop iteration idiom similar to that used in the EST_TList class.
Declare an iteration variable as a pointer to EST_Stream_Item:
EST_Stream s;
EST_Stream_Item *p;
for (p = s.head(); p != 0; p = next(p))
    cout << p->name();
Note that unlike the EST_TList class, there is no data encapsulation and that access is provided directly through the pointer. This may be changed in later versions.
At present the Utterance class consists of a list of streams, some accessing functions and a load and save function.
Any number of streams can be accommodated within an Utterance object. They must be initialised before they can be used, and this is done with the create_stream() member function. This function takes the name of the stream to be created as an argument. Internally the streams are kept as a list.
The header file `include/EST_Utterance.h' contains the full class definition.
The following member functions also exist:
stream(EST_String s)
searches through the stream list and returns a reference to the stream of name "s".
item(EST_String s, int a)
searches through the stream list and returns a pointer to the unit in s whose address is "a".
ritem(EST_String s, int a)
searches through the stream list and returns a reference to the unit in s whose address is "a".
load(EST_String filename)
loads an Utterance structure from file.
save(EST_String filename)
saves an Utterance structure to file.
The << operator prints the structure.
Utterance files are a slight variation on ESPS/xwaves label files. An example is given below.
separator ;
nfields 1
# Word
1.6 26 example; a3; Syllable 3 4 5 ; Phoneme 7 8 9 10 11 12 13 14
# Syllable
1.0 26 S; a3; Word 3 ; Phoneme 7 8 9
1.3 26 S; a4; Word 3 ; Phoneme 10 11
1.5 26 S; a5; Word 3 ; Phoneme 12 13 14
# Phoneme
0.8 26 e; a7; Syllable 3 ; Word 3
0.9 26 g; a8; Syllable 3 ; Word 3
1.0 26 z; a9; Syllable 3 ; Word 3
1.1 26 a; a10; Syllable 4 ; Word 3
1.2 26 m; a11; Syllable 4 ; Word 3
1.3 26 p; a12; Syllable 5 ; Word 3
1.4 26 e; a13; Syllable 5 ; Word 3
1.5 26 l; a14; Syllable 5 ; Word 3
Every time a "#" occurs, a new stream is started, named by the string after the "#". Each unit is on a separate line. The first field is its end point in seconds. The second field is its colour and is irrelevant (all queries to Entropic about this!). The third field is the name of the unit. After the name is a semi-colon indicating a separate xlabel display field. The first field after the semi-colon is a number indicating the unit's address. These need not be sequential or start at zero - they only need be unique in the stream. The next field is a relations field. Each relations field starts with the name of the stream it is referring to and has a list of the addresses that the unit is linked to. There can be arbitrarily many relations, each consisting of a semi-colon to start, a name and a list of addresses.
The EST_Ngrammar class provides N-gram language models of various types. There is no built-in limit on N, rather it is limited by how much memory you have available. Vocabulary items are internally indexed by ints, which allows large vocabularies. Executable programs for building and testing N-gram language models are described above.
The first N-1 items (predictors) in the N-gram are used to predict the probability distribution of the N'th item (predictee). By default, predictor(s) and predictee are taken from the same vocabulary, but this need not be the case (except for backed-off models).
The class constructors take as one of their arguments the internal representation to be used. This is of type EST_Ngrammar::representation_t, and is one of the following:
Dense
The most basic representation, where all possible N-grams are stored. Internally, this is a simple array of values, indexed by a value computed from the N-gram predictor/predictee indices.
Sparse
A more efficient alternative to the dense representation, where only N-grams with non-zero frequencies are stored. Additional overheads mean that the sparse representation is only more efficient if enough N-grams have zero frequencies. The tradeoff point is left to the user to determine. Internally, the representation is a tree whose root is the first item in the N-gram. At this time, the sparse representation is not fully working.
Backoff
This form is used for backed-off N-gram models (katz87). Internally, the representation is a tree whose root is the most recent item in the N-gram. Backoff weights are stored in the same tree. This representation limits the predictor and predictee vocabularies to be the same.
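One way to compute an array slot from the word indices of an N-gram, as the dense representation is said to do, is a mixed-radix (row-major) index. The sketch below is an illustration only: dense_index and dense_table_size are hypothetical helpers, and a single shared vocabulary is assumed. The actual formula used inside EST_Ngrammar is not given in this document.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Map an N-gram, given as vocabulary indices, to a slot in a flat array.
// Assumption: one shared vocabulary of size vocab_size, row-major order.
std::size_t dense_index(const std::vector<int> &ngram, std::size_t vocab_size)
{
    std::size_t index = 0;
    for (int w : ngram)
    {
        assert(w >= 0 && static_cast<std::size_t>(w) < vocab_size);
        index = index * vocab_size + static_cast<std::size_t>(w);
    }
    return index;
}

// A dense table needs vocab_size^order slots, whether or not they are used.
std::size_t dense_table_size(std::size_t vocab_size, std::size_t order)
{
    std::size_t n = 1;
    for (std::size_t i = 0; i < order; ++i)
        n *= vocab_size;
    return n;
}
```

The second function shows why the dense form becomes expensive: a 1000-word vocabulary needs a million slots for bigrams and a thousand million for trigrams, however few N-grams actually occur.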
Threshold
The threshold for including an N-gram (its minimum frequency) is settable, but is the same for all orders.
Discounting methods
At this time, only an ad-hoc Good-Turing based method of computing discounts is available. This method involves computing a Good-Turing smoothed frequency-of-frequencies distribution. The discount for each frequency is then given by the difference between the smoothed and unsmoothed values, with the zeroton (frequency = 0) frequency being taken directly from the smoothed F-of-F distribution. Limits on the maximum frequency for Good-Turing smoothing, and a limit for discounting can be set. Alternative methods from the literature will be added in future releases.
Limitations
Unfortunately, backed-off Ngrammars can only be saved as such in ARPA format files (see section Ngrammar file formats) - the only standard "defined" (we use the term loosely) - but ARPA files cannot, at this time, be read. Saving in CSTR format involves expansion to a dense representation. See section Ngrammar data formats.
The member function accumulate is used to build N-gram models from data, but the higher level member function build is more useful.
To deal with start and end of data/sentence, the tags 'prev_prev', 'prev' and 'next' can be given. These are used to fill up the sliding window when building from 'sentence_per_line' or 'sentence_per_file' data (see section Ngrammar data formats). For example (the tags take their default values here):
N = 2
Sentence is : Hello world
Mode is : sentence per line
Using default tags
Ngrams accumulated are : (!ENTER Hello), (Hello world), (world !EXIT)
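The sliding window in the example above can be reproduced with a short stand-alone function. sentence_ngrams is a hypothetical helper, not part of EST_Ngrammar, and the padding layout (N-2 copies of the prev_prev tag, then one prev tag, before the sentence and a single next tag after it) is an assumption that agrees with the N = 2 example but is not spelled out in this document.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

typedef std::vector<std::string> Ngram;

// Pad a sentence with the start/end tags and slide a window of width n over it.
// Assumed layout: (n-2) x prev_prev, prev, word_1 ... word_k, next.
std::vector<Ngram> sentence_ngrams(const std::vector<std::string> &words, int n,
                                   const std::string &prev_prev,
                                   const std::string &prev,
                                   const std::string &next)
{
    assert(n >= 2);
    const std::size_t width = static_cast<std::size_t>(n);

    std::vector<std::string> padded(width - 2, prev_prev);
    padded.push_back(prev);
    padded.insert(padded.end(), words.begin(), words.end());
    padded.push_back(next);

    std::vector<Ngram> out;
    for (std::size_t i = 0; i + width <= padded.size(); ++i)
        out.push_back(Ngram(padded.begin() + i, padded.begin() + i + width));
    return out;
}
```

Called with the words Hello and world, n = 2, and the tags !ENTER and !EXIT, it yields the three bigrams listed in the example.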
Data can be in one of the following formats: ngram_per_line, sentence_per_line or sentence_per_file.
Dense and sparse Ngrammars can be saved and loaded in CSTR's own format (either ascii or binary). Backoff grammars can be saved in ARPA format, or in CSTR format which involves conversion to dense format. Compressed output is available via `gzip` (the GNU zip program).
Header
The header starts with the magic number "Ngram_2" followed by the order (N). The next two lines give predictor and predictee vocabularies. For example (predictor and predictee vocabularies are the same):
Ngram_2 1
acknowledge align check !ENTER !EXIT
acknowledge align check !ENTER !EXIT
Data
Each line contains an ngram followed by its frequency (may be non-integer). For example:
acknowledge align : 33
check clarify : 72
Zero frequency ngrams are not stored. White space may be any number of spaces or tabs. Blank lines are ignored.
Header
The header starts with a magic number defined by EST_NGRAMBIN_MAGIC in the `EST_Ngrammar.h' header file, followed by mBin_2, then the order (N).
Data
Frequencies are written in binary (floating point) form, and since only dense format is supported, the ngram "words" need not be written. Simple run-length encoding is used to reduce file size, where repeated values are written as "value, -n" where n is the number of repetitions. This is possible because the values (frequencies) themselves are never negative.
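The run-length scheme can be sketched as a pair of stand-alone functions. This is an illustration of the idea, not the EST file writer: rle_encode and rle_decode are hypothetical, they work on an in-memory vector rather than a binary file, and reading "n" as the total length of the run is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: a run of n >= 2 equal values is written as value, -n
// (n = total run length); a single value is written as itself.
std::vector<double> rle_encode(const std::vector<double> &freq)
{
    std::vector<double> out;
    std::size_t i = 0;
    while (i < freq.size())
    {
        std::size_t run = 1;
        while (i + run < freq.size() && freq[i + run] == freq[i])
            ++run;
        out.push_back(freq[i]);
        if (run > 1)
            out.push_back(-static_cast<double>(run));
        i += run;
    }
    return out;
}

// Inverse of rle_encode: a negative entry is a count, never a frequency.
std::vector<double> rle_decode(const std::vector<double> &coded)
{
    std::vector<double> out;
    for (double v : coded)
    {
        if (v < 0)
        {
            assert(!out.empty());          // a count must follow a value
            const double last = out.back();
            // The value is already present once; add the other n-1 copies.
            out.insert(out.end(), static_cast<std::size_t>(-v) - 1, last);
        }
        else
            out.push_back(v);
    }
    return out;
}
```

Because frequencies are never negative, a negative entry can only be a repeat count, which is what makes the encoding unambiguous. Long runs of zero frequencies, common in a dense table, shrink to two numbers.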
The only "standard" format defined for backed-off grammars. We know of no written definition of this format, and hence do not attempt to give one here!
This file documents the speech tools library developed at the Centre for Speech Technology Research at the University of Edinburgh.
Copyright (C) 1996 University of Edinburgh
Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies.
This part describes base classes in the system.
TList is a generic template doubly-linked list class. See `include/EST_TList.h'.
The list is made up of a series of "items" of class Titem. Each of these has a value, val, and a next and previous pointer. At present, the list uses a pointer to TBI to iterate through the list class. The best way to iterate through the list is to use for loop style syntax.
See `include/EST_TList.h' for a full list. EST_TList includes most of what you'd expect from a list:
length ()
Return number of items in list.
clear ()
Removes all items from list.
append (const T &i)
Appends the item "i" of type "T" onto end of list. This is the standard way of creating a list.
EST_TBI *head ()
Gives a pointer to the first item in the list.
EST_TBI *tail ()
Gives a pointer to the last item in the list.
index(EST_TBI *i)
Gives the distance from the start of the list to the occurrence of item "i".
remove(EST_TBI *i)
Removes item "i" from list.
insert_after(EST_TBI *pos, T &i)
Inserts item "i" after item "pos".
insert_before(EST_TBI *pos, T &i)
Inserts item "i" before item "pos".
In addition, the following overloaded operators are provided:
=
Makes a full copy of the list and all its members.
+=
Adds a list to the end of an existing list.
<<
Print list.
()
This is used to access the list (see below).
Examples of usage:
EST_TList<int> l;   // declare a list "l" of integers
EST_TBI *p;         // declare an iteration pointer for this list
int i;
for (i = 0; i < 10; ++i)                  // fill with some values
    l.append(i);
for (p = l.head(); p != 0; p = next(p))   // iterate through list
    cout << l(p);                         // access the item via () overloading
cout << l.length();   // print out length of list (in this case 10)
EST_TList
C++ does not have a standard for template instantiation, which makes it difficult to arbitrarily define new template types. Within the speech_tools library new EST_TList template classes should be defined as follows. Suppose you have a class called Thing and you wish to make an EST_TList of it. Add to your file the following
#if defined(INSTANTIATE_TEMPLATES)
#include "../base_class/EST_TList.cc"
template class EST_TList<Thing>;
#endif
And add the name of that file to the make variable TSRCS in your `Makefile'.
We hope this class has become stable, though some more member functions may be added (e.g. sort etc.).
KVL (short for Key/Value List) is a template class of a list of key/value pairs. There are two specifiable types for the KVL class, the key and the value, e.g. KVL<EST_String,EST_String> produces a list of string pairs. KVL uses the TList class to actually store its data.
KVL has much the same functionality as EST_TList, but has some additional features which make use easier. A crucial difference between a KVL and a normal list is that each key in the KVL is unique.
`include/KVL.h'
KVL
val (K key)
This is the basic way of reading from the KVL. You give this function a key of type K and it returns a value.
val (EST_TBI *p)
You can iterate through the KVL in the same way as you do for TList. This function returns a value given the iteration pointer.
val_def (K key, V def)
Return "def" if "key" is not found.
key (EST_TBI *p)
returns the key for a given EST_TBI pointer.
present (K k)
returns 1 if key "k" is in the KVL, 0 otherwise. Useful for seeing whether a key has been defined.
add_item (K k, V v, int ns)
Add KV pair k,v to KVL. By default, the list is searched every time for an occurrence of k and if k is already defined, its value is overwritten. However, if "ns" is set to 1, no searching of the list is performed, and the key value pair is merely appended.
change_item (K k, V v, int ns)
Overwrites existing value of k in KVL with v. If k isn't in the list,
change_item (EST_TBI *p, V v)
Overwrites existing value in KVL, accessed by pointer.
In addition, the following overloaded operators are provided:
=
Makes a full copy of the KVL and all its members.
+=
Adds a KVL to the end of an existing KVL.
<<
Print list.
Examples of usage:
KVL<int, int> x;   // declare key value list.
EST_TBI *p;        // declare iteration pointer.
int k, v;
// read some values from standard input.
cout << "type key then val\n";
while (cin >> k >> v)
{
    x.add_item(k, v);
    cout << "type key then val\n";
}
// is key "9" in list?
cout << (x.present(9) ? "true" : "false") << endl;
for (p = x.list.head(); p != 0; p = next(p))   // iterate through list.
    cout << x.key(p) << " " << x.val(p);       // print out all keys and values in list.
The EST_Option class provides a uniform way to access options in a program. The most obvious source of options is the command line. The function parse_command_line2(...) takes the C command line variables (argc, argv) and produces an EST_Option class. Specifically, it allows option names, value types, defaults and documentation to be given for the options in a program.
The EST_Option class is derived from KVL<EST_String, EST_String>, so all the KVL member functions also work with EST_Option. It provides some useful extra functionality.
EST_Option
All the options are stored as key value pairs of EST_Strings. However, it is often useful to have other types, e.g. integers. This is possible, but it is the EST_String of the integer that is actually stored. Additional member functions, e.g. add_item(), do the conversion automatically.
The member function ival(const EST_String &key) will return the value as an integer and fval(const EST_String &key) as a float.
It is sometimes convenient to store options in files, and the options class supports a simple file format for this.
There is one key/value pair per line. Lines can be commented by starting them with the comment character (by default this is ";", but this can be set using the load() function). Each line must start with the key. The remainder, which may appear as a list in the file, is taken as the value. Option files can be included in other option files by using the #include filename directive.
If a particular key appears more than once when loading, the value of the last occurrence is used. Files are loaded using the load(...) function. The first argument to this is the file name, and the second (optional) argument is the comment character. The load function merely appends to the existing options (while overriding the values of duplicate keys) - if an entirely new set of options is to be loaded, call the clear() member function first.
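The file format rules above (one pair per line, a comment character, the remainder of the line as the value, the last occurrence winning, and loading that appends to existing options) can be sketched with the standard library alone. load_options and the Options typedef are hypothetical and are not EST_Option::load(); in particular, #include lines are recognised but simply skipped here.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

typedef std::map<std::string, std::string> Options;

// Hypothetical reader for the line format described above.
// #include directives are recognised but skipped in this sketch.
void load_options(std::istream &in, Options &opts, char comment = ';')
{
    std::string line;
    while (std::getline(in, line))
    {
        if (line.empty() || line[0] == comment)
            continue;                              // blank line or comment
        std::istringstream fields(line);
        std::string key;
        if (!(fields >> key) || key == "#include")
            continue;
        std::string value;
        std::getline(fields >> std::ws, value);    // remainder of line is the value
        opts[key] = value;                         // last occurrence wins
    }
}
```

Calling load_options a second time on the same Options object adds to what is already there, mirroring the append behaviour described for load(); clearing the map first gives a fresh set.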
The EST_Option class inherits the member functions of the KVL class. In addition, the following exist:
add_prefix(EST_String p)
Adds a prefix "p" to all keys in the list.
remove_prefix(EST_String p)
Removes a prefix "p" from all keys in the list.
override_val(EST_String key, EST_String val)
If val is not empty, add keyval pair to option list.
override_ival (EST_String key, int val)
If val is not 0, add keyval pair to option list.
override_fval (EST_String key, float val)
If val is not 0.0, add keyval pair to option list.
ival(EST_String rkey)
return value of key, cast to an integer.
fval(EST_String rkey)
return value of key, cast to a float.
dval(EST_String rkey)
return value of key, cast to a double.
add_iitem(EST_String key, int val)
add integer value to list.
add_fitem(EST_String key, float val)
add float value to list.
A simple vector class is provided. Member functions are given in `include/EST_TVector.h'.
The EST_TMatrix class allows the creation of standard matrices. See `include/EST_TMatrix.h' for member functions.
There is a derived class EST_FMatrix for floats; the derivation is used rather than a simple template to allow loading and saving to files.
The EST_Chunk class offers a reference counting system for arbitrary segments of memory. This is primarily used by the EST_String class.
This class was written for a number of reasons. It offers a string class functionally identical to the GNU libg++ String class. We chose to write our own string class rather than use the one provided with GNU G++ for the following reasons. The String class in libg++ is different in different versions and causes lots of confusion when compiling the system with different versions of `libg++'. If we depended on the GNU String class we would have to provide `libg++' on all platforms we compile the system on. This and the Regex class are the only classes we relied on; by writing our own we allow much greater portability. The GNU String class typically copies string values around while our replacement uses reference counts. Because of the way we use strings in the speech tools and Festival, keeping track of reference counts allows a much more efficient implementation of strings. Thus our replacement string class is faster for substantial benchmarks of Festival than the GNU equivalent.
The member functions of EST_String follow those of the GNU `libg++' String class as closely as possible (we designed it as a drop in replacement for our current use of String).
As we wished to remove our dependence on GNU libg++ as described in the previous section, we have provided a regular expression class which for the most part follows that of the GNU libg++ Regex class. This implementation uses the regex functions from BSD4.4-lite (and earlier) written by Henry Spencer.
A class which relates names (EST_String) to enums.
EST_StringTrie builds a tree index from string keys to arbitrary objects. Thus objects may be indexed efficiently from strings. The strings must be ascii (the eighth bit is ignored).
For example the following builds an index of regular expressions based on their character form so that they need not be recompiled.
EST_StringTrie regexes;
EST_Regex *make_regex(const char *r)
{
    // Access previously generated Regex or make new one
    // and add to index
    EST_Regex *rx;
    if ((rx = (EST_Regex *)regexes.lookup(r)) == 0)
    {
        EST_Regex *nr = new EST_Regex(r);
        regexes.add(r,(void *)nr);
        rx = nr;
    }
    return rx;
}
A StringTrie may be explicitly cleared with the function clear(). The contents of the string trie may be cleared by passing a garbage collection function to clear(), which will be passed each item in the trie as a void *. The type of the user provided garbage collection function is
void (*deletecontents)(void *n);
EST_Token with EST_TokenStream provides a method for reading files as whitespace separated tokens. A token consists of four parts, some of which may be empty: preceding whitespace, preceding punctuation, the name (the actual token) and succeeding punctuation. The definitions of whitespace and punctuation are user definable. There is also support for single character symbols and quoted tokens.
A token stream, from which tokens are read, may be a file or a string.
For example the following reads a file and outputs each token on a new line
EST_TokenStream ts;
ts.open("myfile");
while (!ts.eof())
    cout << ts.get() << endl;
Although token streams support one symbol look ahead via peek(), they do not support unget.
Punctuation (pre and post) may be set after opening a stream. The defaults are empty. Any characters defined as punctuation found around a token are stripped and saved in the punctuation fields. Single character symbols will cause a token break whenever they occur (i.e. separating whitespace is not required); again by default these are empty. Whitespace by default is defined as space, horizontal tab, carriage return and line feed.
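Stripping punctuation from around a token can be pictured with the following stand-alone sketch. MiniToken and strip_punctuation are hypothetical and only mirror the token parts described above; the real EST_TokenStream does this work while reading, and its field names may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical record mirroring the punctuation and name parts of a token.
struct MiniToken
{
    std::string prepunc;   // punctuation stripped from the front
    std::string name;      // the actual token
    std::string punc;      // punctuation stripped from the end
};

// Split leading and trailing punctuation characters off a raw token.
MiniToken strip_punctuation(const std::string &raw,
                            const std::string &pre_chars,
                            const std::string &post_chars)
{
    std::size_t b = 0;
    std::size_t e = raw.size();
    while (b < e && pre_chars.find(raw[b]) != std::string::npos)
        ++b;
    while (e > b && post_chars.find(raw[e - 1]) != std::string::npos)
        --e;

    MiniToken t;
    t.prepunc = raw.substr(0, b);
    t.name = raw.substr(b, e - b);
    t.punc = raw.substr(e);
    return t;
}
```

With both punctuation sets empty, which is the stated default, the token comes back unchanged in name.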
Quoting mode is off by default but may be started by calling set_quotes with a quote character and an escape character (typically " and \). When in quote mode, a token starting with the quote character will continue until the next unescaped quote character, including whitespace and punctuation.
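Quote mode can be illustrated with a small stand-alone tokenizer. split_quoted is a hypothetical function, not the EST_TokenStream implementation, and it assumes that the quote and escape characters themselves are dropped from the returned token text.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Whitespace-separated tokens; a token that starts with the quote character
// runs to the next unescaped quote, keeping any whitespace inside it.
// Assumption: quote and escape characters are not kept in the token text.
std::vector<std::string> split_quoted(const std::string &text, char quote, char escape)
{
    const std::string ws = " \t\r\n";
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < text.size())
    {
        if (ws.find(text[i]) != std::string::npos)
        {
            ++i;                                   // skip separating whitespace
            continue;
        }
        std::string tok;
        if (text[i] == quote)
        {
            ++i;                                   // skip opening quote
            while (i < text.size() && text[i] != quote)
            {
                if (text[i] == escape && i + 1 < text.size())
                    ++i;                           // take next character literally
                tok += text[i++];
            }
            if (i < text.size())
                ++i;                               // skip closing quote
        }
        else
        {
            while (i < text.size() && ws.find(text[i]) == std::string::npos)
                tok += text[i++];
        }
        tokens.push_back(tok);
    }
    return tokens;
}
```

An unterminated quoted token simply runs to the end of the input in this sketch.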
Although a whole file's contents including all its whitespace may be recorded by tokens from a token stream, any final whitespace after the last real token may be mistakenly omitted unless care is taken. In many cases you'll just require the final whitespace before end of file to set end of file, which is the default. In quotes mode all tokens, including this last token with an empty name, will be returned before eof is set.