This documentation covers version 1.0.1 of the Edinburgh Speech Tools. At present, the main reason for making the speech tools available is for use in other pieces of software such as Festival. The speech tools have greatly improved since the previous version and contain code that is of use in its own right.
Although you are free to use this code in any manner compatible with the licence, please note that the speech tools are likely to undergo comprehensive revision in the future and, importantly, we do not make any guarantees about backwards compatibility. Thus if you write a program using the track class, for instance, it may not be compatible with a future release. However, we will, wherever possible, continue to provide similar functionality, so translation to new versions of the speech tools should always be possible.
In addition, we warn that while several programs and routines are quite mature, others, in particular the signal processing and statistics code, are young and have not been rigorously tested. Please do not assume these programs work.
In order to compile and install the Edinburgh Speech Tools you need the following
We have successfully compiled and tested the speech tools on the following systems:
As stated before, C++ compilers are not standard, and it is non-trivial to find a dialect that compiles under all of them. We recommend GCC 2.7.2 if you can use it; it is the one most likely to work.
Previous versions of the system have successfully compiled under SGI IRIX 5.3 and HPUX, but at the time of writing we have not yet rechecked this version.
For our Windows NT and Windows 95 ports we use the Cygnus GNU win32 environment (b18) available from `ftp://ftp.cygnus.com/pub/gnu-win32/'.
Before installing the speech tools it is worth ensuring you have a fully installed and working version of your C++ compiler. Most of the problems people have had in installing the speech tools have been due to incomplete or bad compiler installation. If you don't know whether anyone has used your C++ installation before, it might be worth checking that the following program works.
#include <iostream.h>

int main (int argc, char **argv)
{
    cout << "Hello world\n";
    return 0;
}
All compile-time configuration for the system is done through the user definable file `config/config_make_rules'. You must create this file before you can compile the library. An example is given in `config/config_make_rules-dist'; copy it and change its permissions to give write access:
cd config
cp config_make_rules-dist config_make_rules
chmod +w config_make_rules
You must now edit `config_make_rules' to make it reflect your local environment.
This involves selecting your `C++' compiler and identifying various library directories. Please read the comments in the file for specific instructions.
Simple choices for common setups are given near the top of this file. For some sub-systems, however, you will also need to change pathnames for external library support.
At present one external library may be used. If you wish, NCD's network audio system (formerly called netaudio) is supported.
NCD's NAS library offers network transparent access to audio hardware on a number of different platforms and is used extensively in CSTR. NAS is available from `ftp://ftp.x.org/contrib/audio/nas/' as well as being distributed in the contrib directory of X11R6.
If you wish to use it, uncomment the variable INCLUDE_NAS. Also check the value of NADIR further down the file.
Other options allow you to specify support of other audio devices. Note these are operating system and hardware dependent and you may only select them if you are compiling on that OS type and have the hardware.
The previously released version of the speech tools supported an alternative method for reading and writing Entropic's ESPS headered files. After testing, our own versions of the functions to access these files seem adequate for all the types of file we wish to use, so we have removed the proprietary access methods (which required the ESPS libraries and a licence).
Once you have configured `config/config_make_rules' you can compile the system. First build the include dependencies:
gnumake depend
This outputs what appear to be errors about missing files; this is not a problem. Now build the library:
gnumake
Note this must be GNU make, which may be called make on your system, or gmake or gnumake. This will compile all library functions and all the executables. If you wish to compile only the library itself, use
gnumake lib_build
Note that if you compile with -g the library and the corresponding binaries will be large, particularly the executables. You will need on the order of 80 megabytes to compile the system if your C++ libraries are not compiled as shared libraries. If you compile without -g the whole library directory is about 12 megabytes on Linux (which has shared libraries for libstdc++) or about 26 megabytes on Sparc Solaris (which does not have a shared library libstdc++ by default). This is almost entirely due to the size of the executables. C++ does not make small binaries.
It should be possible to build a shared library for both the system C++ libraries and the speech tools library itself. This would substantially reduce the footprint of the executables. Shared libraries are different on every machine, and we have not spent time investigating how to build them on the different architectures that we support. We'll have to address this in later versions.
All executables are linked to from `speech_tools/bin' and you should add that to your PATH in order to use them.
Note that the previously widely released version of the speech tools (0.96.1) required GNU's `libg++'. As we have written our own largely compatible string class, called EST_String, we no longer require GNU's `libg++' but do still require GNU's `libstdc++', which provides i/o streams. (In older versions of GCC you still need to link with libg++, as there is no split between the GNU G++ library and the standard C++ library.) Our string class uses reference counting rather than copying, which makes it faster for the sort of task we use it for.
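To illustrate the general idea of reference counting rather than copying, here is a minimal sketch. It is not the EST_String implementation: the class name SharedText and its members are hypothetical, and it is written in modern C++ (C++11 or later) using std::shared_ptr, not the dialect accepted by the compilers listed above.

```cpp
#include <memory>
#include <string>

// Hypothetical illustration only; this is NOT how EST_String is written.
class SharedText {
public:
    explicit SharedText(const std::string &s)
        : data_(std::make_shared<std::string>(s)) {}

    // The compiler-generated copy constructor and assignment copy only the
    // shared_ptr, so a copy shares the buffer and bumps a reference count.

    const std::string &str() const { return *data_; }

    // Number of SharedText objects currently sharing this buffer.
    long owners() const { return data_.use_count(); }

    // Copy-on-write: take a private copy before changing a shared buffer,
    // so other owners never see the modification.
    void append(const std::string &suffix) {
        if (data_.use_count() > 1)
            data_ = std::make_shared<std::string>(*data_);
        data_->append(suffix);
    }

private:
    std::shared_ptr<std::string> data_;
};
```

Copying a SharedText costs one counter increment regardless of the string's length; the characters are duplicated only when one of the sharers calls append.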
Specifically, all major classes are now prefixed with EST_ to reduce the possibility of clashing with symbols in other libraries. Thus to update code, typically the main change is to add this prefix onto the symbols String, Regex, Wave, Utterance, Stream, KVL, TList, Ngrammar, Option, Stream_Item, TMatrix, TVector, Token, and TokenStream.
TFR has been removed and its functionality integrated into EST_Track.
Although we apologise for making such changes, they were necessary for our library to be useful in the long run. The changes can typically be made automatically (we did so for the speech tools and Festival). We hope not to have to change names so drastically again.
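The automatic renaming mentioned above could be approximated with a whole-word substitution. The sketch below is an assumption about how one might do it, not the procedure actually used for the speech tools and Festival; the function name add_est_prefix is hypothetical, and the code uses modern C++ (std::regex, C++11 or later). The symbol list is the one given above.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: add the EST_ prefix to whole-word occurrences of the
// renamed class names. It is deliberately naive and will also rewrite
// matches inside comments and string literals.
std::string add_est_prefix(const std::string &source) {
    // \b is a word boundary. Because '_' counts as a word character,
    // names that already carry the prefix (e.g. EST_String) are left
    // untouched, and Stream is not matched inside Stream_Item.
    static const std::regex old_names(
        R"(\b(String|Regex|Wave|Utterance|Stream_Item|Stream|KVL|TList|)"
        R"(Ngrammar|Option|TMatrix|TVector|TokenStream|Token)\b)");
    // $1 refers to the matched class name.
    return std::regex_replace(source, old_names, "EST_$1");
}
```

Running the function twice over the same text gives the same result as running it once, so it is safe to reapply to partly converted code. Member function renames are not covered and still need to be made by hand.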
There are also a number of other member function name changes to make the system more logical. We know we have not yet been thorough enough to make the member functions fully logical but they are now much better. Thus on converting code there may be other minor changes required. If these are not immediately obvious to you please contact us for advice.
In the future we hope to provide the following: