

Tools

Various general purpose subsystems exist within CHATR that allow the creation of new (and parameterizing of existing) modules.

Feature Functions

How can you get specific data from an utterance in a uniform way? Most likely through a feature function. A feature function takes a stream cell and returns a value (a character string). This mechanism is designed so that vectors of parameters can easily and uniformly be extracted from an utterance. For example, suppose you want the vowel type, lexical stress, accent and duration of all syllables: this is the mechanism to use.

From Lisp you can get sets of vectors for items in a stream within an utterance. This mechanism should be used to extract data to use within external training systems--or possibly one of the training sub-systems described in the following section. The general Lisp function to use is Feats_Out. It takes three or four arguments: an utterance, a stream name, a list of features to extract, and optionally, a filename to write them to. For example

     (Feats_Out utt1 'Syl '(syl_vowel stress Syl_tobi_accent Syl_dur))

This will return a list of values for each syllable in `utt1'.

A large number of feature functions are already defined in the file `src/chatr/feats.c'. Their names and short descriptions can be listed from Lisp by calling Feats_Out with no arguments.

This function is especially useful in dumping features from PhonoForm utterances. See section PhonoForm Utterance Types, for a full description.

Note that some of these functions are very specific to particular types of utterances; not all are general functions. The core set has already been used within the system for various decision trees, and for duration and intonation systems using linear regression and neural nets.

If you are writing new functions, they should be added to the table in `feats.c'. Take care that they do not leak memory. This is not usually a problem, but if numbers are to be returned they must be converted to strings, and those strings may never be garbage collected. In this rare case, copy the technique used in Syl_dur, where the new string is added to a list of strings to be freed later. Returning numbers is actually quite rare; on most occasions quantized versions are quite adequate. Do try to avoid memory leaks--they can cause much trouble.

Linear Regression

Linear regression is used in a number of places within CHATR, such as intonation, duration and weight training for unit selection. It is also available through a Lisp interface so models can be trained within CHATR. Normally, full experiments would be done outside CHATR in something like S-Plus or MATLAB, but once a model is decided on it is easier to train within CHATR, especially as data may be much more readily available within the system than externally.

Two Lisp functions provide the interface, one for training and one for testing. Data for training and testing will normally be calculated via the feature functions described above. See section Training a ToBI-Based F0 Prediction Model, for an example of model building using linear regression.

First, collect vectors into a file. A vector consists of a value (to be predicted) and a set of feature values. These should be put in a file, one vector per line, with parentheses at the start and end, so that each vector is readable in one Lisp read.

Given such a vector file, and a list of feature names (including a name for the first item in a vector list), use the command

     (Linear_Regression_Detail vector_file features_names '2)

The results consist of a list of details about various aspects of the linear regression model built. The third parameter controls the amount of information returned by the call. The following values are valid

0
Returns a list of floats, the intercept plus the weights for each feature.
1
Returns a list of items: a list of pairs consisting of feature name plus weight; the intercept; the percentage variance described by the model; the correlation; and the correlation of each individual feature with the dependent variable.
2
As type 1 plus: the weight times standard deviation for each feature (shows their relative contributions); a list of dropped features, i.e. those with weight 0; the contribution of each feature; the stepwise contribution giving the order (actually an order) of importance of each feature.

It is recommended that the results from linear regression be checked to find out which features are contributing the most, and also which give little contribution or none at all (i.e. they are fully predictable from other features).

A function called Linear_Regression_Predict allows the testing of models, typically against a test set. Like Linear_Regression_Detail described above, it requires a vector file in the same format: each vector consists of a value to be predicted (the dependent variable) followed by the features used to predict it, surrounded by parentheses. A `MODEL' is also required, consisting of a list of floats, the first being the intercept and the rest the weights for each feature (i.e. as returned by a level 0 linear regression, or extracted from the more detailed results). An optional third argument to Linear_Regression_Predict specifies a file to which predicted values are written, so external tests may be carried out. The correlation and statistics for the mean error are printed to the screen.

A representation of a linear regression model structure is provided internally, with conversion functions from Lisp, and use within a model.

Before using any of the following functions, you must include

     #include "lr.h"

The conversion function expects a Lisp list of pairs, each consisting of a feature name plus a weight, together with one pair named Intercept. An optional third value may appear in such a (misnamed) `pair'; this should be a feature map name. Feature maps are defined in the variable feature_maps and consist of a name and a list of values. If a feature returns a value in a feature map's list, its value is treated as 1, otherwise as 0. This is designed to deal efficiently with category valued features. An example best illustrates this. Suppose we have a feature that returns the ToBI accent label on a syllable. Labels like H* and L* are of course not valid numbers and hence cannot be used directly in a linear regression model, but with a feature map they can be. Suppose we have the following feature maps

     (set feature_maps
       '((tobi_accent_0 H*)
         (tobi_accent_1 !H*)
         (tobi_accent_2 L*)
         (tobi_accent_3 L+H* L+!H* H+!H* L*+!H L*+H)))

We can then in our linear regression model specify pairs as

     ...
     (tobi_accent 13.3743 tobi_accent_0)
     (tobi_accent 12.5658 tobi_accent_1)
     (tobi_accent -17.0276 tobi_accent_2)
     (tobi_accent 5.6093 tobi_accent_3)
     ...

In C we can convert such a Lisp representation of a linear regression model to a more efficient internal one and then use it. This short example illustrates the use.

     List lisp_lr_model;
     LR_Model lr_model;
     Stream s;

     lisp_lr_model = list_str_eval("lr_model","no LR model set");
     lr_model = make_lr_model(lisp_lr_model);

     for (s=utt_stream("Segment",utt); s != SNIL; s=SC_next(s))
         SC(s,Segment)->duration = lr_predict(s,lr_model);

Neural Nets

CHATR includes a subsystem for training, testing and using neural nets. Although it offers some flexibility, it was specifically designed to implement Campbell's duration theory, so the basic structure is fixed; however, the number of inputs, hidden nodes and outputs may be chosen. Each net consists of an input layer, a hidden layer and an output layer. All input nodes are connected to all hidden nodes, and all hidden nodes are connected to all output nodes. The nets are used to produce a single value between 0 and 1 (which we understand is a slightly unusual use of neural nets).

Input values must be single digits between 0 and 9. If all inputs for all training data are 0 or 1, the inputs are treated as binary and are rescaled. The actual inputs (and outputs) are rescaled to be greater than 0 but less than 1, however the scaling factors are internal to the implementation, so users need not worry about them. The outputs are linearly scaled from the examples in the training set by the function

O_net = (O_actual - Min_out + epsilon) / (Max_out - Min_out + epsilon)

Some other mapping function might be more appropriate.

A training set should consist of a list of pairs, each pairing an input (a string of digits) with an output value (a float). For example

     7710000063600011000110026401    70.0
     7101010636300111001110264411    270.0
     1010116363301111011112644410    220.0
     0104103633311111111116444401    190.0
     1041016333011110111114444412    400.0
     0410113330011100111104444221    310.0
     4100113300011000111014442412    230.0
     1000103000310000110104424821    140.0
     0000000003300000101004248811    190.0
     0001000033000001010002488210    150.0

The training set is best stored in a single file. The training function takes the following arguments

     (NN_Train PAIRS_LIST/INFILE OUTFILE ITERATIONS)

The first argument may be a Lisp list of input/output pairs, or the name of a file that contains an (unbracketed) list as above. (The non-file option was really added for very simple tests and debugging of the system.) OUTFILE names a file where a Lisp representation of the neural net will be saved at the end of training (and at check points). That representation is suitable as input to the other neural network functions. ITERATIONS is the number of iterations this training session should perform.

Other parameters for the training may be specified in the variable nn_params. If set, this should consist of a list of pairs, each pair consisting of a parameter name and a value. The supported parameters are

n_hidden
Value is a number specifying the number of hidden nodes this net should have.
check_pt
Value is a number specifying the number of iterations between check points.
check_pt_action
Value is a list of actions to perform at a check point. Three actions are available
save
Save the current net in the output file.
error
Print the current mean error.
list
Print one cycle of input/output pairs (probably too much for most users).
check_pt_func
Value is a Lisp function. This function is evaluated at check points--useful for doing arbitrary tests. It is recommended to set this to run NN_Test on the training and test sets.
start_net
Value is a Lisp representation of a net, as saved in OUTFILE at the end of a training session or at a check point. This net is used as the starting point, so training sessions can continue after being stopped. The net representation includes details of the number of iterations it has already executed. The ITERATIONS argument to NN_Train is the number of iterations required this time, not a total that includes the iterations already made with the starting net.

A typical training session may be started as in the following example

     (define ccc ()
       (NN_Load (load "rad_sdur9"))
       (print (NN_Test "rad_sdur5.netdata"))
       (print (NN_Test "rad_sdur5.test")))

     (set nn_params '((n_hidden 20) (check_pt 50) 
                      (check_pt_action save error)
                      (check_pt_func ccc)))

     (NN_Train "rad_sdur5.netdata" "rad_sdur9" '40000)

When continuing an interrupted training session, a typical restart command set is

     (define ccc ()
       (NN_Load (load "rad_sdur9"))
       (print (NN_Test "rad_sdur5.netdata"))
       (print (NN_Test "rad_sdur5.test")))

     (set nn_params '((n_hidden 20) (check_pt 50) 
                      (check_pt_action save error)
                      (check_pt_func ccc)))
     (set nn_params (cons (list 'start_net (load "rad_sdur8"))
                    nn_params))

     (NN_Train "rad_sdur5.netdata" "rad_sdur9" '40000)

The function NN_Load takes one argument, a Lisp representation of a net (as saved by a training session). It stores it as the current net--though this notion of current net is only used in testing and training and not when these nets are actually used in the duration model.

The function NN_Test takes as an argument a list of input/output pairs or a file name containing such a list (i.e. the same form as the first argument to NN_Train). It tests that list with respect to the current net, and returns a list of three values: the mean error, the RMS error, and the standard deviation of the error.

Decision Trees

A number of sub-systems within CHATR use decision trees (sometimes called discrimination trees). They all have the following format

     dtree:           condition-node | leaf-node 
     condition-node:  (condition true-node false-node)
     true-node:       dtree
     false-node:      dtree
     condition:       (featname binary-operator value) |
                      (featname in value value value ...)
     binary-operator: < > = <= >= 
     value:           number (integer or real) or string
     leaf-node:       list whose car is atomic

featname should be a feature function as defined in `src/chatr/feats.c'. The actual leaf node may be anything: a probability distribution, a particular class, a linear regression model, or whatever is appropriate. Trees are used with respect to a particular stream cell. The feature named in a condition is called for that stream cell and the result tested against the condition. If the condition is true, the function recurses and applies the true node to the stream cell; if the condition fails, it recurses and applies the false node. When a leaf node is found it is returned. As an example, a tree for predicting reduction on syllables is

     (set reduce_tree '
        ((foot in 8 9 5 7 6  ) 
            (  0.9361 0.007192 0.05675 unreduced)
            ((accented = 0)
             ((pbi < 3.5) 
                 (  0.2895 0.6295 0.081 reduced)
                 ((bi < 0.5) 
                     (  0.3242 0.4341 0.2418 reduced)
                     (  0.647 0.3406 0.01244 unreduced)))
             ((pbi < 0.5) 
                 (  0.1538 0.7692 0.07692 reduced)
                 (  0.8138 0.1517 0.03448 unreduced)))))

A decision tree may be easily used within C code. Note the decision is always made with respect to a Stream cell, in this case a syllable. The feature functions named should be appropriate for the stream cell type given. Refer to the following example

     #include "disctree.h"

     List tree, class;
     Stream s;

     tree = list_str_eval("reduce_tree","No reduce tree");
     for (s=utt_stream("Syl",utt); s != SNIL; s=SC_next(s))
       {
         class = dt_decide(s,tree);  /* returns leaf node */
         if (list_sequal("reduced",list_last(class)))
             reduce_syl(s);
       }

Decision trees may be created externally by any CART-like system, converted to the above Lisp format, and then used easily within CHATR.

