Professional Basis of AI Backprop Hypertext Documentation

Copyright (c) 1990-97 by Donald R. Tveter

The Predict Command

Overview

The predict command (pr) is used with recurrent networks to predict new values in a time series. The program takes the current training file and writes out a temporary training set file called train.tmp and a temporary test set file called test.tmp. It then reads them in and uses the benchmarking command to make one or more runs with the data sets. Next it writes a new train.tmp file with one more pattern and a new test.tmp file, and benchmarking is called again. The process continues until there are no more patterns that can be used in the test set. If you need to work with the original full training set file after using the predict command, you will have to read it in again.
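
The loop described above can be sketched in a few lines of Python. This is only an illustration, not the program's own code: the function name predict_splits and its parameters are hypothetical, and it assumes that the test patterns are the ones immediately following the training patterns in the original file.

```python
def predict_splits(patterns, initial, test_size):
    """Yield (train, test) pairs, adding one training pattern per run."""
    n_train = initial
    # Stop once too few patterns remain to fill the test set.
    while n_train + test_size <= len(patterns):
        train = patterns[:n_train]
        test = patterns[n_train:n_train + test_size]
        yield train, test
        n_train += 1


# Stand-in data: eight patterns, numbered 0 to 7.
series = list(range(8))
for train, test in predict_splits(series, initial=5, test_size=2):
    print(len(train), "training patterns, test patterns:", test)
```

Each pair produced here corresponds to one train.tmp and test.tmp written by the program, followed by one call to benchmarking.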

There are two basic ways of setting up the test set. One is called "Sin-like" because sin(x) is an excellent example of the process. In this procedure the output value of the network can be fed back into the input layer, and this can be done repeatedly. Errors in the network's output values will quickly cause the predictions to go bad, so it does not pay to try to predict too far ahead. One time step ahead is really about all you should take seriously, although the number of steps to predict ahead can be set by the user. The train.tmp file for recurrent sin(x) will look like:

   0.00000  H   0.15636
   0.15636  H   0.30887
   0.30887  H   0.45378
    ...
  -0.31189  H  -0.15950
  -0.15950  H   0.00000

If you want to try to predict 3 values ahead (not hard when you consider that the network will have memorized a complete cycle of sin(x)), then the test.tmp file will look like:

   o 1  H   0.15636
   o 1  H   0.30887
   o 1  H   0.45378

where the first input value is taken from the first of the output values. Some results in one test run gave:

test set size = 3
  0.122  0.210  0.295 for all 37 runs

Here, the three values represent the error on the three test set patterns. There are 37 runs because the program started out using 41 training set patterns and 3 test set patterns, then 42 training set patterns and 3 test set patterns, then 43 training set patterns and 3 test set patterns, and so on until there were 79 training set patterns and 3 test set patterns. Another option lets you keep moving through the data with a fixed number of the most recent training set patterns. Runs where the network did not converge are not included in the results.
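
The way errors build up when outputs are fed back in can be seen in a small Python sketch. Nothing here comes from the program itself: the function closed_loop_errors and the stand-in "network" imperfect_model are hypothetical, and a simple decaying series is used in place of sin(x) so that the arithmetic is easy to follow.

```python
def closed_loop_errors(model, start, targets):
    """Feed each prediction back in as the next input; return absolute errors."""
    errors = []
    x = start
    for target in targets:
        x = model(x)  # the prediction becomes the next input
        errors.append(abs(x - target))
    return errors


def imperfect_model(x):
    """Hypothetical stand-in for a trained network that is slightly off."""
    return 0.92 * x  # the true series multiplies by 0.9


# True series starting from 1.0: each value is 0.9 times the previous one.
targets = [0.9, 0.81, 0.729]
errors = closed_loop_errors(imperfect_model, 1.0, targets)
print(["%.3f" % e for e in errors])
```

Because each prediction inherits the error of the one before it, the error grows with every step ahead, which is the same pattern as in the three values reported above.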

Besides "Sin-like" there is the "Predict One" option. In this case the output value is not fed back to the input layer. If the train.tmp file is:

   0.00000  H   0.15636
   0.15636  H   0.30887
   0.30887  H   0.45378
    ...
  -0.31189  H  -0.15950
  -0.15950  H   0.00000

then the test.tmp file will look like:

   0.00000  H   0.15636

The output from one run looked like:

test set size = 1
  0.048 for all 39 runs

The result for the last benchmarking run can be found in the file bresults.tmp, and the results for all the benchmarking runs can be found in the file results.tmp.

The Predict Menu Window Commands

Maximum Iterations

This is the maximum number of training iterations to run. It is the same parameter as in benchmarking, so to set the maximum iterations to 500 you can type "b r 500 100", which also sets the print rate to 100.

Print Rate

This parameter gives the rate at which to print the pattern status summary. Note that it is also the rate at which the test set is sampled for a new minimum error. It is the same parameter as in benchmarking, so to set the print rate to 10 you can type "b r 500 10", which also sets the maximum iterations to 500.
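
How the print rate and the maximum iterations interact can be sketched in Python as follows. This is an assumption about the general shape of such a loop, not the program's code: train_with_sampling, fake_train_step and fake_test_error are hypothetical names, the error curve is made up, and the sketch leaves out any early stop on convergence.

```python
def train_with_sampling(train_step, test_error, max_iterations, print_rate):
    """Train for max_iterations, sampling the test set every print_rate steps."""
    best_error = float("inf")
    best_iteration = None
    for iteration in range(1, max_iterations + 1):
        train_step()
        if iteration % print_rate == 0:
            error = test_error()
            print(f"iteration {iteration}: test error {error:.3f}")
            if error < best_error:  # a new minimum on the test set
                best_error, best_iteration = error, iteration
    return best_error, best_iteration


state = {"iteration": 0}


def fake_train_step():
    state["iteration"] += 1


def fake_test_error():
    # Made-up error curve with its minimum at iteration 300.
    return abs(state["iteration"] - 300) / 1000 + 0.05


# Same settings as "b r 500 100": 500 iterations, print rate 100.
best = train_with_sampling(fake_train_step, fake_test_error, 500, 100)
```

With a print rate of 100 the test set is only looked at five times in 500 iterations, so a minimum that falls between two samples would be missed. A smaller print rate samples more often at the cost of more output.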

Goal for Successes

This parameter gives the number of successful training runs to try for. It is the same as the benchmarking goal. The typed command to make the goal 6 is "b g 6".

Maximum Tries

This parameter gives the maximum number of tries to make in order to meet the goal for successes. It is the same as the maximum tries parameter in benchmarking. The typed command to make the maximum number of tries 12 is "b m 12".
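
The relationship between the goal for successes and the maximum tries can be sketched in Python. The names benchmark and fake_run are hypothetical and the convergence pattern is made up; the sketch only assumes, as stated in the overview, that runs which do not converge are left out of the results.

```python
def benchmark(run_once, goal, max_tries):
    """Repeat training runs until `goal` succeed or `max_tries` are used up."""
    successes = []
    tries = 0
    while len(successes) < goal and tries < max_tries:
        tries += 1
        converged, error = run_once(tries)
        if converged:  # runs that did not converge are not recorded
            successes.append(error)
    return successes, tries


def fake_run(try_number):
    """Made-up training run: every third try fails to converge."""
    return try_number % 3 != 0, 0.05


# Same settings as "b g 6" and "b m 12".
successes, tries = benchmark(fake_run, goal=6, max_tries=12)
print(len(successes), "successes in", tries, "tries")
```

In this made-up case the goal of 6 is reached on the eighth try, so the remaining four tries are never used.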

Test Set Type

This is perhaps a bad label for the command that sets the type of the test set to either "Sin-like" or "Predict One". The typed command to get "Predict One" is "pr o 1" (where o is for option, another bad label). The typed command to get "Sin-like" is "pr o 2".

Initial Training Patterns

This is the initial number of patterns to put in the training set written to train.tmp. The typed command to set the number of patterns to 41 is "pr i 41".

Test Set Size

This is the size of the test set. It only makes sense to use a size greater than 1 if you're using the "Sin-like" option. The typed command to set the test set size to 3 is "pr s 3".
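
The layout of the two kinds of test set can be reproduced with a short Python sketch. The column widths are inferred from the listings in the overview, and the "H" and "o 1" markers are copied from those listings as literal text without interpreting them. The function names are hypothetical.

```python
def pattern_line(x, y):
    """Format one input/target pair in the layout of the train.tmp listing."""
    return f"{x:10.5f}  H  {y:8.5f}"


def sin_like_test_lines(targets):
    """One line per step ahead, with 'o 1' in place of an input value."""
    return [f"   o 1  H  {y:8.5f}" for y in targets]


# "Sin-like" with a test set size of 3, as set by "pr s 3".
for line in sin_like_test_lines([0.15636, 0.30887, 0.45378]):
    print(line)

# "Predict One": a single pattern with a real input value.
print(pattern_line(0.0, 0.15636))
```

The "Sin-like" lines carry no input values of their own, which is why a test set size above 1 is only meaningful for that option.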

Window Width

This parameter determines whether to keep the size of the training set constant or to increase it by one pattern on each run. The options are "Grow" and "Constant". The typed command to keep the training set constant is "pr w c" and the command to have it grow is "pr w g". In most cases "Grow" is probably the best choice, unless you have data whose properties change over time.
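
The difference between the two settings can be sketched in Python. The function training_window is hypothetical, and the sketch assumes that the constant width is equal to the initial number of training patterns.

```python
def training_window(run, initial, mode):
    """Return (start, stop) indices of the training patterns for a 0-based run."""
    stop = initial + run  # one more pattern becomes available each run
    if mode == "g":  # Grow: keep every pattern seen so far
        start = 0
    elif mode == "c":  # Constant: keep only the most recent `initial` patterns
        start = stop - initial
    else:
        raise ValueError("mode must be 'g' or 'c'")
    return start, stop


# Third run (run index 2) with 41 initial training patterns.
print(training_window(2, 41, "g"))
print(training_window(2, 41, "c"))
```

With "Grow" the third run trains on patterns 0 through 42, while with "Constant" it trains on patterns 2 through 42, so the oldest patterns drop out as new ones arrive. That is why "Constant" suits data whose properties change over time.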