This section shows how to read in an existing set of commands for a backprop program and then use the menus to run the training. Using the menus and buttons is usually easier; however, some people will prefer typing the commands once they know them.
Normally you will have the commands you need for a problem in a short file. This file can be created by typing commands into an editor, or you can go straight into the program, make the network, select parameters, give the pattern files, and then let the program save all of this to a file that you use next time. The training data should be ready in one file, and if there is test set data it should be ready in another file. The following example uses the command file xor.bp and the data file xor.dat, both included with the package. The xor.bp file looks like:
    * input file for the xor problem
    m 2 1 1 x      * make a 2-1-1 network with extra input-output connections
    s 7            * seed the random number function
    ci             * clear and initialize the network with random weights
    rt xor.dat     * read training patterns into memory from xor.dat
    e 0.5          * set eta, the learning rate, to 0.5 (and eta2 to 0.5)
    a 0.9          * set alpha, the momentum, to 0.9
The data file xor.dat looks like:
    1 0 1
    0 0 0
    0 1 1
    1 1 0
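Each line of the data file is one training pattern: the input values followed by the target value (here, the xor of the two inputs). As an illustration only (this is not code from the package, and the function name is my own), a pattern file of this shape could be split into (inputs, target) pairs like this:

```python
# Hypothetical sketch: split a pattern file such as xor.dat into
# (inputs, targets) pairs for a network with n_inputs and n_outputs.
def read_patterns(text, n_inputs, n_outputs):
    values = [float(tok) for tok in text.split()]
    size = n_inputs + n_outputs
    patterns = []
    for i in range(0, len(values), size):
        chunk = values[i:i + size]
        patterns.append((chunk[:n_inputs], chunk[n_inputs:]))
    return patterns

xor_dat = """\
1 0 1
0 0 0
0 1 1
1 1 0
"""
patterns = read_patterns(xor_dat, 2, 1)   # four patterns, 2 inputs + 1 target
```

With the xor.dat contents this yields the four xor patterns, e.g. the first is inputs (1, 0) with target 1.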
One way to start the Windows program is to type in a command to run the program and load any command files afterward. The other way is to include the command file on the command line, so to do the xor problem defined in xor.bp enter:
The following should then show up at the bottom of the client area of the bp program:
    i xor.bp
    seed = 7  range = -1.00 to +1.00
    4 training patterns read
The menu bar has a number of different types of commands. There is the usual File entry and a GUI entry for setting things like colors and fonts, followed by a series of one-letter labels on buttons. The letters and their meanings are:
    A: Algorithm Parameters and Tolerance
    D: Delta-Bar-Delta Parameters
    F: Formats
    G: Gradient Descent (plain backprop)
    I: Input Commands, Change Input Formats
    N: Making a Network, Listing Network Values
    O: Output Commands, Change Output Formats
    P: Reading Patterns, Evaluating Patterns
    Q: Quickprop Parameters
    T: Training Commands
    W: Weights Commands

Finally there is a series of buttons that are shortcuts for some of the most commonly used commands.
To train the network so that it learns the xor problem, select T. In the T window you will find a line that reads:
    Run 100 iterations and print every 10

If you now click the Run button the training will start and the T window will be destroyed. The program also writes "r" at the bottom of the canvas area; "r" is the command you could type into the entry box at the bottom of the main window. The "r" on the second row of the menu bar also stands for run. Following the "r" line in the canvas area you get:
    running . . .
     10   0.00 %  0.49947
     20   0.00 %  0.49798
     30   0.00 %  0.48713
     40   0.00 %  0.37061
     50   0.00 %  0.15681
     59 100.00 %  0.07121
    DONE

The program immediately prints the "running . . ." message. After every 10 iterations a summary of the learning process is printed, giving the percentage of patterns that are right and the average of the absolute values of the errors of the output units. The program stops when each output for each pattern has been learned to within the required tolerance, in this case the default value of 0.1.
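For readers who want to see the underlying algorithm, here is a minimal Python sketch (my own code, not the package's) of what "r" is doing: a 2-1-1 network with extra input-output connections trained by plain backprop, using the same settings as above (eta = 0.5, alpha = 0.9, weights drawn from -1.00 to +1.00, tolerance 0.1). Python's random generator differs from the program's, so the iteration count will not match the listing.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

patterns = [([1, 0], [1]), ([0, 0], [0]), ([0, 1], [1]), ([1, 1], [0])]
eta, alpha, tol = 0.5, 0.9, 0.1

rng = random.Random(7)                 # Python's generator, not the program's
w = [rng.uniform(-1.0, 1.0) for _ in range(7)]   # 0-2: into hidden unit,
dw = [0.0] * 7                                   # 3-6: into output unit

def forward(x):
    h = sigmoid(w[0] * x[0] + w[1] * x[1] + w[2])             # hidden unit
    o = sigmoid(w[3] * x[0] + w[4] * x[1] + w[5] * h + w[6])  # output unit
    return h, o

iterations = 0
for iterations in range(1, 10001):
    grad = [0.0] * 7
    for x, t in patterns:                 # accumulate gradient over all patterns
        h, o = forward(x)
        do = (t[0] - o) * o * (1.0 - o)   # output delta
        dh = do * w[5] * h * (1.0 - h)    # hidden delta, back through w[5]
        grad[0] += dh * x[0]; grad[1] += dh * x[1]; grad[2] += dh
        grad[3] += do * x[0]; grad[4] += do * x[1]
        grad[5] += do * h;    grad[6] += do
    for i in range(7):
        dw[i] = eta * grad[i] + alpha * dw[i]   # gradient step plus momentum
        w[i] += dw[i]
    if all(abs(t[0] - forward(x)[1]) < tol for x, t in patterns):
        break                             # every output within tolerance

avg_err = sum(abs(t[0] - forward(x)[1]) for x, t in patterns) / 4.0
```

The stopping test at the bottom of the loop is the same criterion the program uses: training ends only when every output for every pattern is within the tolerance of its target.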
There are many factors that affect the number of iterations needed for a network to converge. For instance, if your random number function doesn't generate the same values as the one from gcc, the number of iterations it takes will be different. To see this you can go to the T menu window and type in a new value for the seed followed by a carriage return. Then select the clear and initialize button, and then select the "Exit" button at the bottom of the window. If you now select "r" on the menu bar you get a new set of answers. An even simpler way to do this is to let the program come up with a new random number and initialize the weights. To do this select the "sci" button on the second line of the menu bar, which gives:
    s
    ci
    seed = 42, range = -1.00 to +1.00

Or you could type "s" in the entry box ("s" is for seed) followed by a carriage return and the program will come up with a semi-random seed value. Or you could make up your own random number and type it in:
    s 75

If you're typing you also need to type "ci" for "clear and initialize". Now just click "r" on the menu bar and you get:
    r
    running . . .
     10   0.00 %  0.49999
     20   0.00 %  0.50000
     30   0.00 %  0.49999
     40   0.00 %  0.49999
     50   0.00 %  0.49993
     60   0.00 %  0.49972
     70   0.00 %  0.49777
     80   0.00 %  0.47344
     90   0.00 %  0.30676
    100  25.00 %  0.12148

Unfortunately this training session went slower than the other one, so click "r" again, giving:
    r
    running . . .
    106 100.00 %  0.07263
    DONE

The initial weights that the seed 7 generated could also be written to a weights file (click "sw" for save weights on the menu bar) from which you can read them in again (click "rw" for read weights). The weights for the 7 seed example were:
    0r
    m 2 1 1 x
    aahs aos
    bh 1.000000
    bo 1.000000
    Dh 1.000000
    Do 1.000000
    file = ../xor.bp
     2.992554e-001 1    1 1 to 2 1
    -6.919556e-001 1    1 2 to 2 1
     1.026917e-001 2    2 b to 2 1
     9.701538e-001 1    1 1 to 3 1
    -3.330383e-001 1    1 2 to 3 1
    -7.740173e-001 1    2 1 to 3 1
     7.896118e-001 2    3 b to 3 1

There is a copy of these values in the file xor.ini that came with the package. You can read these in by going to the W (Weights) menu window, clicking the button that lists the files, and then double-clicking the file name xor.ini. If you train with these weights the odds are overwhelming that you will get exactly the results shown above; however, in larger problems other factors like the machine architecture, the type of the weights (float or double), or optimizations by the compiler will produce slight differences in the results.
In the weight file the first column gives the initial weight values and the second column gives codes for weights being used or not used (positive values are used). The remainder of the line identifies the weight in question; the first weight is the one that runs from layer 1 unit 1 to layer 2 unit 1. A `b' for the unit number indicates the bias unit.
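To make the line format concrete, here is a small sketch (the function and field names are my own, not part of the package) that decodes one line of such a weight listing according to the description above:

```python
# Decode one line of the weight listing: value, in-use code, then
# "source-layer source-unit to target-layer target-unit".
def parse_weight_line(line):
    value, code, src_layer, src_unit, _, dst_layer, dst_unit = line.split()
    return {
        "value": float(value),
        "in_use": int(code) > 0,        # positive codes mean the weight is used
        "from": (src_layer, src_unit),  # the unit may be 'b', the bias unit
        "to": (dst_layer, dst_unit),
    }

# The bias weight line from the listing above:
rec = parse_weight_line("1.026917e-001 2    2 b to 2 1")
```

Applied to the third line of the listing, this gives value 0.1026917, an in-use code of 2 (used), running from the bias unit to layer 2 unit 1.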
Backprop never produces the exact result for any of its training patterns; it will just get close to them, hopefully close enough to be of some use. In the xor example the desired answers are 0 and 1, and the program stopped when each output got to within 0.1 of these target values. To see exactly what it got for each pattern you can go to the second row of the menu bar and click the "p" (for patterns) button, and you get the following for the 7 seed value:
    >p
      1  0.903  e 0.097  ok
      2  0.050  e 0.050  ok
      3  0.935  e 0.065  ok
      4  0.072  e 0.072  ok
     59 (TOL) 100.00 % (4 right 0 wrong) 0.07121 err/unit
The first column is the pattern number and the second column is the actual value of the output; the numbers after the `e' give the sum of the absolute values of the output errors for each pattern. An `ok' is given to every pattern that has been learned to within the required tolerance. Another way to list the patterns is to go to the P (for Patterns) menu window and select the right button, and of course you could always type "p" in the main window's entry box.
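The listing and summary line are simple arithmetic over the outputs and targets. As a check (using the rounded printed outputs, so the average differs slightly from the 0.07121 in the listing), the calculation can be reproduced like this:

```python
# Recompute the pattern listing from the printed outputs above.
# Targets are the xor values; tolerance is the default 0.1.
outputs = [0.903, 0.050, 0.935, 0.072]
targets = [1, 0, 1, 0]
tol = 0.1

errors = [abs(t - o) for o, t in zip(outputs, targets)]   # the "e" column
ok = [e < tol for e in errors]                            # the "ok" marks
right = sum(ok)                                           # "4 right 0 wrong"
avg_err = sum(errors) / len(errors)                       # "err/unit"
```

All four errors come out below 0.1, so all four patterns are marked ok, and the average error per unit is about 0.071.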
To get the status of a single pattern, say, the fourth pattern, you have to go to the P menu window, then in the line with the label "Print Training Set" click the "One" button. An entry box comes up where you type in 4 followed by a carriage return, giving the following result on the screen:
    p4
      4  0.072  e 0.072  ok
To get a summary without the complete listing select "Summary" button from the "Print Training Patterns" line in the P menu window. Likewise to get the target value for a particular pattern in the training set select the "Target" button.
To get the network to compute a value for some other input values you can simply type the values into the entry box at the bottom; when the program finds numbers it interprets this as a command to evaluate the network with these numbers. You can also go to the P (Patterns) menu window and click the "Type in and Evaluate a Pattern" button. Normally the inputs to the xor problem are only 0 and 1, but the network will compute an output for any pair of real values, say (0.5 0.5). Typing this into the entry box gives:
    0.5 0.5
    0.060
There is generally no good reason to see the weights; however, people sometimes want to see them anyway. To see a listing of all the weights you need to save the weights to a file and then look at the file. To see the weights that lead into a single node, click the "List Weights into a Node" button. An entry box comes up and asks for the layer number, and then another entry box comes up and asks for the unit. The listing you get looks like:
    >w 3 1
    layer  unit  unit value  in use     weight  in use  input from unit
        1     1     1.00000       1    5.38258       1          5.38258
        1     2     1.00000       1   -4.86238       1         -4.86238
        2     1     0.99245       1  -10.86713       1        -10.78510
        3     b     1.00000       1    7.71563       2          7.71563
                                                 sum =         -2.54928
This listing also gives data on how the current activation value of the node is computed using the weights and the activation values of the nodes feeding into unit 1 of layer 3. This is for the last pattern the network saw; if you're interested in one particular pattern, have the network evaluate that pattern before asking about the weights. In the listing the `b' unit is the bias (also called the threshold) unit. The last column gives the result of multiplying the unit's value by the weight, and below the last column is the sum of these inputs.
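The arithmetic in the listing can be checked directly: multiply each feeding unit's value by its weight, sum the products, and squash the sum (assuming the standard logistic activation function, which the match with the printed output suggests):

```python
import math

# (unit value, weight) for each connection into layer 3 unit 1,
# taken from the "w 3 1" listing; the last entry is the bias unit b.
inputs = [
    (1.00000,   5.38258),   # layer 1 unit 1
    (1.00000,  -4.86238),   # layer 1 unit 2
    (0.99245, -10.86713),   # layer 2 unit 1 (the hidden unit)
    (1.00000,   7.71563),   # bias unit b
]
total = sum(v * w for v, w in inputs)          # the "sum =" line
activation = 1.0 / (1.0 + math.exp(-total))    # logistic squashing function
```

The sum comes out to about -2.549, matching the listing, and the squashed value is about 0.072, matching the output printed for pattern 4 (inputs 1 1) earlier.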
To end the program you can go to the File menu entry and select "Save and Exit" or "Quit, No Save". In this example there is nothing worth saving, so "Quit, No Save" is appropriate. When you select "Save and Exit" the program asks for a file name to save the parameters to; this can be a simple name like "saved", although using an extension like .bp is often worthwhile. Note that the file created is packed with ALL the parameter values the program has, not just the few values that the original xor.bp file had.