Professional Version Basis of AI Backprop Hypertext Documentation

Copyright (c) 1990-97 by Donald R. Tveter

Weights

This menu window deals with saving, reading and listing weights.

Weight Decay

One way to improve generalization is to use weight decay. In this procedure each weight in the network is decreased by a small factor at the end of each training cycle. If the weight is w and the weight decay term is 0.0001 then the decrease is given by:

   w = w - 0.0001 * w
Reasonable values to try for weight decay are 0.001 or less. There is one report that the best time to start weight decay is when the network reaches a minimum on the test set, but it is difficult to decide when the network has reached that minimum. There is no automatic way of turning on weight decay at a certain point, although it's one of those things I should add. The typed command is:

a wd 0.0005
to set the weight decay factor to 0.0005.
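
As a rough illustration, here is a minimal Python sketch of the decay step applied to a whole weight matrix at the end of a training cycle (the array and its shape are made up for the example; this is not the program's internal code):

   import numpy as np

   decay = 0.0005                        # weight decay factor, as set by: a wd 0.0005
   weights = np.random.uniform(-1.0, 1.0, size=(4, 3))   # example weight matrix

   # At the end of each training cycle every weight shrinks slightly:
   weights -= decay * weights            # w = w - 0.0005 * w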

Turn Weight On/Turn Weight Off

Each button will bring up a series of 4 entry boxes and in each box you have to type in either the layer number or the unit number for each node necessary to specify the weight you want to turn on or off. The order of the boxes is: source layer, source unit, destination layer, destination unit.

The typed version of the command to turn on the weight from layer 1 unit 2 to layer 3 unit 4 is:

onw 1 2 3 4
Likewise, to turn the weight off the command is:

ofw 1 2 3 4
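
Turning a weight off marks it as not in use rather than deleting it. A minimal Python sketch of the idea, with a hypothetical boolean flag array standing in for the program's in use markers:

   import numpy as np

   weights = np.random.uniform(-1.0, 1.0, size=(4, 3))  # example weight matrix
   in_use = np.ones_like(weights, dtype=bool)           # hypothetical in-use flags

   in_use[1, 2] = False   # like ofw: the weight is skipped in the forward pass
   in_use[1, 2] = True    # like onw: the weight takes part again

   # A forward pass would only use the weights still marked in use:
   effective_weights = np.where(in_use, weights, 0.0)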

Add Weight

The button will bring up a series of 4 entry boxes and in each box you have to type in either the layer number or the unit number for each node necessary to specify the weight you want to add. The order of the boxes is: source layer, source unit, destination layer, destination unit.

The typed version of the command to add the weight from layer 1 unit 2 to layer 3 unit 4 is:

ac 1 2 3 4  * ac is for add connection

Prune Weights Less Than

If you enter a value here the program will prune away weights whose magnitude is smaller than the value you enter by marking them as not in use; in effect you get to turn off all the small weights. If you click the button it uses the current entry box value and goes through the network turning off those weights. The program only searches for too-small weights at the moment you issue the command, not continuously while the training process is running. Pruning might be of value in finding a network with fewer weights; this matters because you always want to minimize the number of weights in order to improve generalization. I don't guarantee that this is the best way to do weight pruning; there are many other weight pruning techniques that may be better.

The typed version of the command to prune weights less than 5 is:

pw 5
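
A minimal sketch of the pruning step in Python, assuming magnitude-based pruning and the same kind of hypothetical in-use flags as above:

   import numpy as np

   weights = np.random.uniform(-1.0, 1.0, size=(4, 3))  # example weight matrix
   in_use = np.ones_like(weights, dtype=bool)           # hypothetical in-use flags

   threshold = 0.1                          # like: pw 0.1
   in_use &= np.abs(weights) >= threshold   # turn off all the small weights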

Giving the Network a Kick

From time to time a network will get stuck with a poor solution. This is especially obvious in problems like xor, where most of the patterns will be correct but others will give extremely bad results. One way to deal with this is to radically alter the weights and then resume training. There is a kick command that does this: weights greater than a certain value are decreased by a random amount and weights less than minus that value are increased by a random amount. The typed version of the command to go through the network and alter weights with a magnitude greater than 4 by using a random number between 0 and 2 is:

k 4 2
This means that if a weight is greater than 4 it will be decreased by a random value between 0 and 2, while if it is less than -4 it will be increased by a random value between 0 and 2. The 4 is called the kick range and the 2 is called the kick size, and there are entry boxes where you can set these values. You need to specifically select the "Apply the Kick" button to actually alter the weights. In the W95 version the kick size and kick range will be changed even without selecting the "Apply the Kick" button, whereas in the UNIX/Tcl/Tk version these values only get set when the button is pressed.
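
A sketch of the kick in Python, under the assumption that each adjustment is drawn uniformly from 0 to the kick size (the weight matrix is made up for the example):

   import numpy as np

   weights = np.random.uniform(-6.0, 6.0, size=(4, 3))  # example weight matrix
   kick_range, kick_size = 4.0, 2.0                     # like: k 4 2

   big = weights > kick_range              # weights greater than the kick range
   weights[big] -= np.random.uniform(0.0, kick_size, size=big.sum())

   small = weights < -kick_range           # weights less than minus the kick range
   weights[small] += np.random.uniform(0.0, kick_size, size=small.sum())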

Weights File Format

Weights can be saved in several formats:

   r   saves the weights in an ASCII file
   R   saves the weights and other weight parameters in an ASCII file
   b   saves the weights in a binary format
   B   saves the weights and other weight parameters in a binary format

The commands are:

f w r
f w R
f w b
f w B

The great virtue of the binary formats is that you can reload EXACTLY the same values; writing values out as ASCII is liable to change them slightly.

IF YOU'RE GOING TO RESTART A PROBLEM FROM EXACTLY ITS CURRENT STATE AT A LATER DATE YOU MUST USE THE R OR B FORMAT, NOT THE r OR b FORMAT, because the extra weight parameters are necessary for the smooth functioning of the training algorithms.
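
The ASCII problem is ordinary rounding: a weight printed with a fixed number of digits usually cannot be read back in exactly. A small Python demonstration (the five digits are just an example, not what the program actually writes):

   w = 0.1 + 0.2              # a weight value: 0.30000000000000004 in binary
   text = "%.5f" % w          # written to an ASCII file as "0.30000"
   w_back = float(text)       # read back in as exactly 0.3
   print(w == w_back)         # False: the round trip changed the value slightly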

Save Weights To:

You can click the button to save weights to the file named in the entry box or type a file name into the entry box and end with a carriage return. The typed version of the command to save weights to the file xor.wts is:

sw xor.wts
Note that this sets the current weights file name to xor.wts so if you save weights now with just the typed command:

sw
the file written to will be xor.wts. And if you simply type in:

rw
weights will be read from xor.wts.

Save Weights Every

If you fill in the entry box the program will write the weights to the weights file at the rate you specify. Initially this rate is so large that it's unlikely that weights will ever be saved. The typed command version to save weights every 100 iterations is:

swe 100
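
The schedule is just a test on the iteration counter; a trivial Python sketch with made-up names:

   save_every = 100                        # like: swe 100

   for iteration in range(1, 501):
       # ... one training iteration would run here ...
       if iteration % save_every == 0:     # every 100 iterations
           print("saving weights at iteration", iteration)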

Save Weights Every Minimum

The program will save the weights whenever the status of the TEST SET is checked (the print rate in the r command) and the error per output unit is less than the previous error. This enables you to catch the best network as the training proceeds. If the number weights file option is on you will get a lot of files, each with a different number. The typed command is:

swem
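
The logic amounts to remembering the best test-set error seen so far and saving whenever it improves; a hypothetical Python sketch:

   best_error = float("inf")               # best test-set error seen so far

   def save_weights():                     # stand-in for writing the weights file
       print("weights saved")

   def check_test_set(error_per_output_unit):   # runs at each test-set check
       global best_error
       if error_per_output_unit < best_error:   # a new minimum on the test set
           best_error = error_per_output_unit
           save_weights()

   for e in [0.40, 0.30, 0.35, 0.20]:      # example sequence of test-set errors
       check_test_set(e)                   # saves at 0.40, 0.30 and 0.20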

Read Weights From:

You can click the button to read weights from the file named in the entry box or type a file name into the entry box and end with a carriage return. The typed version to read weights from the file weights is:

rw weights
Again note that this sets the weights file name to weights so you could use the typed command:

rw
to re-read the file. Likewise you could save weights with:

sw
and this rewrites the file weights.

Read Weights, List Files

This button lists the files in the current directory. In the UNIX/Tcl/Tk version you double-click the file of your choice; in the W95 version you click the file name and then the "Read the File" button.

List Weights Leading into a Node

To see the weights that lead into a single node click the "List Weights into a Node" button. An entry box comes up asking for the layer number, where you enter which row of the network the unit is in (the input layer is 1), and then another entry box comes up asking for the unit number, counting from the left. For example, to see the weights leading into unit 1 in layer 3, enter 3 and then 1; this gives the listing:

w 3 1
layer unit unit value  in use    weight   in use  input from unit

  1    1      1.00000    1      -4.97175    1         -4.97175
  1    2      1.00000    1       4.74848    1          4.74848
  2    1      0.00979    1      10.75158    1          0.10524
  3    b      1.00000    1      -2.50724    2         -2.50724
                                           sum =      -2.62526

This listing also shows how the current activation value of the node is computed using the weights and the activation values of the nodes feeding into unit 1 of layer 3. The listing is for the last pattern the network saw; if you're interested in one particular pattern, have the network evaluate that pattern before asking about the weights. The first in use column indicates whether or not the unit is in use (1 or 0); the second in use column is for the weight (1 is in use, 2 is a bias unit weight, 0 is not in use, -1 is a frozen weight). In the listing the `b' unit is the bias (also called the threshold) unit. The last column gives the result of multiplying the unit value and the weight, and below the last column is the sum of these inputs.
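
The sum at the bottom of the listing is just the weighted sum of the incoming activations plus the bias weight. Checking the numbers from the listing above in Python:

   # unit values and weights copied from the listing above
   inputs  = [1.00000, 1.00000, 0.00979]   # activations feeding into layer 3 unit 1
   weights = [-4.97175, 4.74848, 10.75158]
   bias    = -2.50724                      # the weight from the `b' (bias) unit

   net = sum(x * w for x, w in zip(inputs, weights)) + bias
   print(net)                              # about -2.62526, matching the listing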