Professional Basis of AI Backprop Hypertext Documentation

Copyright (c) 1990-97 by Donald R. Tveter

Rprop

Overview

Rprop is short for resilient backprop, a variation on backprop created by Martin Riedmiller and Heinrich Braun and described in three online articles. Rprop is much like delta-bar-delta, except that the size of the weight change for each weight does not depend on the size of the slope at the time; it depends only on the direction (the sign) of the slope. Unlike the other accelerated training algorithms, the learning rate does not need to be adjusted for the number of patterns used. Every weight change starts at a small initial value and is then adjusted up or down automatically as the training proceeds. It appears to be one of the very best training algorithms because the default parameters seem to me to produce the fastest training for more problems than any other algorithm. On the other hand, I know of one tiny case where it fails very badly, so I would still advise you to try everything in order to get the best training times and best test set results.
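To see why the number of patterns does not matter, here is a minimal sketch comparing a plain backprop weight change with a sign-only change of the kind Rprop uses. It assumes the slope is the derivative of the error with respect to the weight, summed over all patterns. The function names, the per-pattern slope of 0.02 and the learning rate of 0.5 are made up for illustration and are not taken from this program.

```python
def sign(x):
    """Return -1, 0 or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def backprop_change(slope, learning_rate):
    """Plain gradient descent: the change is proportional to the slope."""
    return -learning_rate * slope


def rprop_change(slope, step_size):
    """Sign-only change: the size of the slope is ignored."""
    return -step_size * sign(slope)


# Hypothetical slope contributed by each pattern for one weight.
per_pattern_slope = 0.02

for patterns in (10, 1000):
    slope = per_pattern_slope * patterns  # summed over all patterns
    print(patterns,
          backprop_change(slope, learning_rate=0.5),
          rprop_change(slope, step_size=0.1))
```

The plain backprop change grows with the number of patterns unless the learning rate is scaled down to compensate, while the sign-only change is the same for 10 patterns as for 1000.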

Parameters

The acceleration parameter, a,

magnifies the size of the previous weight change by the given amount when the slope keeps the same sign. The recommended default value is 1.2; however, sometimes better results can be obtained with a different value.

The decay parameter, d,

shrinks the size of the next weight change by multiplying the previous change by the given amount when the slope changes sign. The recommended default value is 0.5, so the weight change is cut in half. Other values may produce better results.

The initial value for the weight change, i

The recommended default value is 0.1. Larger values may sometimes be better.

The maximum allowed weight change, M

There is no recommended value for this parameter, so it is initialized to the fairly large value of 30. In most cases the weight changes will never get anywhere near this large, but weight changes this large will probably wreck the training. If the training is going badly, you should try limiting the maximum allowed weight change to a smaller value, say 5 or 1 or less.

The minimum allowed weight change, m

The default value is 0.000001 for the floating point version and 0.001 for the integer version. Other values may produce better results.
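The parameters above fit together roughly as in the following sketch of one update for a single weight. It is a simplified variant of Rprop without weight backtracking and is not necessarily what this program does internally. The function name rprop_step and the one-weight error function are hypothetical; the defaults are the floating point defaults listed above.

```python
def sign(x):
    """Return -1, 0 or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def rprop_step(weight, slope, prev_slope, step,
               a=1.2, d=0.5, M=30.0, m=0.000001):
    """One simplified Rprop update for a single weight.

    Returns the new weight and the new step size.
    """
    agreement = slope * prev_slope
    if agreement > 0:
        # Same direction as last time: accelerate, but never beyond M.
        step = min(step * a, M)
    elif agreement < 0:
        # Direction flipped (a minimum was overshot): decay, but never below m.
        step = max(step * d, m)
    # Move against the slope by the step size; the size of the slope is ignored.
    weight -= sign(slope) * step
    return weight, step


# Hypothetical one-weight problem: error = (w - 3)^2, so slope = 2 * (w - 3).
w, prev_slope, step = 0.0, 0.0, 0.1   # step starts at the initial value i
for _ in range(100):
    slope = 2.0 * (w - 3.0)
    w, step = rprop_step(w, slope, prev_slope, step)
    prev_slope = slope
print(w)  # close to the minimum at w = 3
```

With these defaults, a step that starts at i = 0.1 needs 32 consecutive accelerations by a = 1.2 before it reaches M = 30, which fits the remark that the weight changes rarely get anywhere near the maximum.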

Use

All the parameters can be listed on one line as in:

rp a 1.2 d 0.5 i 0.1 M 30 m 0.000001

References

Three PostScript papers are available online from the University of Karlsruhe, Germany.