“Convergence of Backward-Error-Propagation Learning in Photorefractive Crystals”

by Gregory C. Petrisor

December 1996

We analytically determine that the backward-error-propagation learning algorithm has a well-defined region of convergence in neural learning-parameter space for two classes of photorefractive-based optical neural network architectures. The first class uses electric field amplitude encoding of signals and weights in a fully coherent system, whereas the second class uses intensity encoding of signals and weights in an incoherent/coherent system. Under typical assumptions on the grating formation in photorefractive materials used in adaptive optical interconnections, we compute weight updates for both classes of architecture. Using these weight updates, we derive a set of conditions that are sufficient for such a network to operate within the region of convergence. The results are verified empirically by simulations of the XOR sample problem. The computed weight updates for both classes of architecture contain two neural learning parameters: a learning rate coefficient and a weight decay coefficient. We show that these learning parameters are directly related to two important design parameters: system gain and exposure energy. The system gain determines the ratio of the learning rate parameter to the decay rate parameter, and the exposure energy determines the size of the decay rate parameter. We conclude that convergence is guaranteed (assuming no spurious local minima in the error function) by using a sufficiently high gain and a sufficiently low exposure energy.
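As an illustration of the two learning parameters named above, the following is a minimal sketch of a generic backward-error-propagation weight update with weight decay. It is not the update derived in the thesis: the function name `decayed_update`, the `rate_ratio` parameterization (learning rate written as a ratio times the decay rate), and all numeric values are assumptions, chosen only to mirror the statement that the system gain sets the ratio and the exposure energy sets the decay rate.

```python
def decayed_update(weights, inputs, deltas, rate_ratio, decay_rate):
    """One generic weight update with decay (hypothetical sketch).

    weights:    list of rows, weights[i][j] connects input j to unit i
    inputs:     activations x_j feeding the layer
    deltas:     back-propagated error terms delta_i for the layer
    rate_ratio: assumed ratio of learning rate to decay rate
    decay_rate: assumed weight decay coefficient
    """
    learning_rate = rate_ratio * decay_rate
    return [
        [
            # w_ij <- w_ij + learning_rate * delta_i * x_j - decay_rate * w_ij
            w_ij + learning_rate * d_i * x_j - decay_rate * w_ij
            for w_ij, x_j in zip(row, inputs)
        ]
        for row, d_i in zip(weights, deltas)
    ]


if __name__ == "__main__":
    # Illustrative values only; none of these come from the thesis.
    new_weights = decayed_update(
        weights=[[1.0, 2.0]],
        inputs=[1.0, 0.0],
        deltas=[0.5],
        rate_ratio=2.0,
        decay_rate=0.1,
    )
    print(new_weights)
```

In this form, a weight whose input is inactive simply shrinks by the factor `1 - decay_rate` at each step, while the error-driven term is scaled by the product of the ratio and the decay rate.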

The system gain is composed of an optical component and an electronic component. We show that the optical component of the system gain is directly proportional to the exposure energy, and therefore decreasing the exposure energy for constant system gain corresponds to increasing the electronic component of the system gain. We show that this energy decrease leads to noise on the effective output of the detectors (from the signal itself for “ideal” detectors, or from the signal and the detection system for real-world detectors), which in neural space corresponds to a variance of the neural-space signals. We compute the optimum ratio of the exposure energy used during forward and backward propagation to the exposure energy used during the weight updates, under the assumption that these two exposure energies are constant throughout learning. This maximizes the energy incident on the detectors for given neural-space signals, thereby minimizing the effects of detection noise. We derive the variance of the neural-space signals for both classes of architecture. This variance is a function of the following: the electron-to-neural-space gain of the detection system, the neural-space bias signals, and the effective noise of the detection system. Using simulation results of the XOR sample problem, we show that for sufficiently large neural-space signal variance the network will not converge, irrespective of the status of the derived convergence condition. This is because the convergence condition is derived for the noise-free case and therefore does not account for the detrimental effects of detection noise on convergence. We conclude that for a given optical system it may not be possible to simultaneously satisfy the convergence condition while maintaining neural-space signal variances that are sufficiently small to allow the learning algorithm to converge.
Based on this conclusion, we extrapolate the simulation results of the XOR sample problem to estimate the maximum size network that can be implemented in a 1.5 cm x 1.5 cm Fe:LiNbO3 photorefractive crystal. For the class of architecture that uses electric field amplitude encoding of signals and weights, we show that for problem types in which the convergence condition scales inversely with network size, the maximum size network that can be implemented in this Fe:LiNbO3 crystal is on the order of the crystal's usable storage capacity.
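The trade-off between exposure energy and detection noise described above can be illustrated with a standard shot-noise model. The sketch below is an assumption, not the variance expression derived in the thesis: it treats the detected electron count as Poisson-distributed (variance equal to the mean), adds an independent detector noise term in electrons, and omits the neural-space bias signals. The names `neural_signal_variance` and `variance_after_energy_cut` and all numeric values are hypothetical.

```python
def neural_signal_variance(mean_electrons, gain, detector_noise_electrons=0.0):
    """Variance of a neural-space signal under an assumed shot-noise model.

    mean_electrons:           mean number of detected electrons
    gain:                     assumed electron-to-neural-space gain
    detector_noise_electrons: assumed RMS detector noise, in electrons
    """
    # Poisson shot noise: variance (in electrons^2) equals the mean count.
    # Independent detector noise adds its own variance.
    electron_variance = mean_electrons + detector_noise_electrons ** 2
    # A linear gain scales the variance by gain squared.
    return gain ** 2 * electron_variance


def variance_after_energy_cut(mean_electrons, gain, factor,
                              detector_noise_electrons=0.0):
    """Variance when the energy is cut by `factor` and the electronic gain
    is raised by the same factor to keep the mean neural signal fixed."""
    return neural_signal_variance(
        mean_electrons / factor, gain * factor, detector_noise_electrons
    )


if __name__ == "__main__":
    # Illustrative values only; none of these come from the thesis.
    base = neural_signal_variance(10000, 0.001)
    reduced = variance_after_energy_cut(10000, 0.001, factor=4)
    print(base, reduced)
```

Under this model, with an ideal detector, cutting the detected energy by some factor while holding the mean neural-space signal fixed multiplies the signal variance by that same factor, which is consistent in spirit with the conclusion that low exposure energy and low signal variance compete.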

For the class of architecture that uses electric field amplitude encoding of signals and weights, we derive the scaling of the sufficient condition for convergence as a function of the spatial light modulator contrast ratio. We present simulation results that support this derived result. We also present simulation results for the class of architecture that uses intensity encoding of signals and weights. The simulation results for this class of architecture indicate that the sufficient condition for convergence is only slightly affected by the spatial light modulator contrast ratio.

We present a novel network, together with a variant of the backward-error-propagation learning algorithm for this network, that can effectively implement “bipolar” output neuron units with unipolar neuron activation functions. We show that this network does not have the disadvantages associated with standard implementations of bipolar output neuron units in systems in which the physical-space signals and weights are constrained to be unipolar, a typical constraint in optical systems.

We present an experimental system that can be used to evaluate photorefractive-based optical interconnections consistent with both types of optical architecture analyzed in this thesis. Preliminary experimental results from this evaluation system are presented.