Neural Networks
Neural Networks (NN), or more accurately Artificial Neural Networks (ANN), are an attempt to simulate the physical structure of the brain in software in order to create a system that can learn. To explain them, we will first take a quick lesson in the biology (and physiology) of the brain so we know what we are trying to duplicate.
The Blueprint - how it fits together
First we must know what a biological neural network is, in order to make an artificial one.
The main structural parts of a neuron are the soma, dendrites and axon. The soma and dendrites make up the main body of the neuron and the axon is the long structure that connects a neuron with other neurons. The point at which a neuron's axon connects with another neuron (at the soma or dendrite) is called a synapse. The neuron has many other structures, as do all cells (nucleus, cytoplasm, etc.), but we can ignore those for this discussion. [1]
While a neuron may connect to many other neurons, it still only has one axon leaving the soma. It isn't until the axon gets away from the body of the neuron that it splits and forms collateral axons. [1]
Neurons form circuits by connecting to other neurons with their axons. Circuits differ in function and form, but the most common forms are hierarchical and single-source/divergent. Hierarchical circuits are typified by many neurons sending signals to a single neuron, which then passes the signal on to a single neuron, and so on up (or down) the chain. Often at the other end of the chain a divergence will happen and many neurons will get notified. For instance, the motor system is arranged this way. Neurons in the brain converge towards specific motor cells in the spinal cord, which ultimately communicate with many sets of muscles at the other end. [1]
The single-source/divergent pattern happens when a single neuron has many connections. When it gets the impulse to fire it does so to many neurons. This allows a neuron to influence a great range of functions, bridging sensory, motor and other systems. It is able to integrate across the boundaries and speak directly to multiple hierarchical systems. [1]
The Model - how it works
Now that we know how they fit together we can discuss how they communicate with each other.
As described above, neurons are connected via axons at synapses. It is at the synapses where the communication occurs. The communication is triggered by the release of chemicals (neurotransmitters) from the axon of the presynaptic neuron and the subsequent reception of those chemicals at receptor sites on the soma or dendrite of the postsynaptic neuron. The message passed can be excitatory or inhibitory. An excitatory message pushes the postsynaptic neuron to fire and pass the message on. An inhibitory message keeps the neuron from firing. [1]
A single neuron can have connections from many other neurons and it may actually take excitatory messages from more than one to cause it to fire. There may be inhibitory messages coming from some neurons and excitatory messages from others. In biological neurons these interactions all deal with how much and what type of chemicals are released and received at the synapses. [1]
Once a postsynaptic neuron reaches a threshold value of neurotransmitters (the chemicals transferred at the synapses), an action potential is created that travels through the neuron and down the axon, causing the neuron to fire its own neurotransmitters. It thereby switches to being a presynaptic neuron at the synapses of other neurons. Neurons can have axons that terminate at a single neuron or they may branch and connect to as many as 10,000 neurons. [1]
The Code - how we do it on computers
Our ANN will work in much the same way as the biological version, without the complications of neurotransmitters and chemical reactions. A neuron will accept inputs from multiple neurons. Those inputs will have weights associated with them. The summation of the inputs multiplied by their associated weight will be called the activation. To get the output the activation is processed by applying it to a formula.
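As a small illustration of the activation step, here is a minimal sketch in Python. The choice of Python, the function name activation and the sample numbers are assumptions made for this example only.

```python
def activation(inputs, weights):
    """Sum of each input multiplied by its associated weight."""
    if len(inputs) != len(weights):
        raise ValueError("each input needs exactly one weight")
    return sum(x * w for x, w in zip(inputs, weights))


# Three inputs arriving from three upstream neurons (made-up values).
inputs = [1.0, 0.0, 1.0]
weights = [0.5, -0.25, 0.25]
print(activation(inputs, weights))  # 1.0*0.5 + 0.0*-0.25 + 1.0*0.25 = 0.75
```

A negative weight plays the part of an inhibitory connection: it pulls the activation down whenever its input is active.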
The formula to process the activation will be either a step function or a sigmoid function. In reality it can be any kind of function you want, but these are the two most common, with the sigmoid being the favorite choice. The sigmoid has the benefit of producing a graded output rather than just on or off, so if the NN is hooked up to something that takes a variable input, such as a power setting, it works well.
With the sigmoid function our neuron always gives an output, and that output varies between 0 and 1. The equation is output = 1 / (1 + e^(-a/p)), where a is the activation value and p controls the shape of the curve. p is normally set to 1.0. As p gets smaller the graph starts to look like a step function. When the activation is negative the output is less than 0.5; a positive activation gives an output above 0.5.
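To see the effect of p, here is a minimal Python sketch of the sigmoid in the form 1 / (1 + e^(-a/p)), which is the form the computeOutput routine later in this article uses. The function name sigmoid and the sample activations are assumptions for illustration.

```python
import math


def sigmoid(a, p=1.0):
    """Logistic output for activation a; p controls the curve's shape."""
    return 1.0 / (1.0 + math.exp(-a / p))


# Compare the normal setting (p = 1.0) with a much smaller p.
for p in (1.0, 0.1):
    row = [round(sigmoid(a, p), 3) for a in (-1.0, 0.0, 1.0)]
    print(p, row)
```

With p = 1.0 the three outputs are roughly 0.269, 0.5 and 0.731. With p = 0.1 the outer two are pushed almost to 0 and 1, which is the step-like behaviour described above. Note that math.exp raises OverflowError for very large arguments, and this sketch does not guard against that.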
Finally, let's put several neurons together to form an actual neural network. Unlike the biological NN we will organize our NN into layers. Each neuron in a layer accepts the entire set of outputs from the previous layer. This kind of architecture is known as a feed-forward network.
Each node takes the previous set of outputs, applies the weights to get the activation, generates an output and passes it on. The last layer generates the output that will be interpreted by the calling program. The outputs can take many forms. They could represent a binary encoding of a number, or they could act as switches (if output1 is set then the response is yes). The specific interpretation depends on the problem domain, of course.
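The layer-by-layer pass can be sketched in a few lines of Python. This is only an illustration under some assumptions: a network is represented as a plain list of layers, each neuron is just its list of weights, the weights are hand-picked rather than learned, and the names feed_forward and as_number are made up for this example.

```python
import math


def sigmoid(a, p=1.0):
    return 1.0 / (1.0 + math.exp(-a / p))


def feed_forward(layers, inputs):
    """Run inputs through each layer in turn.

    layers is a list of layers; each layer is a list of neurons;
    each neuron is its list of weights, one per incoming value.
    """
    values = inputs
    for layer in layers:
        # Every neuron in this layer sees all outputs of the previous layer.
        values = [
            sigmoid(sum(x * w for x, w in zip(values, neuron_weights)))
            for neuron_weights in layer
        ]
    return values


def as_number(outputs):
    """Read outputs as binary digits, most significant first (>= 0.5 is a 1)."""
    number = 0
    for out in outputs:
        number = number * 2 + (1 if out >= 0.5 else 0)
    return number


# Hypothetical hand-picked weights: 2 inputs -> 2 hidden -> 2 outputs.
network = [
    [[1.0, -1.0], [-1.0, 1.0]],  # hidden layer
    [[4.0, -4.0], [-4.0, 4.0]],  # output layer
]
outputs = feed_forward(network, [1.0, 0.0])
print(outputs, as_number(outputs))
```

Here the first output ends up above 0.5 and the second below it, so reading the pair as binary digits gives 2. Treating 0.5 as the cut-off between a 0 and a 1 is a choice made for this sketch; a real application would pick whatever interpretation suits its problem domain.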
Algorithms
Without constructing an entire ADT, let's look at how we would programmatically get to the output of a Neural Network.
The first step is to compute the activation. Each neuron needs to accept the inputs and then apply the weights (stored internally in the neuron) to them.
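computeActivation( [in] inputArray, [out] activationValue ) {
    // Start from zero so a stale value is never accumulated.
    activationValue = 0;
    // Add up each input multiplied by the weight stored for it.
    for ( int i = 0; i < inputArray.size; i++ ) {
        activationValue += inputArray[i] * m_WeightArray[i];
    }
}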
Next the output needs to be generated. I am going to give the ability to choose either a step function or the sigmoid function here.
computeOutput( [in] activationValue, [in] isSigmoidBoolean, [out] outputValue ) {
    if ( isSigmoidBoolean ) {
        // Sigmoid: 1 / (1 + e^(-activation / p)), always between 0 and 1.
        outputValue = 1 / (1 + power(GLOBAL_CONSTANT.E, (-activationValue) / MEMBER_CONSTANT_P));
    }
    else {
        // Step: output 1 only once the activation passes the threshold.
        if ( activationValue > m_ThresholdValue )
            outputValue = 1;
        else
            outputValue = 0;
    }
}
That's basically all the steps for the processing of inputs. Pretty simple really. The complexity enters the equation when it is time to teach the neural network. Initially the weights are set randomly. Only with some refinement will the NN actually do what it is supposed to do.
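For readers who want something they can run, the two routines above and the random starting weights can be sketched together as a small Python class. This is one possible arrangement, not a finished ADT. The class name Neuron, the method names, the default threshold of 0.0 and the initial weight range of -1 to 1 are all assumptions for this example.

```python
import math
import random


class Neuron:
    """One artificial neuron: weights in, a single output value out."""

    def __init__(self, num_inputs, use_sigmoid=True, p=1.0, threshold=0.0, rng=None):
        if rng is None:
            rng = random.Random()
        # Weights start out random; training is what makes them useful.
        # The range [-1, 1] is an assumption, not something the text fixes.
        self.weights = [rng.uniform(-1.0, 1.0) for _ in range(num_inputs)]
        self.use_sigmoid = use_sigmoid
        self.p = p
        self.threshold = threshold

    def compute_activation(self, inputs):
        # Sum of inputs multiplied by their associated weights.
        return sum(x * w for x, w in zip(inputs, self.weights))

    def compute_output(self, activation):
        if self.use_sigmoid:
            return 1.0 / (1.0 + math.exp(-activation / self.p))
        # Step function: all or nothing.
        return 1 if activation > self.threshold else 0

    def fire(self, inputs):
        return self.compute_output(self.compute_activation(inputs))


neuron = Neuron(3, rng=random.Random(42))  # seeded so runs are repeatable
print(neuron.weights)
print(neuron.fire([1.0, 0.5, -1.0]))
```

Because the weights are random, the output of an untrained neuron means nothing yet. Passing a seeded random.Random just makes the example repeatable.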
Making it learn
How does the biological version do it? The process of training or learning changes the characteristics of receptors at the synapses of postsynaptic neurons. This causes the neuron to be more (or less) sensitive to specific neurotransmitters. That in turn means that external stimuli are more (or less) likely to cause a series of reactions in neurons, or in a particular set of neurons, to generate a biological reaction.
Our artificial neurons will learn by changing the weights corresponding to the connections from other neurons. This will simulate the change in sensitivity to the inputs as seen in the biological model.
One way to change the weights in the neurons is through the process of back-propagation. In this process the final output of the NN is compared to the expected output. The error is computed and fed backwards through the NN. At each node the error causes a small change to the weight of each input. The size of the change corresponds to the magnitude of the error.
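To give a feel for "a small change proportional to the error", here is a hedged Python sketch of the weight update for a single sigmoid output neuron. It is not full back-propagation, which also has to pass the error back through earlier layers. It assumes p = 1, a squared-error measure, and a made-up learning rate of 0.5. The names train_step and rate are hypothetical.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def train_step(weights, inputs, target, rate=0.5):
    """Nudge each weight in proportion to the output error."""
    out = sigmoid(sum(x * w for x, w in zip(inputs, weights)))
    error = target - out
    # out * (1 - out) is the slope of the sigmoid at this output.
    delta = error * out * (1.0 - out)
    new_weights = [w + rate * delta * x for w, x in zip(weights, inputs)]
    return new_weights, error


# Repeatedly show the neuron one example whose expected output is 1.0.
weights = [0.1, -0.2]
for _ in range(1000):
    weights, error = train_step(weights, [1.0, 1.0], target=1.0)
print(abs(error))  # shrinks as the weights are adjusted
```

Each pass moves the weights a little in the direction that reduces the error, and a larger error produces a larger move. An input of 0 leaves its weight untouched, since that connection played no part in the result.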
I'm in the process of working up the formulas for back-propagation. It gets a little heavy on the math side and I'm brushing up on my calculus as I do this. If your calculus is up to speed I suggest going over to [3] and checking out the section on training. It's still a little over my head because I haven't done derivatives in about 6 years and have forgotten the concepts behind them. But since I actually want to write one of these up, I am going to figure it out and will post it here in "plain programmer English".
References
[1] Floyd E. Bloom, Arlyne Lazerson, Brain Mind and Behavior 2nd edition, p30-53 p240-269, W. H. Freeman and Company, 1988
[2] Neural Networks in Plain English
[3] Back-Propagation Neural Network Tutorial
[4] Detailed Back Propagation
[5] Image of Neuron
[6] Dan W. Patterson, Introduction to Artificial Intelligence & Expert Systems, p327-380, Prentice Hall, Inc., 1990

