2

Structure of an Artificial Neural Network

A single neuron cannot solve a given problem on its own. So we connect neurons together to form a neural network. Depending on its structure and complexity, such a network can be powerful enough to solve a range of artificial intelligence problems.

The particular class of neural network that we are going to use in this example is called the multilayer feedforward neural network.

Multilayer feedforward Neural Network

A multilayer feedforward neural network, as the name suggests, consists of multiple layers of neurons. That is, the neurons are grouped into different layers: an input layer, an output layer, and one or more hidden layers between the two. This neural network is called a feedforward network because the neurons in each layer pass their signals forward only, to the neurons in the next layer.
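The forward-only flow of signals described above can be sketched as follows. This is a minimal illustration and not the program's actual code: the Layer struct, the function names, the bias term, and the sigmoid activation are all assumptions made for the sketch.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical representation of one fully connected layer:
// weights[j][i] connects input i to neuron j of this layer.
struct Layer {
    std::vector<std::vector<double>> weights;
    std::vector<double> biases;
};

// Assumed activation function; the text does not name one here.
double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// Computes the outputs of one layer from the outputs of the previous layer.
std::vector<double> feedForward(const Layer& layer,
                                const std::vector<double>& in) {
    std::vector<double> out(layer.weights.size());
    for (std::size_t j = 0; j < layer.weights.size(); ++j) {
        assert(layer.weights[j].size() == in.size());
        double sum = layer.biases[j];
        for (std::size_t i = 0; i < in.size(); ++i) {
            sum += layer.weights[j][i] * in[i];
        }
        out[j] = sigmoid(sum);
    }
    return out;
}

// Signals only ever move from one layer to the next, never backwards.
std::vector<double> runNetwork(const std::vector<Layer>& layers,
                               std::vector<double> signal) {
    for (const Layer& layer : layers) {
        signal = feedForward(layer, signal);
    }
    return signal;
}
```

Each call to feedForward uses only the outputs of the previous layer, which is what makes the network feedforward.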

Designing the neural network

Note that the number of neurons in each layer can be different. The number of neurons in each layer and the number of hidden layers are chosen by considering several factors, including the complexity of the problem at hand.

We are going to use a three-layer neural network for our task: one input layer, one hidden layer and one output layer. Here is the number of neurons we are going to put in each layer:

  • 64 neurons in the input layer
  • 256 neurons in the hidden layer
  • 4 neurons in the output layer

As you can see, these numbers indicate that our neural network is a fairly simple one. But how did we reach these numbers? How do we decide the number of hidden layers to use and the number of neurons to put in each layer?

Input layer

How many neurons do we need in the input layer? To answer this question, we need to analyze the task at hand. As you know, we are going to give an image (of a character) as the input to the neural network. To do that, we must read the pixels in the image and give the pixel data as the input. There is no need to send color information: it gives no additional advantage for recognizing a character, and it complicates the processing. So we will give just the on or off values of all the pixels in the image as the input, and we expect the neural network to tell us the character represented in the image.

Take a look at the following input image:

If the pixel value at any point (X,Y) is on (black), we will give an input of 1 to the corresponding neuron. Similarly, if the pixel value is off (white), the input to the corresponding neuron will be 0.

After the image is pre-processed, the size of the image is reduced from 256*256 to 8*8. This means that there are 64 pixels in the input image and so we need 64 neurons in the input layer. Image pre-processing is very important for a variety of reasons and it is discussed in the next chapter.
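As a rough sketch of this mapping, the function below turns an 8*8 on/off bitmap into the 64 input values. The names kSide, kInputs and flattenBitmap are hypothetical, and the row-by-row ordering of the pixels is an assumption; the text does not say in which order the pixels are assigned to the neurons.

```cpp
#include <array>
#include <cassert>

// Hypothetical constants for this sketch: an 8*8 image gives 64 inputs.
constexpr int kSide = 8;
constexpr int kInputs = kSide * kSide;

// Converts a bitmap into the input array of the network.
// pixels[y][x] is true when the pixel at (x, y) is on (black).
// Assumption: pixels are laid out row by row.
std::array<int, kInputs> flattenBitmap(const bool (&pixels)[kSide][kSide]) {
    std::array<int, kInputs> inputs{};
    for (int y = 0; y < kSide; ++y) {
        for (int x = 0; x < kSide; ++x) {
            inputs[y * kSide + x] = pixels[y][x] ? 1 : 0;  // on -> 1, off -> 0
        }
    }
    assert(inputs.size() == 64);
    return inputs;
}
```

Any fixed ordering would work, as long as the same ordering is used for training and for recognition.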

Output layer

What is the output of the output layer of our neural network? We expect it to tell us the character corresponding to the given input image. Each neuron in the output layer will give either 0 or 1 as its output. This means that the output we get from the output layer is a series of 0s and 1s. Assume that we want to recognise just the first 16 characters of the English alphabet. Now the question is: how many bits (0s or 1s) are needed to represent 16 items?

To represent 16 things, we need 4 bits (2^4 = 16). This means that just 4 output neurons are necessary in our case. You can modify this same program to recognise all 26 characters of the English alphabet. For that you need 5 neurons in the output layer, because 4 bits cover only 16 items while 5 bits cover 32 (2^5 = 32).
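The same calculation can be written as a small helper. This is only an illustrative sketch; the function name bitsNeeded is hypothetical and is not part of the program.

```cpp
#include <cassert>

// Returns the smallest number of bits b such that 2^b >= itemCount.
// With one bit per output neuron, this is the number of output neurons
// needed to tell itemCount different characters apart.
int bitsNeeded(int itemCount) {
    assert(itemCount >= 1);
    int bits = 0;
    int capacity = 1;  // number of items representable with 'bits' bits
    while (capacity < itemCount) {
        capacity *= 2;
        ++bits;
    }
    return bits;
}
```

For example, bitsNeeded(16) gives 4 and bitsNeeded(26) gives 5, matching the numbers above.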

Hidden layer(s)

Usually, for fairly simple classification tasks like Optical Character Recognition (OCR), we start with 1 or 2 hidden layers. Experience has shown that adding more hidden layers for this application does not improve the accuracy of the results. One drawback of using more hidden layers is that the training of the neural network may become extremely slow. That is why we are going to use just one hidden layer.

Most of the processing and recognition is done by the neurons in the hidden layer. This means that having a comparatively large number of neurons in the hidden layer will increase the accuracy of the recognition (though this effect ceases to apply beyond a certain limit). If you have to use a very large number of neurons, you should consider splitting them across several hidden layers rather than fitting them all into the same layer.

For our application we are using 256 neurons in the hidden layer.

The final structure of our neural network will look like this:

Each pixel of the input image will be fed to one neuron of the input layer. The output signal from each neuron in the input layer will be sent to the inputs of all the neurons in the hidden layer. Similarly, the output signal from each neuron in the hidden layer will be sent to the inputs of all the neurons in the output layer. The neurons in the output layer will each output one bit of the result. Here is the header file defining our neural network (neuralNetwork.h):

// file: neuralNetwork.h

const int NO_INPUT=64;    // The number of Input neurons
const int NO_HIDDEN=256;  // The number of Hidden neurons
const int NO_OUTPUT=4;    // The number of Output neurons

// The length of one side of the input picture square in pixels.
const int inputSquareSize=8;

// The class that implements the Neural Network
class neuralNetwork
{
	// The array of inputs to the Input Layer of neurons
	// They are read from the picture. So they will be integers - (0 or 1)
	int Input_To_InputLayer[NO_INPUT];

	// The array of Outputs from the Input Layer of neurons.
	// This array is given as input to the Hidden Layer of neurons.
	double Output_Of_InputLayer[NO_INPUT];

	// The array of Outputs from the Hidden Layer of neurons.
	// This array is given as input to the Output Layer of neurons.
	double Output_Of_HiddenLayer[NO_HIDDEN];

public:

	// The array of Outputs from the Output Layer of neurons.
	// The array is used to find the output of the Neural Network.
	double Output_Of_OutputLayer[NO_OUTPUT];

	// The learning rate of the neural network. This specifies how fast the
	// neural network learns when being trained. When the learning rate is high,
	// the network learns fast, but the accuracy of learning will become low.
	double Learning_Rate;

	// The neurons used in the network are declared.
	// Input Layer neurons are not declared because they only serve as a layer
	// which forwards all the input to the hidden layer. This is implemented 
	// in this program using simple loops to make the program more efficient.
	Neuron hiddenNeurons[NO_HIDDEN];
	Neuron outputNeurons[NO_OUTPUT];

	neuralNetwork() // Constructor of the neural network
	{
		// The learning rate is set here. It may be changed during the training to
		// achieve faster training.
		Learning_Rate = 10;
	}

	// The following function finds the output of the first layer.
	// The output is stored in the array: Output_Of_InputLayer[]
	void Find_Output_InputLayer(int *);

	// The following function finds the output of the second layer.
	// The output is stored in the array: Output_Of_HiddenLayer[]
	void Find_Output_HiddenLayer();

	// The following function finds the output of the output layer.
	// The output is stored in the array: Output_Of_OutputLayer[]
	void Find_Output_OutputLayer();
	
	// This function calls all other functions in the required order to 
	// recognize the given input character.
	char Recognize(int *);

	// Training: Give an input image bitmap and the corresponding character.
	double Train(int * Input_Array, int expectedCharacter);

	// Return an integer corresponding to the character recognised.
	int Output();

	// Sets the learning rate to the given value.
	void SetLearning_RateBy(double);
};

The definitions of some of the methods are listed below (neuralNetwork.cpp). The training method will be discussed later.

// file: neuralNetwork.cpp

// Refer to the 'neuralNetwork.h' file for details on the functions and variables.
void neuralNetwork::Find_Output_InputLayer(int * Input_Array){
	// The input is fed directly to the output
	for(int i=0; i<NO_INPUT; i++){
		Output_Of_InputLayer[i] = (double)Input_Array[i];
	}
}

void neuralNetwork::Find_Output_HiddenLayer(){
	for(int i=0; i<NO_HIDDEN; i++){
		Output_Of_HiddenLayer[i] = 
				hiddenNeurons[i].Output(Output_Of_InputLayer, NO_INPUT);
	}
}

void neuralNetwork::Find_Output_OutputLayer(){
	for(int i=0;i<NO_OUTPUT;i++){
		Output_Of_OutputLayer[i] = 
				outputNeurons[i].Output(Output_Of_HiddenLayer, NO_HIDDEN);
	}
}

// This is the recognize function that calls all other
// functions to recognize the given pattern
char neuralNetwork::Recognize(int  * Input_Array){
	Find_Output_InputLayer(Input_Array);
	Find_Output_HiddenLayer();
	Find_Output_OutputLayer();
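	// At this stage, the array Output_Of_OutputLayer holds NO_OUTPUT (4)
	// decimal numbers: the result of the Neural Net operation.
	// Output() reads the array, thresholds each number into a bit and
	// returns the index of the character. Note that the value returned
	// here is that index (0 to 15), not the letter itself.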
	return Output();
}

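// Returns an index into the array: "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
// With 4 output neurons the index ranges from 0 to 15.
// Output_Of_OutputLayer[0] is treated as the most significant bit.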
int neuralNetwork::Output(){
	int index = 0;
	int i,j=1;
	for(i = 0; i < NO_OUTPUT; i++,j *=2){
		index += j * (Output_Of_OutputLayer[NO_OUTPUT-(i+1)] > 0.5 ? 1 : 0);
	}
	return  index;
}

void neuralNetwork::SetLearning_RateBy(double Value){
	Learning_Rate = Value;
}
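The following sketch shows how the four output values relate to a character. The function decodeIndex repeats the logic of Output() on a plain array. The functions indexToLetter and encodeIndex are hypothetical helpers that are not part of the program: they assume that index 0 stands for 'A', and that the target bits for training use the same most-significant-bit-first order as Output().

```cpp
#include <cassert>

// Hypothetical constant for this sketch; it mirrors NO_OUTPUT.
constexpr int kOutputs = 4;

// Thresholds each output at 0.5 and reads the bits as a binary number,
// with outputs[0] as the most significant bit (same as Output()).
int decodeIndex(const double (&outputs)[kOutputs]) {
    int index = 0;
    for (int i = 0; i < kOutputs; ++i) {
        index = index * 2 + (outputs[i] > 0.5 ? 1 : 0);
    }
    return index;
}

// Assumption: index 0 is 'A', index 1 is 'B', and so on.
char indexToLetter(int index) {
    assert(index >= 0 && index < 26);
    return static_cast<char>('A' + index);
}

// Assumption: the expected outputs for a character are the bits of its
// index, in the same order that decodeIndex reads them.
void encodeIndex(int index, double (&targets)[kOutputs]) {
    assert(index >= 0 && index < (1 << kOutputs));
    for (int i = 0; i < kOutputs; ++i) {
        targets[i] = (index >> (kOutputs - 1 - i)) & 1;
    }
}
```

For example, the outputs 0.1, 0.9, 0.2, 0.8 are read as the bits 0101, which is index 5, the letter 'F' under the assumption above.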

Modifying the parameters

As you have seen, none of the parameters or dimensions of a neural network are fixed. They change from application to application. You have to start out with a standard (or speculative) model for your particular application and then keep modifying the structure until it works best for you. You can see some more details in the advanced topics section.