Hello! In this lesson, we will explore the inner workings of the crucial backpropagation algorithm for training neural networks and implement it from scratch in C++. By the end of this lesson, you will understand how backpropagation works and how to build a simple neural network using C++.
A neural network is composed of an input layer, one or more hidden layers, and an output layer. Each layer contains neurons (nodes) that are connected to the next layer through weighted links. These weights, along with bias terms, determine the output of the network.
In our C++ implementation, we will use the Eigen library for matrix operations. The input data is stored in an Eigen::MatrixXd called input. The weights connecting the input layer to the hidden layer are stored in weights1, and the weights connecting the hidden layer to the output layer are stored in weights2. The hidden layer will have four neurons, and the output layer will have one neuron. (For simplicity, this implementation omits explicit bias terms.)
The activation function we will use is the sigmoid function, which maps any real-valued number into the range between 0 and 1. Its mathematical definition is:

sigmoid(x) = 1 / (1 + e^(-x))
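As a concrete reference, here is a minimal C++ sketch of the sigmoid and its derivative as plain free functions. The derivative is written in terms of the sigmoid's own output, which is the form backpropagation will rely on later:

```cpp
#include <cassert>
#include <cmath>

// Sigmoid activation: squashes any real number into the open interval (0, 1).
double sigmoid(double x) {
    return 1.0 / (1.0 + std::exp(-x));
}

// Derivative of the sigmoid, expressed via its output s = sigmoid(x):
// sigmoid'(x) = s * (1 - s). This is convenient during backpropagation,
// because the forward pass has already computed s.
double sigmoid_derivative(double s) {
    return s * (1.0 - s);
}
```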
We will define a NeuralNetwork class in C++. The class will store the input data, weights, target outputs, and other necessary variables as member variables. The constructor will initialize the weights randomly and set up the matrices for input and output.
- `input`: Matrix containing the input data.
- `weights1`: Weights from the input layer to the hidden layer (randomly initialized).
- `weights2`: Weights from the hidden layer to the output layer (randomly initialized).
- `y`: Matrix containing the target outputs.
- `output`: Matrix to store the network's output.
- `layer1`: Matrix to store the output of the hidden layer.
- `learning_rate`: Controls how much the weights are updated during training.
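One possible layout for these members is sketched below. For self-containment the sketch uses a `std::vector`-based `Matrix` alias in place of `Eigen::MatrixXd`, and the fixed seed and the [-1, 1] weight range are illustrative assumptions rather than the only reasonable choices:

```cpp
#include <cassert>
#include <random>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Fill a rows x cols matrix with random values in [-1, 1].
Matrix random_matrix(std::size_t rows, std::size_t cols, std::mt19937& gen) {
    std::uniform_real_distribution<double> dist(-1.0, 1.0);
    Matrix m(rows, std::vector<double>(cols));
    for (auto& row : m)
        for (auto& v : row) v = dist(gen);
    return m;
}

class NeuralNetwork {
public:
    NeuralNetwork(const Matrix& X, const Matrix& Y, double lr)
        : input(X), y(Y), learning_rate(lr) {
        std::mt19937 gen(42);  // fixed seed so runs are reproducible (assumption)
        weights1 = random_matrix(X[0].size(), 4, gen);  // input -> hidden (4 neurons)
        weights2 = random_matrix(4, 1, gen);            // hidden -> output (1 neuron)
        layer1.assign(X.size(), std::vector<double>(4, 0.0));
        output.assign(X.size(), std::vector<double>(1, 0.0));
    }

    Matrix input;     // training inputs, one sample per row
    Matrix weights1;  // input -> hidden weights
    Matrix weights2;  // hidden -> output weights
    Matrix y;         // target outputs
    Matrix output;    // network predictions
    Matrix layer1;    // hidden-layer activations
    double learning_rate;
};
```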
Feedforward propagation is the process of passing input data through the network to generate an output. This involves multiplying the inputs by the weights, applying the activation function, and repeating the process for each layer.
Here is how you can implement feedforward propagation in C++:
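The sketch below shows one possible implementation. To keep the snippet self-contained it uses a `std::vector`-based `Matrix` and free functions rather than Eigen and class members; the structure mirrors the two multiply-then-activate steps described in the list that follows:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// Standard matrix product of an (n x k) and a (k x m) matrix.
Matrix matmul(const Matrix& a, const Matrix& b) {
    Matrix out(a.size(), std::vector<double>(b[0].size(), 0.0));
    for (std::size_t i = 0; i < a.size(); ++i)
        for (std::size_t k = 0; k < b.size(); ++k)
            for (std::size_t j = 0; j < b[0].size(); ++j)
                out[i][j] += a[i][k] * b[k][j];
    return out;
}

// Apply the sigmoid elementwise.
Matrix activate(Matrix m) {
    for (auto& row : m)
        for (auto& v : row) v = sigmoid(v);
    return m;
}

// Feedforward: input -> hidden layer (layer1) -> output.
void feedforward(const Matrix& input, const Matrix& weights1,
                 const Matrix& weights2, Matrix& layer1, Matrix& output) {
    layer1 = activate(matmul(input, weights1));
    output = activate(matmul(layer1, weights2));
}
```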
- The input is multiplied by `weights1` to get the input to the hidden layer.
- The sigmoid function is applied to get the hidden layer's output.
- The hidden layer's output is multiplied by `weights2` to get the input to the output layer.
- The sigmoid function is applied again to get the final output.
Backpropagation is the key algorithm for training neural networks. It works by propagating the error from the output layer back through the network and adjusting the weights to minimize the difference between the predicted output and the actual output.
The weight update rule is:

w_new = w_old + learning_rate × Δw

where Δw is the adjustment computed from the propagated error and the derivative of the activation function.
In C++, we can implement backpropagation by calculating the error at the output, propagating it back to the hidden layer, and updating the weights accordingly.
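One way to express this is sketched below, again with a `std::vector`-based `Matrix` in place of Eigen for self-containment. The variable names (`output_error`, `d_weights2`, `hidden_error`, `d_weights1`) follow the description in the list that follows; the shapes assume the two-layer network above:

```cpp
#include <cassert>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

double sigmoid_derivative(double s) { return s * (1.0 - s); }

Matrix matmul(const Matrix& a, const Matrix& b) {
    Matrix out(a.size(), std::vector<double>(b[0].size(), 0.0));
    for (std::size_t i = 0; i < a.size(); ++i)
        for (std::size_t k = 0; k < b.size(); ++k)
            for (std::size_t j = 0; j < b[0].size(); ++j)
                out[i][j] += a[i][k] * b[k][j];
    return out;
}

Matrix transpose(const Matrix& m) {
    Matrix out(m[0].size(), std::vector<double>(m.size()));
    for (std::size_t i = 0; i < m.size(); ++i)
        for (std::size_t j = 0; j < m[0].size(); ++j) out[j][i] = m[i][j];
    return out;
}

// One backpropagation step for the two-layer network.
void backprop(const Matrix& input, const Matrix& y,
              const Matrix& layer1, const Matrix& output,
              Matrix& weights1, Matrix& weights2, double learning_rate) {
    // Error at the output, scaled by the sigmoid slope there.
    Matrix output_error = output;
    for (std::size_t i = 0; i < output.size(); ++i)
        for (std::size_t j = 0; j < output[0].size(); ++j)
            output_error[i][j] =
                (y[i][j] - output[i][j]) * sigmoid_derivative(output[i][j]);

    // Adjustment for the hidden -> output weights.
    Matrix d_weights2 = matmul(transpose(layer1), output_error);

    // Propagate the error back through weights2, then scale it in place
    // by the sigmoid slope at the hidden layer.
    Matrix hidden_error = matmul(output_error, transpose(weights2));
    for (std::size_t i = 0; i < layer1.size(); ++i)
        for (std::size_t j = 0; j < layer1[0].size(); ++j)
            hidden_error[i][j] *= sigmoid_derivative(layer1[i][j]);

    // Adjustment for the input -> hidden weights.
    Matrix d_weights1 = matmul(transpose(input), hidden_error);

    // Update: add learning_rate * adjustment to each weight matrix.
    for (std::size_t i = 0; i < weights2.size(); ++i)
        for (std::size_t j = 0; j < weights2[0].size(); ++j)
            weights2[i][j] += learning_rate * d_weights2[i][j];
    for (std::size_t i = 0; i < weights1.size(); ++i)
        for (std::size_t j = 0; j < weights1[0].size(); ++j)
            weights1[i][j] += learning_rate * d_weights1[i][j];
}
```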
- `output_error` computes the error at the output layer.
- `d_weights2` calculates the adjustment for the weights between the hidden and output layers.
- `hidden_error` computes the error at the hidden layer.
- `d_weights1` calculates the adjustment for the weights between the input and hidden layers.
- The weights are updated by adding the product of the learning rate and the calculated adjustments.
In our implementation, we use the squared error loss (also known as mean squared error for a single sample) to measure how far the network's predictions are from the actual target values. The squared error for a single output is calculated as:

E = (y_target − y_predicted)²
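As a quick illustration, the squared error for one output is a one-liner in C++:

```cpp
#include <cassert>

// Squared error between a target value and the network's prediction.
double squared_error(double target, double prediction) {
    double error = target - prediction;
    return error * error;
}
```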
An epoch is a single pass through the entire training dataset. Training the network for multiple epochs allows it to gradually adjust its weights to minimize the error.
Here is how you can implement the training loop in C++:
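A minimal sketch of the loop is shown below. Because the member functions aren't reproduced here, `std::function` stand-ins are used for `feedforward` and `backprop` so the snippet compiles on its own; inside the actual class, `train` would simply call the two member functions directly:

```cpp
#include <cassert>
#include <functional>

// One epoch = one full forward pass plus one backward pass over the
// training data. The callbacks stand in for the NeuralNetwork member
// functions so this sketch is self-contained.
void train(int epochs,
           const std::function<void()>& feedforward,
           const std::function<void()>& backprop) {
    for (int epoch = 0; epoch < epochs; ++epoch) {
        feedforward();  // compute layer1 and output
        backprop();     // propagate the error and update the weights
    }
}
```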
- The `train` function repeatedly calls `feedforward` and `backprop` for the specified number of epochs.
Let's put everything together and solve the XOR (exclusive OR) problem using our neural network in C++. The XOR problem is a classic test for neural networks, as it is not linearly separable.
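Here is one self-contained sketch of the whole network applied to XOR. It substitutes a `std::vector`-based `Matrix` for `Eigen::MatrixXd` so it compiles without external dependencies; the seed, learning rate, and weight range are illustrative choices, not the only reasonable ones:

```cpp
#include <cassert>
#include <cmath>
#include <random>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }
double sigmoid_derivative(double s) { return s * (1.0 - s); }

Matrix matmul(const Matrix& a, const Matrix& b) {
    Matrix out(a.size(), std::vector<double>(b[0].size(), 0.0));
    for (std::size_t i = 0; i < a.size(); ++i)
        for (std::size_t k = 0; k < b.size(); ++k)
            for (std::size_t j = 0; j < b[0].size(); ++j)
                out[i][j] += a[i][k] * b[k][j];
    return out;
}

Matrix transpose(const Matrix& m) {
    Matrix out(m[0].size(), std::vector<double>(m.size()));
    for (std::size_t i = 0; i < m.size(); ++i)
        for (std::size_t j = 0; j < m[0].size(); ++j) out[j][i] = m[i][j];
    return out;
}

class NeuralNetwork {
public:
    NeuralNetwork(const Matrix& X, const Matrix& Y, double lr)
        : input(X), y(Y), learning_rate(lr) {
        std::mt19937 gen(1);  // fixed seed: illustrative, keeps runs reproducible
        std::uniform_real_distribution<double> dist(-1.0, 1.0);
        weights1.assign(X[0].size(), std::vector<double>(4));
        weights2.assign(4, std::vector<double>(1));
        for (auto& row : weights1) for (auto& v : row) v = dist(gen);
        for (auto& row : weights2) for (auto& v : row) v = dist(gen);
    }

    void feedforward() {
        layer1 = activate(matmul(input, weights1));
        output = activate(matmul(layer1, weights2));
    }

    void backprop() {
        Matrix output_error = output;
        for (std::size_t i = 0; i < output.size(); ++i)
            output_error[i][0] =
                (y[i][0] - output[i][0]) * sigmoid_derivative(output[i][0]);
        Matrix d_weights2 = matmul(transpose(layer1), output_error);
        Matrix hidden_error = matmul(output_error, transpose(weights2));
        for (std::size_t i = 0; i < layer1.size(); ++i)
            for (std::size_t j = 0; j < layer1[0].size(); ++j)
                hidden_error[i][j] *= sigmoid_derivative(layer1[i][j]);
        Matrix d_weights1 = matmul(transpose(input), hidden_error);
        add_scaled(weights1, d_weights1);
        add_scaled(weights2, d_weights2);
    }

    void train(int epochs) {
        for (int e = 0; e < epochs; ++e) {
            feedforward();
            backprop();
        }
    }

    double loss() {  // mean squared error over the whole dataset
        double total = 0.0;
        for (std::size_t i = 0; i < y.size(); ++i) {
            double e = y[i][0] - output[i][0];
            total += e * e;
        }
        return total / y.size();
    }

    Matrix input, y, weights1, weights2, layer1, output;
    double learning_rate;

private:
    static Matrix activate(Matrix m) {
        for (auto& row : m)
            for (auto& v : row) v = sigmoid(v);
        return m;
    }
    void add_scaled(Matrix& w, const Matrix& dw) {
        for (std::size_t i = 0; i < w.size(); ++i)
            for (std::size_t j = 0; j < w[0].size(); ++j)
                w[i][j] += learning_rate * dw[i][j];
    }
};
```

After training, you can loop over `output` and print each prediction with `std::cout` to compare against the targets in `Y`.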
- The input matrix `X` contains all possible pairs of binary inputs for the XOR problem.
- The output matrix `Y` contains the expected results.
- The neural network is trained for 5,000 epochs.
- After training, the network's predictions for each input are printed.
Congratulations! You have learned how the backpropagation algorithm works and how to implement a simple neural network from scratch in C++. By understanding the structure of neural networks, the role of activation functions, and the process of training through feedforward and backpropagation, you are now equipped to experiment with and extend neural networks for a variety of problems. Keep practicing and exploring the fascinating world of deep learning with C++!
