Introduction

Welcome back to our course "Neural Network Fundamentals: Neurons and Layers"! You're making excellent progress. Previously in this course, we built a single artificial neuron, enhanced it with the Sigmoid activation function, and finally combined multiple neurons into a DenseLayer.

Today, in our final lesson, we'll bring our neural network layer to life by implementing forward propagation — the process by which information travels through a neural network from input to output. This is where our layer actually processes data and produces meaningful activations.

By the end of this lesson, you'll have a functional dense layer that can take inputs, process them through multiple neurons simultaneously, and produce outputs that could be fed to another layer or used directly for predictions.

Understanding Forward Propagation

Before diving into code, let's understand what forward propagation means in the context of neural networks.

Forward propagation (or forward pass) is the process of taking input data and passing it through the network to generate an output. It's called "forward" because information flows forward through the network, from the input layer, through any hidden layers, to the output layer.

For our dense layer, forward propagation involves three key steps:

  1. Weighted sum calculation: Multiply each input by its corresponding weight and sum the products;
  2. Bias addition: Add the bias term to each weighted sum;
  3. Activation: Apply the activation function to introduce non-linearity.

This process transforms the input data into activations (outputs) that represent what the layer has "learned" about the input. These activations can then be used as inputs to subsequent layers in deeper networks.

Matrix Operations for Efficient Processing

As we learned in our previous lesson, we're using matrices to represent the weights and biases of our dense layer. This allows us to process multiple neurons simultaneously through efficient matrix operations.

Let's review the key matrices involved in forward propagation:

  • Inputs: Shape (n_samples, n_inputs), each row represents one data sample.
  • Weights: Shape (n_inputs, n_neurons), each column represents the weights for one neuron.
  • Biases: Shape (1, n_neurons), one bias per neuron.
  • Outputs: Shape (n_samples, n_neurons), each row is the output for one sample.

The beauty of this approach is that it works not only for a single input sample but also for batches of samples. When we have multiple samples (a batch), we can process them all at once without writing additional code.

For example, if we have a batch of 4 samples, each with 3 features, and our layer has 2 neurons:

  • Inputs shape: (4, 3), which means 4 samples, each with 3 features;
  • Weights shape: (3, 2), meaning 3 inputs connected to 2 neurons;
  • Result of matrix multiplication: (4, 2), or 4 samples, each producing 2 outputs.

This efficient batch processing is one of the key advantages of using matrix operations in neural networks.
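
A quick NumPy shape check makes this concrete. This is just a sketch: the random values below stand in for real data and trained weights, and only the shapes matter here:

```python
import numpy as np

inputs = np.random.randn(4, 3)   # 4 samples, 3 features each
weights = np.random.randn(3, 2)  # 3 inputs feeding 2 neurons
biases = np.zeros((1, 2))        # one bias per neuron

output = np.dot(inputs, weights) + biases
print(output.shape)  # (4, 2): 4 samples, 2 neuron outputs each
```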

Implementing the Forward Method

Now, let's implement the forward method in our DenseLayer class. We'll focus first on calculating the weighted sum:
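A minimal sketch of this first step is shown below. It assumes the DenseLayer from the previous lesson stores self.weights with shape (n_inputs, n_neurons) and self.biases with shape (1, n_neurons); your exact initialization code may differ:

```python
import numpy as np

class DenseLayer:
    def __init__(self, n_inputs, n_neurons):
        # One weight column per neuron; small random starting values
        self.weights = 0.1 * np.random.randn(n_inputs, n_neurons)
        # One bias per neuron, starting at zero
        self.biases = np.zeros((1, n_neurons))

    def forward(self, inputs):
        # Weighted sum for every neuron and every sample:
        # (n_samples, n_inputs) @ (n_inputs, n_neurons) -> (n_samples, n_neurons)
        self.output = np.dot(inputs, self.weights)
        return self.output
```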

In this code:

  1. The forward method takes an input array, which can contain multiple samples.
  2. We use NumPy's dot function to perform matrix multiplication between the inputs and weights.
  3. This operation computes the weighted sum for every neuron in the layer, for every sample in the batch.

The matrix multiplication creates a new matrix where each element is the dot product of a row from the inputs and a column from the weights. Each row in the result corresponds to a sample, and each column corresponds to a neuron.

Adding Biases and Applying Activation

After calculating the weighted sum, we need to add the biases and apply the activation function:
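Here is a sketch of the completed method. The sigmoid helper mirrors the standard formula built earlier in the course, 1 / (1 + e^(-x)):

```python
import numpy as np

def sigmoid(x):
    # Maps any real value into the range (0, 1)
    return 1 / (1 + np.exp(-x))

class DenseLayer:
    def __init__(self, n_inputs, n_neurons):
        self.weights = 0.1 * np.random.randn(n_inputs, n_neurons)
        self.biases = np.zeros((1, n_neurons))

    def forward(self, inputs):
        # 1. Weighted sum: (n_samples, n_inputs) @ (n_inputs, n_neurons)
        weighted_sum = np.dot(inputs, self.weights)
        # 2. Broadcasting adds the (1, n_neurons) biases to every row
        weighted_sum = weighted_sum + self.biases
        # 3. Sigmoid squashes each value into (0, 1), adding non-linearity
        self.output = sigmoid(weighted_sum)
        return self.output
```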

Here's what happens in this code:

  1. We add the biases to the weighted sum. Notice that the biases have a shape of (1, n_neurons), while the weighted sum has a shape of (n_samples, n_neurons). NumPy's broadcasting feature automatically applies the bias to each sample.

  2. We apply the sigmoid activation function to the result. This transforms every value in the matrix to be between 0 and 1, introducing non-linearity.

  3. We store the result in self.output and also return it, making it available for the next layer or for the final prediction.

This completes the forward propagation process for our dense layer. Each neuron in our layer has now processed the input data, and we have the resulting activations.

Processing Data in Batches

One significant advantage of our implementation is that it can process data in batches. A batch is simply a collection of multiple input samples that we process simultaneously. This is much more efficient than processing samples one at a time.

Let's see how our code handles a batch of input data:
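A short sketch of such a call is shown below; the input values are illustrative:

```python
# A batch of 2 samples, each with 3 features
inputs = np.array([
    [1.0, 2.0, 3.0],
    [0.5, -1.0, 2.0],
])

layer = DenseLayer(n_inputs=3, n_neurons=2)
output = layer.forward(inputs)

print(output.shape)  # (2, 2): one row per sample, one column per neuron
```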

In this example:

  • We have a batch of 2 samples, each with 3 features;
  • Our layer will process both samples at once;
  • The output will have 2 rows (one for each sample) and as many columns as there are neurons in our layer.

This batch processing capability is essential for efficient training and inference in neural networks, especially when dealing with large datasets.

Visualizing the Forward Pass

To solidify our understanding, let's visualize the forward propagation process step by step for a single sample:

  1. Input: [1.0, 2.0, 3.0] (shape: (1, 3))
  2. Weights (example values chosen to be consistent with the results below; your randomly initialized weights will differ):
     [[-0.20,  0.05],
      [ 0.30,  0.15],
      [-0.03, -0.02]]
     (shape: (3, 2))
  3. Weighted sum: [1.0, 2.0, 3.0] · weights = [0.31, 0.29]
  4. Add biases: [0.31, 0.29] + [0, 0] = [0.31, 0.29]
  5. Apply sigmoid: [sigmoid(0.31), sigmoid(0.29)] ≈ [0.577, 0.572]

The final output [0.577, 0.572] represents the activations of the two neurons in our layer for the first sample. The exact numbers will vary based on the random initialization of weights, but this process illustrates how a single sample flows through the layer.
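
If you'd like to verify these numbers yourself, here is the same walk-through as a small NumPy snippet, using the illustrative weights from step 2:

```python
import numpy as np

x = np.array([[1.0, 2.0, 3.0]])      # shape (1, 3)
W = np.array([[-0.20,  0.05],
              [ 0.30,  0.15],
              [-0.03, -0.02]])       # shape (3, 2), illustrative values
b = np.zeros((1, 2))                 # biases start at zero

z = np.dot(x, W) + b                 # [[0.31, 0.29]]
a = 1 / (1 + np.exp(-z))             # [[0.5769, 0.5720]]
print(a)
```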

Conclusion and Next Steps

Congratulations! You've successfully implemented forward propagation in your dense layer. This is a significant milestone, and it marks the completion of the lessons in our "Neural Network Fundamentals: Neurons and Layers" course. Your layer can now accept input data (either a single sample or a batch), process it through multiple neurons simultaneously, and produce output activations that are ready for further processing. The forward pass is the foundation of how neural networks make predictions, transforming raw inputs into meaningful outputs through a series of mathematical operations.

Coming up next, you'll have the opportunity to put everything you've learned in this course into practice in the final practice section. You'll work with the code we've developed, experimenting with different inputs and seeing how they flow through the layer.

This will prepare you for the next step in our journey: continuing your neural network implementation in our second course, "The MLP Architecture: Activations & Initialization". There, we'll build upon the foundations laid here, focusing on connecting multiple layers together—using the activations your layer now produces—to form complete neural networks capable of solving complex problems. Happy learning!
