Last time we built a single neuron that calculates: (input × weight) + bias. But there's a crucial final step we skipped!
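For reference, here's that linear-only neuron as a minimal Python sketch (the numbers are just made-up examples):

```python
# A bare linear neuron: no activation yet, so the output is
# just a scaled-and-shifted copy of the input.
def linear_neuron(x, weight, bias):
    return x * weight + bias

print(linear_neuron(2.0, 0.5, 1.0))  # 2.0 * 0.5 + 1.0 = 2.0
```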
Without this step, our neuron can only learn straight lines. That's like trying to draw a portrait with only a ruler.
Engagement Message
What extra step after (input × weight) + bias do you think would let the neuron draw curves?
The missing piece is a non-linear activation function! It takes the neuron's calculation and transforms it into the final output.
Think of it like adding curves to your straight-line drawing toolkit. Now you can create much more interesting shapes!
Engagement Message
Why do you think curves might be more useful than straight lines?
Let's meet our first activation function: sigmoid. It takes any number and squashes it between 0 and 1, creating a smooth S-shaped curve: large positives become close to 1, large negatives close to 0, and zero becomes exactly 0.5.
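If you like seeing the math as code, here's a minimal Python sketch of sigmoid using the standard math module:

```python
import math

def sigmoid(z):
    # Squash any real number into the range (0, 1).
    return 1 / (1 + math.exp(-z))

print(sigmoid(5.0))   # ~0.993 (large positive -> close to 1)
print(sigmoid(-5.0))  # ~0.007 (large negative -> close to 0)
print(sigmoid(0.0))   # 0.5 exactly
```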
Engagement Message
What situations might need outputs between 0 and 1?
Next is ReLU (Rectified Linear Unit). It's beautifully simple: if the input is positive, output that number; if negative, output zero.
ReLU outputs zero or positive values, creating a sharp corner at zero.
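In code, ReLU is literally one max call. Here's a tiny Python sketch:

```python
def relu(z):
    # Keep positive inputs, clip everything negative to zero.
    return max(0.0, z)

print(relu(3.2))   # 3.2
print(relu(-1.7))  # 0.0
```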
Engagement Message
When is a zero-or-positive output like ReLU handy?
Our third function is tanh (hyperbolic tangent). Like sigmoid, it creates an S-curve, but it outputs values between -1 and 1 instead.
Zero still gives zero, but now we have both positive and negative outputs with a nice smooth transition.
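Python's math module already ships tanh, so a sketch only needs a few calls:

```python
import math

print(math.tanh(5.0))   # ~0.9999 (large positive -> close to 1)
print(math.tanh(-5.0))  # ~-0.9999 (large negative -> close to -1)
print(math.tanh(0.0))   # 0.0 (zero stays zero)
```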
Engagement Message
With me so far?
Here's the amazing part: when you stack layers of neurons with these non-linear activations, they can approximate virtually any continuous function!
This is called the Universal Approximation Theorem. If a phenomenon can be described mathematically, a neural network can, in theory, learn to model it!
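To make "stacking layers" concrete, here's a minimal Python sketch of a two-layer network with sigmoid activations. The weights and biases below are made up purely for illustration; in practice they would be learned from data.

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def tiny_network(x):
    # Layer 1: two hidden neurons, each computing
    # (input * weight) + bias followed by a non-linear activation.
    h1 = sigmoid(x * 2.0 + 1.0)
    h2 = sigmoid(x * -3.0 + 0.5)
    # Layer 2: combine the hidden outputs into one final value.
    return h1 * 1.5 + h2 * -2.0 + 0.25

print(tiny_network(0.0))
print(tiny_network(1.0))
```

With enough hidden neurons, curves like these can be combined to trace out almost any continuous shape.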
Engagement Message
What complex patterns do you think neural networks might learn?
Type
Sort Into Boxes
Practice Question
Let's match activation functions with their key characteristics:
Labels
- First Box Label: Sigmoid
- Second Box Label: ReLU
First Box Items
- S-shaped curve
- Outputs 0 to 1
- Smooth transitions
Second Box Items
- Simple max function
- Sharp corner at zero
- Always non-negative
