Welcome to the first lesson of the Advanced Neural Tuning course. In this course, you will learn how to make your neural networks train more efficiently and achieve better results by using advanced optimization techniques. We will start with a key concept: learning rate scheduling.
The learning rate is a crucial parameter in training neural networks. It controls how much the model's weights are updated during each step of training. If the learning rate is too high, the model might not learn well and could even diverge. If it is too low, training can be very slow and might get stuck before reaching a good solution.
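To make this concrete, here is a tiny self-contained sketch (plain Python, with illustrative values chosen for this lesson) that minimizes f(x) = x² with gradient descent at three different learning rates:

```python
# Gradient descent on f(x) = x**2, whose gradient is 2x.
# Three illustrative learning rates: too high, too low, and reasonable.
for lr in (1.1, 0.001, 0.1):
    x = 1.0
    for _ in range(50):
        x = x - lr * 2 * x  # standard gradient descent update
    print(f"lr={lr}: x after 50 steps = {x:.3g}")

# lr=1.1   -> |x| grows every step: the run diverges
# lr=0.001 -> x barely moves from 1.0: training is very slow
# lr=0.1   -> x lands close to the minimum at 0
```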
Learning rate scheduling is a technique in which you change the learning rate during training instead of keeping it constant. This can help your model learn faster at the beginning and fine-tune its weights as training progresses. In this lesson, you will learn how to use a popular learning rate scheduler in PyTorch called `StepLR`.
The `StepLR` scheduler is a simple but effective way to adjust the learning rate as your model trains. In PyTorch, `StepLR` reduces the learning rate by a certain factor every fixed number of epochs. This helps the model make big updates early on and then smaller, more careful updates as it gets closer to a good solution.
The two main parameters for `StepLR` are `step_size` and `gamma`. The `step_size` tells the scheduler how many epochs to wait before reducing the learning rate. The `gamma` parameter is the factor by which the learning rate is multiplied each time it is reduced. For example, if your initial learning rate is 0.1, your `step_size` is 10, and your `gamma` is 0.1, then after 10 epochs, the learning rate will become 0.01.
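This decay has a simple closed form: after a given number of completed epochs, the learning rate is `initial_lr * gamma ** (epochs // step_size)`. As a quick sanity check in plain Python, using the values from the example above:

```python
# StepLR's schedule: lr = initial_lr * gamma ** (completed_epochs // step_size)
initial_lr, step_size, gamma = 0.1, 10, 0.1

for epoch in (0, 9, 10, 19, 20):
    lr = initial_lr * gamma ** (epoch // step_size)
    print(f"after {epoch} epochs: lr = {round(lr, 6)}")
# after 0-9 epochs the rate is 0.1, after 10-19 it is 0.01, after 20+ it is 0.001
```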
Choosing the right values for `step_size` and `gamma` depends on your dataset, model, and training dynamics:
- `step_size`: A smaller `step_size` (e.g., 5) means the learning rate will drop more frequently, which can help if your model quickly reaches plateaus or if you want to fine-tune early. A larger `step_size` (e.g., 20 or 30) keeps the learning rate high for longer, which can be useful for larger datasets or more complex models that need more time to learn before fine-tuning.
- `gamma`: A smaller `gamma` (e.g., 0.1) means the learning rate drops sharply, which can help the model converge quickly after the drop. A larger `gamma` (e.g., 0.5 or 0.7) results in a more gradual decrease, which can be useful if you want to avoid sudden changes in training dynamics; the comparison sketch after this list shows both.
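To see the difference a larger `gamma` makes, here is a short comparison using the same closed form as before (the specific values are just for illustration):

```python
# Compare a sharp drop (gamma=0.1) with a gradual one (gamma=0.5)
initial_lr, step_size = 0.1, 10

for gamma in (0.1, 0.5):
    lrs = [round(initial_lr * gamma ** (epoch // step_size), 6) for epoch in range(30)]
    # learning rate used in epochs 1-10, 11-20, and 21-30
    print(f"gamma={gamma}: {lrs[0]}, {lrs[10]}, {lrs[20]}")

# gamma=0.1: 0.1, 0.01, 0.001  (sharp drops)
# gamma=0.5: 0.1, 0.05, 0.025  (gentler decay)
```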
Why 10 and 0.1?
In the example, `step_size=10` and `gamma=0.1` are chosen to demonstrate a clear, noticeable drop in the learning rate every 10 epochs. These are common starting points, but you should experiment with these values based on your model’s performance. If you notice your model stops improving before the next scheduled drop, try reducing `step_size`. If the learning rate drops too much and training stalls, try increasing `gamma`.
Let’s look at how you can use `StepLR` in a typical PyTorch training loop. Here is a code example that shows how to set up and use the scheduler:
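This is a minimal sketch; the linear model, synthetic data, and 30-epoch loop are placeholder assumptions for illustration:

```python
import torch
import torch.nn as nn
from torch.optim import SGD
from torch.optim.lr_scheduler import StepLR

# Placeholder model and synthetic regression data
model = nn.Linear(10, 1)
criterion = nn.MSELoss()
inputs = torch.randn(64, 10)
targets = torch.randn(64, 1)

optimizer = SGD(model.parameters(), lr=0.1)
# Multiply the learning rate by gamma=0.1 every step_size=10 epochs
scheduler = StepLR(optimizer, step_size=10, gamma=0.1)

for epoch in range(1, 31):
    # The learning rate the optimizer will use in this epoch
    print(f"Epoch {epoch}: learning rate = {optimizer.param_groups[0]['lr']:.4g}")

    optimizer.zero_grad()                        # clear old gradients
    loss = criterion(model(inputs), targets)     # forward pass
    loss.backward()                              # backward pass: compute gradients
    optimizer.step()                             # update weights with the current learning rate
    scheduler.step()                             # then update the learning rate for the next epoch
```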
Why is `scheduler.step()` called after `optimizer.step()`?
The order is important. `optimizer.step()` updates the model parameters using the current learning rate. `scheduler.step()` then updates the learning rate for the next epoch, based on the current epoch count. If you call `scheduler.step()` before `optimizer.step()`, the learning rate would be updated before the optimizer uses it, which can lead to off-by-one errors in your schedule. Always call `optimizer.step()` first, then `scheduler.step()`.
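Here is a small plain-Python sketch of that off-by-one, assuming the counter semantics described above (the scheduler's internal epoch count increments on every `scheduler.step()` call):

```python
# lr for a given counter value n follows: initial_lr * gamma ** (n // step_size)
initial_lr, step_size, gamma = 0.1, 10, 0.1

def lr_at(n):
    return round(initial_lr * gamma ** (n // step_size), 6)

# Correct order: the weight update in epoch e sees counter value e - 1
print([lr_at(e - 1) for e in (9, 10, 11)])  # [0.1, 0.1, 0.01] -> first drop at epoch 11
# Reversed order: scheduler.step() runs first, so epoch e sees counter value e
print([lr_at(e) for e in (9, 10, 11)])      # [0.1, 0.01, 0.01] -> drop arrives a step early
```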
If you print the learning rate at the start of each epoch, as in the example above, you would see something like this (assuming the initial learning rate is 0.1):
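```
Epoch 1: learning rate = 0.1
Epoch 2: learning rate = 0.1
...
Epoch 10: learning rate = 0.1
Epoch 11: learning rate = 0.01
...
Epoch 20: learning rate = 0.01
Epoch 21: learning rate = 0.001
...
Epoch 30: learning rate = 0.001
```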
This shows how the learning rate drops at epochs 11 and 21, following the schedule you set.
In this lesson, you learned why the learning rate is important and how changing it during training can help your neural network learn better. You were introduced to the `StepLR` scheduler in PyTorch, saw how it works, learned how to choose its parameters, and understood the importance of the call order in the training loop.
Next, you will get a chance to practice using `StepLR` yourself. In the following exercises, you will set up a scheduler, run a training loop, and observe how the learning rate changes over time. This hands-on practice will help you become comfortable with learning rate scheduling and prepare you for more advanced optimization techniques in the rest of the course.
