Hello! Today, we will learn about a powerful technique that makes our Gradient Descent move faster, like a ball rolling down a hill. We call this "Momentum".
Momentum improves our Gradient Descent. How does it do that? Remember how a ball on top of a hill starts rolling down? If the slope is steep, the ball picks up speed, right? That's what momentum does to our Gradient Descent. It makes it move faster when the slope (our 'hill') points in the same direction over time.
Let's get down to coding! Here's a little piece of code to demonstrate the effect of momentum in a gradient descent process. We will use a gradient function, grad_func(). The weight or parameter (theta) starts at a point and moves down the slope by adjusting itself in every iteration or 'epoch':

