In the previous lessons, we learned how the linear regression model estimates the relationship between the feature columns and the target column, and how we can use that relationship to make predictions. In this lesson and the next, we'll discuss the two most common ways to find the optimal parameter values for a linear regression model. Each unique combination of parameter values forms a distinct linear regression model, and the process of finding the optimal values is known as model fitting.

In this lesson, we'll explore an iterative technique for solving this problem, known as gradient descent. The gradient descent algorithm works by repeatedly adjusting the parameter values in the direction that reduces the mean squared error, converging toward the model with the lowest error. Gradient descent is a commonly used optimization technique for other models as well, such as neural networks, which we'll explore later in our Deep Learning Fundamentals course.
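To make the idea concrete before the formal treatment later in the lesson, here is a minimal sketch of gradient descent for a single-parameter linear regression model. The function name, learning rate, and data below are illustrative assumptions, not part of the course material:

```python
# Minimal single-parameter gradient descent sketch (illustrative only).
# Model: y_hat = w * x, fit by minimizing the mean squared error.

def gradient_descent(xs, ys, lr=0.01, steps=1000):
    w = 0.0  # initial parameter guess
    n = len(xs)
    for _ in range(steps):
        # Derivative of MSE with respect to w: (2/n) * sum(x * (w*x - y))
        grad = (2 / n) * sum(x * (w * x - y) for x, y in zip(xs, ys))
        w -= lr * grad  # step in the direction that lowers the error
    return w

# Data generated from y = 3x, so w should converge near 3.
xs = [1, 2, 3, 4]
ys = [3, 6, 9, 12]
print(round(gradient_descent(xs, ys), 2))  # ~3.0
```

Each iteration nudges the parameter `w` against the derivative of the cost, which is exactly the "iteratively trying parameter values" process described above, just guided by the slope of the error rather than by random guessing.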

As you work through each concept in this lesson on gradient descent in machine learning, you'll apply what you've learned directly in your browser, so there's no need to set up anything on your own machine. The Python environment inside this course includes answer checking, so you can make sure you've fully mastered each concept before moving on to the next.

Objectives
  • Learn about optimization problems.
  • Learn the theory behind the gradient descent algorithm.

Lesson Outline

1. Introduction
2. Single Variable Gradient Descent
3. Derivative Of The Cost Function
4. Understanding Multi Parameter Gradient Descent
5. Gradient Of The Cost Function
6. Gradient Descent For Higher Dimensions
7. Next Steps
8. Takeaways