Ordinary Least Squares

In the previous lesson on gradient descent, we explored an iterative technique for model fitting. Gradient descent needs multiple iterations to converge on the optimal parameter values, and the number of iterations depends heavily on the initial parameter values and the learning rate we select.
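To make that dependence concrete, here is a minimal sketch of gradient descent on a one-parameter model `y = w * x` with a mean squared error cost. The function name `gradient_descent`, the toy data, the tolerance, and the learning rates are all illustrative assumptions, not code from the previous lesson.

```python
def gradient_descent(xs, ys, w_init, learning_rate, tol=1e-6, max_iters=10_000):
    """Fit y = w * x by minimizing mean squared error.

    Returns (w, iterations), where iterations counts gradient evaluations.
    Hypothetical helper written for illustration only.
    """
    n = len(xs)
    w = w_init
    for iteration in range(1, max_iters + 1):
        # Gradient of (1/n) * sum((w*x - y)^2) with respect to w.
        grad = (2.0 / n) * sum(x * (w * x - y) for x, y in zip(xs, ys))
        if abs(grad) < tol:
            # Close enough to the minimum: stop and report the count.
            return w, iteration
        w -= learning_rate * grad
    return w, max_iters


# Toy data generated from y = 2x, so the optimal w is 2 (assumed example).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

w_slow, iters_slow = gradient_descent(xs, ys, w_init=0.0, learning_rate=0.01)
w_fast, iters_fast = gradient_descent(xs, ys, w_init=0.0, learning_rate=0.05)
w_near, iters_near = gradient_descent(xs, ys, w_init=1.9, learning_rate=0.01)

print("lr=0.01, start=0.0:", iters_slow, "iterations")
print("lr=0.05, start=0.0:", iters_fast, "iterations")
print("lr=0.01, start=1.9:", iters_near, "iterations")
```

On this toy problem, both the larger learning rate and the starting point closer to the optimum reach the same answer in fewer iterations. A learning rate that is too large would instead make the updates diverge, which is part of why these choices matter.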

In this lesson, we’ll explore a technique called ordinary least squares estimation, or OLS estimation for short. Unlike gradient descent, OLS estimation provides a closed-form formula that directly calculates the parameter values that minimize the cost function, with no iteration required. To understand OLS estimation, we first need to frame our linear regression problem in matrix form. If you need a refresher on linear algebra, we teach the fundamentals in Linear Algebra for Machine Learning.
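As a rough illustration of what "matrix form" and "direct calculation" look like in practice, the sketch below solves the normal equations `XᵀX β = Xᵀy` with NumPy. The variable names (`X`, `y`, `beta`), the toy data, and the choice of `np.linalg.solve` are assumptions made for this example and are not taken from the lesson itself.

```python
import numpy as np

# Toy data generated from y = 1 + 2x (assumed example, no noise).
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * x

# Design matrix: a column of ones for the intercept, then the feature column.
X = np.column_stack([np.ones_like(x), x])

# Solve the normal equations (X^T X) beta = X^T y in a single step.
# Solving the linear system avoids forming an explicit matrix inverse.
beta = np.linalg.solve(X.T @ X, X.T @ y)

# Cross-check against NumPy's general least-squares solver.
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print("intercept, slope from normal equations:", beta)
print("intercept, slope from np.linalg.lstsq: ", beta_lstsq)
```

Both approaches should recover an intercept close to 1 and a slope close to 2. The normal-equations route assumes `XᵀX` is invertible; when features are strongly collinear, a solver such as `np.linalg.lstsq` is usually the more numerically stable choice.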

We’ll also dive into the mathematical derivation of the OLS estimation technique. scikit-learn uses this technique when you call `fit()` on a `LinearRegression` instance, so it’s useful to find out what is going on behind the scenes!
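As a preview, the derivation can be sketched in a few lines. The notation here is an assumption made for this outline: $X$ is the design matrix, $y$ the vector of targets, $\beta$ the parameter vector, and $n$ the number of observations. The lesson may scale the cost function differently, but the scaling does not change the minimizer.

```latex
\begin{aligned}
J(\beta) &= \frac{1}{n}\,(X\beta - y)^{\top}(X\beta - y)
  && \text{cost function (mean squared error)} \\
\nabla_{\beta} J(\beta) &= \frac{2}{n}\,X^{\top}(X\beta - y)
  && \text{derivative of the cost function} \\
X^{\top}X\,\hat{\beta} &= X^{\top}y
  && \text{set the gradient to zero} \\
\hat{\beta} &= (X^{\top}X)^{-1}X^{\top}y
  && \text{assuming } X^{\top}X \text{ is invertible}
\end{aligned}
```

Because the cost function is convex in $\beta$, the point where the gradient vanishes is a global minimum, which is why a single formula can stand in for the iterative search.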

As you work through each concept, you’ll apply what you’ve learned directly in your browser, so there’s no need to set up your own machine for the exercises. The Python environment in this course includes answer checking, so you can confirm that you’ve mastered each concept before moving on to the next.


  • Learn the theory behind the ordinary least squares algorithm.
  • Learn how to choose your model’s cost minimization function.

Lesson Outline

  1. Introduction
  2. Cost Function
  3. Derivative Of The Cost Function
  4. Gradient Descent vs. Ordinary Least Squares
  5. Next Steps
  6. Takeaways
