While exploring logistic regression, we briefly mentioned overfitting and the problems it can cause. In this mission, we'll explore how to identify overfitting and what you can do to avoid it. To explore overfitting, we'll use a dataset about cars that contains seven numerical features that could have an effect on a car's fuel efficiency.

In this mission, we will discuss two observable sources of error in a model that we can indirectly control: bias and variance. We'll also discuss overfitting at a deeper level, explore a good way to detect whether a model is showing signs of overfitting, and look at an example of a model that does. You'll also be introduced to related terminology that you'll see in other literature as you read more about overfitting. For more information about the bias-variance tradeoff, see our blog post on the topic.
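To make the idea of detecting overfitting concrete, here is a minimal sketch that compares a model's in-sample error with its cross-validation error. It is an illustration under stated assumptions, not the mission's own exercise: the data is synthetic (not the cars dataset), the model is a simple nearest-neighbor regressor chosen only because it fits in a few lines, and the helper names `knn_predict`, `mse`, and `cv_mse` are hypothetical.

```python
import random
import statistics

def knn_predict(train, x, k):
    """Predict y at x as the mean y of the k nearest training points."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return statistics.fmean(y for _, y in nearest)

def mse(train, test, k):
    """Mean squared error of a k-NN model built from `train`, scored on `test`."""
    return statistics.fmean((knn_predict(train, x, k) - y) ** 2 for x, y in test)

def cv_mse(data, k, folds=5):
    """Average held-out MSE across `folds` cross-validation splits."""
    scores = []
    for i in range(folds):
        test = data[i::folds]  # every folds-th point is held out
        train = [p for j, p in enumerate(data) if j % folds != i]
        scores.append(mse(train, test, k))
    return statistics.fmean(scores)

# Synthetic data (an assumption): a linear trend plus Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(100)]
data = [(x, 2.0 * x + rng.gauss(0, 1.0)) for x in xs]

# k=1 is the most flexible model here; k=10 averages more neighbors.
results = {}
for k in (1, 10):
    results[k] = (mse(data, data, k), cv_mse(data, k))
    print(f"k={k}: in-sample MSE={results[k][0]:.3f}, cross-validation MSE={results[k][1]:.3f}")
```

With k=1, every training point is its own nearest neighbor, so the in-sample error is zero while the cross-validation error is not. A large gap like this between in-sample and held-out error is the kind of sign of overfitting this mission looks for; a less flexible model typically shows a smaller gap.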

As you work through each concept, you’ll get to apply what you’ve learned from within your browser so that there's no need to use your own machine to do the exercises. The Python environment inside of this course includes answer checking so you can ensure that you've fully mastered each concept before learning the next concept.


  • Learn how to detect overfitting in a model.
  • Learn how to reason about the bias-variance tradeoff.

Mission Outline

1. Introduction
2. Bias and Variance
3. Bias-variance tradeoff
4. Multivariate models
5. Cross validation
6. Plotting cross-validation error vs. cross-validation variance
7. Conclusion
8. Next steps
9. Takeaways


Course Info:


The median completion time for this course is 5.6 hours.

This course requires a premium subscription and includes 1 free mission and 5 paid missions, one of which is a guided project. It is the 21st course in the Data Scientist in Python path.

