MISSION 135

Machine Learning Project Walkthrough: Making Predictions

In the previous lesson on preparing features for machine learning, we prepared a dataset by removing columns that had data leakage issues, contained redundant information, or required additional processing to turn into useful features. We also cleaned features that had formatting issues and converted categorical columns to dummy variables. The goal of that preparation was to produce features that can be fed into a machine learning algorithm, which will then predict whether or not a loan will be paid off on time.

In this mission, we will use our clean dataset to build and train machine learning models that make accurate predictions about our data. We will use the error metrics we learned about in the Logistic Regression mission to assess the quality of each model. We will also use an algorithm called Random Forest, which can work with nonlinear data and learn complex conditional rules.
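As a rough illustration of why the choice of error metric matters, the sketch below computes the true positive rate and false positive rate for a classifier that predicts every loan will be paid off. The labels are made-up toy data and the `rates` helper is a hypothetical name, not part of the course materials.

```python
def rates(actual, predicted):
    """Return (true positive rate, false positive rate) for 0/1 labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Toy labels (an assumption for illustration): 1 = paid off on time, 0 = not.
actual = [1, 1, 1, 1, 1, 1, 0, 0]
all_ones = [1] * len(actual)  # a model that approves every loan

accuracy = sum(a == p for a, p in zip(actual, all_ones)) / len(actual)
tpr, fpr = rates(actual, all_ones)
print(f"accuracy={accuracy:.2f} tpr={tpr:.2f} fpr={fpr:.2f}")
```

Here the accuracy looks respectable because most loans are paid off, yet the false positive rate shows that every bad loan was approved. That gap is the reason accuracy alone can mislead on imbalanced data.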

To facilitate building machine learning models and making predictions, we will be working with financial lending data from Lending Club. Lending Club is a marketplace for personal loans that matches borrowers who are seeking a loan with investors looking to lend money and make a return.

As you work through each concept, you’ll apply what you’ve learned from within your browser, so there's no need to use your own machine to do the exercises. The Python environment inside this course includes answer checking, so you can make sure you've mastered each concept before moving on to the next.

Objectives

  • Learn how to choose an error metric.
  • Learn how to train and test your model using common machine learning algorithms.
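A minimal sketch of what these objectives can look like in practice, assuming scikit-learn is installed. It uses synthetic, imbalanced data from `make_classification` as a stand-in for the Lending Club dataset, and the `evaluate` helper and all parameter values are illustrative assumptions rather than the course's own code.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

# Synthetic stand-in data: roughly 85% of rows belong to class 1.
X, y = make_classification(
    n_samples=600, n_features=8, weights=[0.15, 0.85], random_state=1
)


def evaluate(model):
    """Cross-validate a model and return (true positive rate, false positive rate)."""
    preds = cross_val_predict(model, X, y, cv=3)
    tp = int(((preds == 1) & (y == 1)).sum())
    fn = int(((preds == 0) & (y == 1)).sum())
    fp = int(((preds == 1) & (y == 0)).sum())
    tn = int(((preds == 0) & (y == 0)).sum())
    return tp / (tp + fn), fp / (fp + tn)


models = {
    # class_weight="balanced" penalizes mistakes on the rarer class more heavily.
    "logistic_regression": LogisticRegression(class_weight="balanced", max_iter=1000),
    "random_forest": RandomForestClassifier(class_weight="balanced", random_state=1),
}

results = {name: evaluate(model) for name, model in models.items()}
for name, (tpr, fpr) in results.items():
    print(f"{name}: tpr={tpr:.2f} fpr={fpr:.2f}")
```

Using `cross_val_predict` means every prediction comes from a model that did not see that row during training, so the reported rates are not inflated by fitting and scoring on the same data.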

Mission Outline

1. Recap
2. Picking an error metric
3. Picking an error metric
4. Class imbalance
5. Class imbalance
6. Logistic Regression
7. Cross Validation
8. Penalizing the classifier
9. Penalizing the classifier
10. Manual penalties
11. Random forests
12. Next Steps

Course Info:

Intermediate

The median completion time for this course is 6.6 hours.

This course requires a premium subscription and includes three Machine Learning Project Walkthroughs. It is the 25th course in the Data Scientist in Python path.
