Decision trees stand out among machine learning models for one characteristic: their visualizations are easier to understand than those of most other models, which makes them well suited to explaining insights to non-technical audiences.
In this course, you’ll learn the foundations of decision trees: identifying the key components of a tree, interpreting trees, classifying new observations, and calculating optimal thresholds for both classification and regression trees. You’ll also learn how to build and visualize decision trees by adapting a real-life dataset to train tree models, selecting the appropriate scikit-learn tools to build your model, and training, testing, and visualizing the resulting trees.
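To give a feel for what "calculating an optimal threshold" means for a classification tree, here is a minimal sketch in plain Python. It assumes Gini impurity as the split criterion and tries the midpoints between consecutive sorted values as candidate thresholds. The function names (`gini`, `best_threshold`) and the toy data are made up for illustration and are not taken from the course.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the node is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best split `value <= threshold`."""
    pairs = sorted(zip(values, labels))
    xs = [value for value, _ in pairs]
    ys = [label for _, label in pairs]
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if xs[i] == xs[i - 1]:
            continue  # identical values cannot be separated by a threshold
        threshold = (xs[i - 1] + xs[i]) / 2
        left, right = ys[:i], ys[i:]
        # Weight each child's impurity by its share of the samples.
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical toy data: hours studied and whether the exam was passed.
hours = [1, 2, 3, 6, 7, 8]
passed = ["no", "no", "no", "yes", "yes", "yes"]

threshold, impurity = best_threshold(hours, passed)
print(threshold, impurity)  # 4.5 0.0
```

A regression tree follows the same search, but would typically score each candidate split by the variance (or squared error) of the target in each child instead of Gini impurity.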
You’ll be able to evaluate and optimize trees for better performance, with activities such as establishing the optimal depth for a decision tree, pruning decision trees to avoid overfitting, and manipulating the sample distribution in nodes and leaves.
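As a rough sketch of what limiting depth and pruning can look like in scikit-learn, the example below compares an unrestricted tree, a depth-limited tree, and a tree pruned with cost-complexity pruning. The built-in breast cancer dataset is only a stand-in (an assumption, not the course data), and picking the middle value of `ccp_alphas` is an arbitrary choice for illustration; in practice you would select it by validation.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Stand-in dataset (assumption): any tabular classification data would do.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Unrestricted tree: grows until every leaf is pure, so it may overfit.
full_tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Pre-pruning: cap the depth and require a minimum number of samples per leaf.
shallow_tree = DecisionTreeClassifier(
    max_depth=3, min_samples_leaf=5, random_state=0
).fit(X_train, y_train)

# Post-pruning: compute the candidate alphas, then refit with one of them.
path = full_tree.cost_complexity_pruning_path(X_train, y_train)
alpha = path.ccp_alphas[len(path.ccp_alphas) // 2]  # arbitrary pick for the demo
pruned_tree = DecisionTreeClassifier(ccp_alpha=alpha, random_state=0).fit(
    X_train, y_train
)

for name, model in [
    ("full", full_tree),
    ("shallow", shallow_tree),
    ("pruned", pruned_tree),
]:
    print(
        f"{name}: depth={model.get_depth()}, leaves={model.get_n_leaves()}, "
        f"test accuracy={model.score(X_test, y_test):.3f}"
    )
```

The `min_samples_leaf` setting is one way to influence how samples are distributed across leaves; `min_samples_split` plays a similar role for internal nodes.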
Finally, you’ll learn how to apply cross-validation and ensemble techniques to decision trees. You’ll identify the differences between decision trees and random forest models, develop and customize random forest models, and optimize their parameters.
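For a sense of how cross-validation and a random forest fit together, here is a small sketch that scores a single decision tree and a random forest with 5-fold cross-validation. The wine dataset and the specific parameter values are assumptions chosen to keep the example short; they are not the course's data or settings.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Stand-in dataset (assumption): a small built-in classification problem.
X, y = load_wine(return_X_y=True)

models = {
    "decision tree": DecisionTreeClassifier(max_depth=4, random_state=0),
    "random forest": RandomForestClassifier(
        n_estimators=100, max_depth=4, random_state=0
    ),
}

cv_scores = {}
for name, model in models.items():
    # Each model is trained and scored on 5 different train/validation splits.
    scores = cross_val_score(model, X, y, cv=5)
    cv_scores[name] = scores
    print(f"{name}: mean accuracy {scores.mean():.3f} (std {scores.std():.3f})")
```

Averaging over folds gives a steadier estimate than a single train/test split, which matters when comparing a single tree with an ensemble of many trees.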
Best of all, you’ll learn by doing: you’ll practice and get feedback directly in the browser. At the end of the course, you’ll combine your new skills in a project to predict employee productivity with tree models.
- Creating, customizing, and visualizing Decision Trees
- Using and interpreting Decision Trees on new data
- Optimizing trees by altering their parameters
- Applying the Random Forest prediction technique
Decision Tree and Random Forest Modeling in Python [5 lessons]
- Apply cross validation techniques to decision trees.
- Find the optimal parameters of decision trees using grid search.
- Identify the differences between decision trees and random forest models.
- Develop and customize random forest models.
- Optimize the parameters of random forest.
- Identify the differences between random forest and extra trees.
- Determine the advantages and disadvantages of decision trees.
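The grid search objective above could look roughly like the following sketch, which uses scikit-learn's `GridSearchCV` to try every combination in a small parameter grid with 5-fold cross-validation. The dataset and the grid values are assumptions picked for illustration only.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Stand-in dataset (assumption), not the data used in the lessons.
X, y = load_breast_cancer(return_X_y=True)

# Hypothetical grid: 4 * 3 * 2 = 24 parameter combinations.
param_grid = {
    "max_depth": [2, 3, 4, 5],
    "min_samples_leaf": [1, 5, 10],
    "criterion": ["gini", "entropy"],
}

search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    cv=5,
    scoring="accuracy",
)
search.fit(X, y)

print("best parameters:", search.best_params_)
print("best cross-validated accuracy:", round(search.best_score_, 3))
```

The same pattern should work for an ensemble: swapping in `RandomForestClassifier` or `ExtraTreesClassifier` from `sklearn.ensemble` and adding `n_estimators` to the grid is one way to tune those models, at the cost of longer run times.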
- Clean and adapt a dataset for use in a decision tree
- Build and visualize a decision tree to determine key features
- Evaluate your trees using different metrics
- Optimize trees by adjusting their parameters
- Explain the results of a tree model to a non-technical audience
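As a hedged illustration of building a tree, visualizing it, and reading off the key features, the sketch below trains a shallow classifier and prints its rules as text. The iris dataset is an assumption used as a stand-in, and `export_text` is just one of scikit-learn's visualization options; `plot_tree` produces a graphical version if matplotlib is available.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# Stand-in dataset (assumption): the project itself uses different data.
iris = load_iris()

# A shallow tree stays readable, which helps when presenting it to others.
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(iris.data, iris.target)

# Text rendering of the tree: one line per split or leaf.
rules = export_text(tree, feature_names=list(iris.feature_names))
print(rules)

# Rank features by how much they reduce impurity across the tree.
ranked = sorted(
    zip(iris.feature_names, tree.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranked:
    print(f"{name}: {importance:.2f}")
```

Each printed rule reads as an if/else statement on a single feature, which is what makes a small tree straightforward to walk through with a non-technical audience.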
Projects in this course
Guided Project: Predicting Employee Productivity Using Tree Models
In this project, using the “Productivity Prediction of Garment Employees” dataset from the UCI Machine Learning Repository, we’ll determine the best working conditions to reach the expected productivity thresholds in a garment factory.