Feature Preparation, Selection, and Engineering
In the previous lesson, we made our first submission to Kaggle. Kaggle is a site where people create algorithms and compete against machine learning practitioners around the world. Your algorithm wins the competition if it's the most accurate on a particular data set. Using Kaggle and this Kaggle Fundamentals course, you will have a fun way to practice your machine learning skills.
In this lesson, we're going to focus on working with the features used in the model to boost the accuracy of our predictions. To do this, we'll start by looking at feature selection. Feature selection is important because it helps us exclude features that are poor predictors, as well as features that are closely correlated with each other.
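To make the idea concrete, here is a minimal sketch of automated feature selection using scikit-learn's RFECV on a toy dataset (the dataset and estimator here are illustrative, not the course's actual data):

```python
# A minimal sketch of feature selection with scikit-learn's RFECV.
# The toy dataset below stands in for real data: 8 features,
# only 3 of which are actually informative.
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=8,
                           n_informative=3, n_redundant=2,
                           random_state=0)

# RFECV recursively removes the weakest feature and uses
# cross-validation to decide which subset performs best.
selector = RFECV(LogisticRegression(max_iter=1000), cv=5)
selector.fit(X, y)

print("Features kept:", selector.support_)
print("Feature ranking:", selector.ranking_)
```

The `support_` attribute is a boolean mask over the original features, and `ranking_` assigns rank 1 to every feature that was kept.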
As you work through each concept, you'll get to apply what you've learned from within your browser, so there's no need to use your own machine to do the exercises. The Python environment inside of this course includes answer checking so you can ensure that you've fully mastered each concept before moving on to the next.
- Learn how to determine which features in your model are the most relevant to your predictions.
- Learn ways to reduce the number of features used to train your model and avoid overfitting.
- Learn techniques to create new features to improve the accuracy of your model.
- Preparing More Features
- Determining the Most Relevant Features
- Training a Model Using Relevant Features
- Submitting our Improved Model to Kaggle
- Engineering a New Feature Using Binning
- Engineering Features From Text Columns
- Finding Correlated Features
- Final Feature Selection Using RFECV
- Training a Model Using our Optimized Columns
- Submitting our Model to Kaggle
- Next Steps
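As a preview of one of the techniques listed above, engineering a new feature by binning a continuous column can be sketched as follows (a minimal example assuming pandas; the values and bin labels are illustrative, not the course's actual data):

```python
import pandas as pd

# Hypothetical continuous values, e.g. ages of passengers.
ages = pd.Series([2, 15, 23, 41, 67, 80])

# pd.cut slices the continuous values into labeled bins; each bin
# becomes a categorical value a model can treat as a distinct group.
labels = ["child", "teen", "adult", "senior"]
binned = pd.cut(ages, bins=[0, 12, 19, 60, 100], labels=labels)
print(binned.tolist())
# → ['child', 'teen', 'adult', 'adult', 'senior', 'senior']
```

Binning like this can help a model pick up non-linear patterns, such as survival rates differing by age group rather than varying smoothly with age.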