MISSION 188

Guided Project: Creating a Kaggle Workflow

In this course, we learned Kaggle fundamentals, then built and trained machine learning models and submitted their predictions to Kaggle. We also saw how feature selection and model selection affect the accuracy of a model's predictions.

In this guided project, we're going to explore a workflow to make competing in the Kaggle Titanic competition easier, using a pipeline of functions to reduce the number of dimensions you need to focus on.

While practicing what you learned in this course, you'll define a Kaggle workflow of your own. A defined workflow gives you a framework that makes iterating on ideas quicker and easier, so you can work more efficiently.

Working on guided projects will give you hands-on experience with real-world examples, so we encourage you not only to complete them, but to take the time to really understand the concepts.

These projects are meant to be challenging to better prepare you for the real world, so don't be discouraged if you have to refer back to previous lessons. If you haven't worked with Jupyter Notebook before or need a refresher, we recommend completing our Jupyter Notebook Guided Project before continuing.

As with all guided projects, we encourage you to experiment and extend your project, taking it in unique directions to make it a more compelling addition to your portfolio!

Objectives

  • Learn to use Jupyter Notebook while working with Kaggle competitions.
  • Learn why workflows are important for machine learning, and create a Kaggle workflow.
  • Learn how to use functions to automate and simplify repetitive machine learning tasks.
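As a rough illustration of the last objective, the sketch below chains a few small functions into a single preprocessing pipeline. Everything in it is an assumption made for illustration: the toy rows, the column names (borrowed loosely from the Titanic data), the helper names, and the age threshold are hypothetical and are not taken from the project itself. In practice each step would more likely operate on a pandas DataFrame than on a list of dictionaries.

```python
from functools import reduce

# Hypothetical toy rows standing in for Titanic passenger records.
rows = [
    {"Age": 22.0, "Sex": "male", "Fare": 7.25},
    {"Age": None, "Sex": "female", "Fare": 71.28},
    {"Age": 4.0, "Sex": "female", "Fare": 16.70},
]


def fill_missing_age(rows, default=-0.5):
    """Replace missing ages with a sentinel value (assumed default)."""
    return [
        {**r, "Age": default if r["Age"] is None else r["Age"]}
        for r in rows
    ]


def add_age_category(rows):
    """Bin ages into coarse labels; the cut-off of 18 is an assumption."""
    def label(age):
        if age < 0:
            return "Missing"
        if age < 18:
            return "Child"
        return "Adult"

    return [{**r, "Age_category": label(r["Age"])} for r in rows]


def encode_sex(rows):
    """Add a 0/1 indicator column for the 'Sex' value."""
    return [{**r, "Sex_male": int(r["Sex"] == "male")} for r in rows]


def run_pipeline(rows, steps):
    """Apply each step in order, feeding its output to the next step."""
    return reduce(lambda data, step: step(data), steps, rows)


processed = run_pipeline(rows, [fill_missing_age, add_age_category, encode_sex])
```

Because every step takes rows and returns new rows, steps can be reordered, removed, or swapped without touching the others, which is what makes iterating on ideas faster. The original `rows` list is left unmodified.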

Mission Outline

1. Introducing Data Science Workflows
2. Preprocessing the Data
3. Exploring the Data
4. Engineering New Features
5. Selecting the Best-Performing Features
6. Selecting and Tuning Different Algorithms
7. Making a Submission to Kaggle
8. Next Steps


Course Info:

Intermediate

The median completion time for this course is 5.9 hours.

This course requires a premium subscription and includes three missions and one guided project. It is the 28th course in the Data Scientist in Python path.
