Getting Started With Kaggle

In this first lesson of our Kaggle Fundamentals course, you'll start learning how to compete in Kaggle competitions. You'll learn how to:

  • Approach a Kaggle competition
  • Explore the competition data and learn about the competition topic
  • Prepare data for machine learning
  • Train a model
  • Measure the accuracy of your model
  • Prepare and make your first Kaggle submission
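
As a rough preview of the steps listed above, here is a minimal sketch using only the Python standard library. Everything in it is an assumption made for illustration: the passenger records are made up (they are not the competition data), the column names PassengerId, Sex, and Survived are assumed, and the helper functions train, predict, accuracy, and build_submission are hypothetical. The "model" simply memorizes the most common outcome for each value of Sex, which is far simpler than what the lesson itself builds.

```python
import csv
import io
from collections import Counter, defaultdict

# Made-up labelled records for illustration only (NOT the competition data).
labelled = [
    {"PassengerId": 1, "Sex": "male", "Survived": 0},
    {"PassengerId": 2, "Sex": "female", "Survived": 1},
    {"PassengerId": 3, "Sex": "female", "Survived": 1},
    {"PassengerId": 4, "Sex": "male", "Survived": 0},
    {"PassengerId": 5, "Sex": "male", "Survived": 1},
    {"PassengerId": 6, "Sex": "female", "Survived": 0},
    {"PassengerId": 7, "Sex": "male", "Survived": 0},
    {"PassengerId": 8, "Sex": "female", "Survived": 1},
]

def train(rows):
    """Hypothetical toy model: the majority outcome for each value of Sex."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row["Sex"]][row["Survived"]] += 1
    return {sex: c.most_common(1)[0][0] for sex, c in counts.items()}

def predict(model, row):
    """Look up the learned outcome; fall back to 0 for an unseen category."""
    return model.get(row["Sex"], 0)

def accuracy(model, rows):
    """Fraction of rows whose prediction matches the known outcome."""
    correct = sum(predict(model, row) == row["Survived"] for row in rows)
    return correct / len(rows)

# Keep the last rows aside so accuracy is measured on data the model never saw.
train_rows, holdout_rows = labelled[:6], labelled[6:]
model = train(train_rows)
holdout_accuracy = accuracy(model, holdout_rows)

# Unlabelled records standing in for the data we would submit predictions for.
unseen = [
    {"PassengerId": 9, "Sex": "female"},
    {"PassengerId": 10, "Sex": "male"},
]

def build_submission(model, rows):
    """Return CSV text with one prediction per passenger (assumed format)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["PassengerId", "Survived"])
    for row in rows:
        writer.writerow([row["PassengerId"], predict(model, row)])
    return buffer.getvalue()

submission_csv = build_submission(model, unseen)
print(f"Holdout accuracy: {holdout_accuracy:.2f}")
print(submission_csv)
```

With only eight made-up rows, the accuracy figure means nothing in itself. The point is the shape of the workflow: fit on one part of the labelled data, score on the rest, then write predictions for unlabelled rows to a file.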

This lesson assumes you have an understanding of Python and the pandas library. If you need to learn about these, we recommend going through our Python Fundamentals course and our NumPy and Pandas course.

In this lesson and the lessons that follow, we'll be working with RMS Titanic passenger data to predict which passengers survived the Titanic disaster. By the end of this lesson, you'll have created and trained your first Kaggle machine learning model.

Kaggle is a site where people create algorithms and compete against machine learning practitioners around the world. Your algorithm wins the competition if it's the most accurate on a particular data set. Kaggle is a fun way to practice your machine learning skills.

As you work through each concept, you'll apply what you've learned directly in your browser, so there's no need to use your own machine to do the exercises. The Python environment in this course includes answer checking so you can make sure you've mastered each concept before moving on to the next.


  • Learn how to approach a Kaggle competition and explore the competition data.
  • Learn techniques for cleaning and preparing data for machine learning.
  • Learn how to train a machine learning model and make your first Kaggle submission.

Mission Outline

1. Introduction to Kaggle
2. Exploring the Data
3. Exploring and Converting the Age Column
4. Preparing our Data for Machine Learning
5. Creating Our First Machine Learning Model
6. Splitting Our Training Data
7. Making Predictions and Measuring their Accuracy
8. Using Cross Validation for More Accurate Error Measurement
9. Making Predictions on Unseen Data
10. Creating a Submission File
11. Making Our First Submission to Kaggle
12. Next Steps
13. Takeaway


Course Info:


The median completion time for this course is 5.9 hours.

This course requires a premium subscription and includes three missions and one guided project. It is the 28th course in the Data Scientist in Python path.


