Course overview

Big data is all around us, and Apache Spark is quickly becoming an in-demand big data tool that employers look for.
In this course, you’ll learn the advantages of Apache Spark. You’ll learn concepts such as Resilient Distributed Datasets (RDDs), Spark SQL, Spark DataFrames, and the difference between pandas and Spark DataFrames.

You’ll also learn how to install Spark and PySpark, a Python API that allows you to interact with Spark using Python code. You’ll learn how to integrate PySpark with Jupyter Notebook so you can analyze large datasets.

Best of all, you’ll learn by doing: you’ll practice and get feedback directly in the browser. You’ll work with a variety of real-world datasets, including the text of Hamlet, census data, and guest data from The Daily Show.

Key skills

    • Breaking down tasks using the map-reduce framework
    • Processing and transforming large raw files using Spark
    • Working with large, unstructured datasets using Spark SQL and Spark DataFrames
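
To give a feel for the first skill, here is a minimal sketch of the map-reduce idea in plain Python rather than Spark. The sample lines and the helper names `map_line` and `merge_counts` are illustrative assumptions, not course material. Each line is mapped to partial word counts on its own, and the partial results are then reduced into a single total.

```python
from collections import Counter
from functools import reduce

# Hypothetical sample input; not taken from the course datasets.
lines = [
    "to be or not to be",
    "that is the question",
]


def map_line(line):
    """Map step: turn one line into its own word counts."""
    return Counter(line.split())


def merge_counts(left, right):
    """Reduce step: combine two partial results into one."""
    return left + right


# Map every line independently, then fold the partial counts together.
word_counts = reduce(merge_counts, map(map_line, lines), Counter())

print(word_counts.most_common(3))
```

Because each line is mapped without reference to any other line, the map step could in principle run on many machines at once, which is the property a framework like Spark builds on.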

The Dataquest guarantee

  • Dataquest has helped thousands of people start new careers in data. If you put in the work and follow our course, you'll master data skills and grow your career.

  • We believe so strongly in our courses that we offer a full satisfaction guarantee. If you complete a career course on Dataquest and aren't satisfied with your outcome, we'll give you a refund.

Master skills faster with Dataquest

Go from zero to job-ready

Learn exactly what you need to achieve your goal. Don’t waste time on unrelated lessons.

Build your project portfolio

Build confidence with our in-depth projects, and show off your data skills.

Challenge yourself with exercises

Work with real data from day one with interactive lessons and hands-on exercises.

Showcase your course certification

Impress employers by completing a capstone project and certifying it with an expert review.

Learning resources

How to Learn Data Science

Learn a proven method to acquire and use data skills.

Dataquest Reviews

Read more about what learners think of Dataquest.

How To Learn Python

A step-by-step guide to learning and mastering Python.

Python Practice

65 different Python practice exercises you can start today.

Join 1M+ data learners on

  1. Sign up for a free account
  2. Choose a course or path
  3. Learn with hands-on exercises
  4. Apply your skills

Start learning with a free account today.
