Course

About this course

In our Parallel Processing course, you’ll learn how to improve the performance of your code by processing data in parallel rather than iterating through rows sequentially.

You’ll learn how to start multiple processes, run functions on them at the same time, and share data between them. You’ll also learn how to use a process pool executor, practicing all of these skills as you dig into data about the demand for data engineering jobs.
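To give a feel for what a process pool executor does, here is a minimal sketch using Python's standard `concurrent.futures.ProcessPoolExecutor`. It is not taken from the course materials: the sample job titles, the `count_matches` and `count_in_parallel` helpers, and the keyword are all made up for illustration.

```python
from concurrent.futures import ProcessPoolExecutor

# Hypothetical sample data: job titles already split into chunks.
CHUNKS = [
    ["Data Engineer", "Data Analyst", "Senior Data Engineer"],
    ["ML Engineer", "Data Engineer"],
    ["Analytics Engineer", "Data Scientist"],
]


def count_matches(titles, keyword="data engineer"):
    """Count the titles in one chunk that contain the keyword (case-insensitive)."""
    return sum(keyword in title.lower() for title in titles)


def count_in_parallel(chunks, max_workers=2):
    """Hand the chunks to a pool of worker processes and add up the per-chunk counts."""
    with ProcessPoolExecutor(max_workers=max_workers) as executor:
        # executor.map returns results in the same order as the input chunks.
        return sum(executor.map(count_matches, chunks))


if __name__ == "__main__":
    # The guard keeps worker processes from re-running this block on import.
    print(count_in_parallel(CHUNKS))
```

On data this small, the cost of starting processes outweighs any gain. The pattern only pays off when each chunk involves real work.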

Then you’ll learn about MapReduce: how to use process pools with it, how to implement it yourself, and how to process data effectively with it.
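As a rough illustration of the MapReduce idea, the sketch below splits the input into chunks, maps a function over the chunks in a process pool, and reduces the partial results into one answer. It assumes a simple word-count task. The helper names (`make_chunks`, `map_count`, `reduce_counts`, `map_reduce`) and the sample lines are hypothetical, not the course's own implementation.

```python
from collections import Counter
from concurrent.futures import ProcessPoolExecutor
from functools import reduce


def make_chunks(items, num_chunks):
    """Split items into at most num_chunks contiguous pieces of near-equal size."""
    size = max(1, -(-len(items) // num_chunks))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def map_count(lines):
    """Map step: count the words in one chunk of lines."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two partial word counts."""
    return left + right


def map_reduce(lines, num_workers=2):
    """Run the map step in worker processes, then fold the partial results together."""
    chunks = make_chunks(lines, num_workers)
    with ProcessPoolExecutor(max_workers=num_workers) as executor:
        partials = list(executor.map(map_count, chunks))
    return reduce(reduce_counts, partials, Counter())


if __name__ == "__main__":
    # Hypothetical input; real data would be read from files.
    sample = ["the quick brown fox", "the lazy dog", "the end"]
    print(map_reduce(sample).most_common(1))
```

The map step runs in parallel because each chunk is independent. The reduce step here runs in the parent process, which is usually fine when the partial results are small.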

At the end of the course, you’ll complete a project that challenges you to use your new skills to dig into data from Wikipedia pages and analyze it quickly and efficiently using MapReduce.

In this parallel processing course, you will:

  • Learn how to process data in parallel
  • Learn how to implement MapReduce
  • Learn how to solve problems with MapReduce

Thousands of learners have changed their careers with Dataquest

  • 97% of learners recommend Dataquest for career advancement
  • 4.9 stars: Dataquest rating on G2Crowd and SwitchUp
  • $30k average salary boost for learners who complete a path

Join a community of 1M+ data learners on Dataquest
