Building a Data Pipeline

In this course, you’ll learn how to build data pipelines using Python. A data pipeline is an automated chain of operations performed on data, and building one will save you time and eliminate repetitive tasks. By the end, you’ll know how to write a robust data pipeline with a scheduler in Python.

Part of the Data Engineer (Python) path.

4.8 (359 reviews)

11,251 learners enrolled in this course.

Intermediate friendly
4 hours
Self paced
5 lessons
1 project

“The beauty of Dataquest is that it starts at the most basic level, so a true beginner can understand the concepts”

Aaron Melton

Business Analyst @Aditi Consulting

Course overview

In this course, you’ll learn how to build a simple data pipeline using imperative and functional paradigms. You’ll also learn how to use functional closures in Python, how to implement a well-designed pipeline API, how to write decorators, and how to apply them to functions.
At the end of the course, you’ll work on a real-world project, using a data pipeline to summarize Hacker News data. This project is a chance for you to combine the skills you learned in this course and build a real-world data pipeline from raw data to summarization.

Key skills

Writing a robust pipeline with a scheduler in Python
Using advanced Python concepts like closures, decorators, and more

Course outline

Building a Data Pipeline [5 lessons]

Functional Programming 2h

Lesson Objectives

Identify the differences between imperative and functional programming
Write code in Python in a functional style using map, reduce, and filter
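
To give a feel for the functional style this lesson covers, here is a minimal sketch that chains filter, map, and reduce. The function name and the sample numbers are made up for illustration and are not taken from the course material.

```python
from functools import reduce


def sum_of_even_squares(numbers):
    """Sum the squares of the even values without mutating any state."""
    # filter keeps only the even numbers.
    evens = filter(lambda n: n % 2 == 0, numbers)
    # map squares each remaining number.
    squares = map(lambda n: n * n, evens)
    # reduce folds the squares into a single total, starting from 0.
    return reduce(lambda acc, n: acc + n, squares, 0)


print(sum_of_even_squares([1, 2, 3, 4, 5, 6]))  # 4 + 16 + 36 = 56
```

Each step takes a value and returns a new one, which is what makes the stages easy to reorder or reuse as pipeline tasks.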

Pipeline Tasks 2h

Lesson Objectives

Build task functions in a pipeline
Write functions using the functional paradigm

Building a Pipeline Class 1h

Lesson Objectives

Implement functional closures in Python
Apply decorators to functions
Implement a pipeline API
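
One way these three objectives can fit together is sketched below: a small class whose task() method returns a closure that is used as a decorator to register functions. The Pipeline class, its method names, and the two example tasks are assumptions made for this sketch. The API you build in the lesson may look different.

```python
class Pipeline:
    """Hypothetical minimal pipeline: tasks run in the order they were registered."""

    def __init__(self):
        self.tasks = []

    def task(self):
        # inner is a closure: it captures self, so it can register
        # the decorated function on this particular pipeline.
        def inner(func):
            self.tasks.append(func)
            return func
        return inner

    def run(self, data):
        # Feed the output of each task into the next one.
        for func in self.tasks:
            data = func(data)
        return data


pipeline = Pipeline()


@pipeline.task()
def strip_whitespace(lines):
    return [line.strip() for line in lines]


@pipeline.task()
def drop_empty(lines):
    return [line for line in lines if line]


print(pipeline.run(["  a ", "   ", " b"]))  # ['a', 'b']
```

Registering tasks through a decorator keeps each task a plain function that you can still call and test on its own.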

Multiple Dependency Pipeline 1h

Lesson Objectives

Define graph theory
Implement a directed acyclic graph in Python
Write a scheduler for the pipeline class
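
As a rough illustration of how a directed acyclic graph can drive task scheduling, the sketch below stores dependencies in a dictionary and orders them with Kahn's algorithm for topological sorting. The DAG class, its method names, and the task names are hypothetical, and the lesson may use a different approach.

```python
from collections import deque


class DAG:
    """Hypothetical minimal DAG: maps each node to the nodes that depend on it."""

    def __init__(self):
        self.graph = {}

    def add(self, node, to=None):
        # Register the node, and optionally an edge node -> to.
        self.graph.setdefault(node, [])
        if to is not None:
            self.graph.setdefault(to, [])
            self.graph[node].append(to)

    def sort(self):
        # Count how many unmet dependencies each node has.
        in_degree = {node: 0 for node in self.graph}
        for targets in self.graph.values():
            for target in targets:
                in_degree[target] += 1

        # Start with the nodes that depend on nothing.
        queue = deque(node for node, degree in in_degree.items() if degree == 0)
        order = []
        while queue:
            node = queue.popleft()
            order.append(node)
            for target in self.graph[node]:
                in_degree[target] -= 1
                if in_degree[target] == 0:
                    queue.append(target)

        # Any node left unscheduled means the graph contains a cycle.
        if len(order) != len(self.graph):
            raise ValueError("graph contains a cycle")
        return order


dag = DAG()
dag.add("load", "clean")
dag.add("clean", "summarize")
dag.add("load", "summarize")
print(dag.sort())  # ['load', 'clean', 'summarize']
```

A scheduler can then run tasks in the returned order, so every task starts only after the tasks it depends on have finished.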

Guided Project: Hacker News Pipeline 1h

Lesson Objectives

Process JSON API data in Python
Expand your portfolio by building a data pipeline

Projects in this course

Hacker News Pipeline

For this project, we’ll step into the role of data engineers to process Hacker News posts using Python. We’ll apply skills in JSON parsing, string cleaning, and building data pipelines.
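
To make the JSON parsing and string cleaning steps concrete, here is a small sketch. The sample JSON string and the "title" field are invented for illustration. The real Hacker News data used in the project may be shaped differently.

```python
import json
import string
from collections import Counter

# Hypothetical sample input, not real Hacker News data.
raw = '[{"title": "Show HN: A Python pipeline!"}, {"title": "Ask HN: Python or Go?"}]'


def title_word_counts(raw_json):
    """Parse a JSON list of posts and count the words used in their titles."""
    posts = json.loads(raw_json)
    # Translation table that deletes every ASCII punctuation character.
    remove_punctuation = str.maketrans("", "", string.punctuation)
    counts = Counter()
    for post in posts:
        cleaned = post["title"].lower().translate(remove_punctuation)
        counts.update(cleaned.split())
    return counts


counts = title_word_counts(raw)
print(counts["python"])  # 2
```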

The Dataquest guarantee

Dataquest has helped thousands of people start new careers in data. If you put in the work and follow our path, you’ll master data skills and grow your career.

We believe so strongly in our paths that we offer a full satisfaction guarantee. If you complete a career path on Dataquest and aren’t satisfied with your outcome, we’ll give you a refund.

Master skills faster with Dataquest

Go from zero to job-ready

Learn exactly what you need to achieve your goal. Don’t waste time on unrelated lessons.

Build your project portfolio

Build confidence with our in-depth projects, and show off your data skills.

Challenge yourself with exercises

Work with real data from day one with interactive lessons and hands-on exercises.

Showcase your path certification

Share the evidence of your hard work with your network and potential employers.

Grow your career with Dataquest.

98% of learners recommend Dataquest for career advancement

4.85 Dataquest rating, SwitchUp Best Bootcamps

$30k average salary boost for learners who complete a path