Processing Large Datasets in Pandas

Learn how to work with medium-sized datasets by optimizing your pandas workflow, processing data in batches, and augmenting pandas with SQLite.

By the end of this course, you'll be able to:

  • Reduce the memory footprint of a pandas DataFrame.
  • Process large DataFrames in chunks and with SQLite.

Course Info:

The average completion time for this course is 10 hours.

This course requires a premium subscription and includes 2 free missions, 2 paid missions, and 1 guided project. It is the 3rd course in the Data Engineer path.


Optimizing a DataFrame's Memory Footprint

Learn how to reduce a DataFrame's memory footprint by selecting the correct data types.
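
As a rough illustration of what selecting data types can look like, the sketch below builds a small made-up DataFrame (the column names and values are hypothetical, not course data), downcasts its numeric columns, and converts a repetitive text column to the category type. It is one possible approach, not necessarily the one the mission teaches.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column names are illustrative only.
df = pd.DataFrame({
    "year": np.arange(1000, dtype="int64") % 50 + 1970,
    "price": np.linspace(0.0, 100.0, 1000),
    "category": ["a", "b", "c", "d"] * 250,
})

# deep=True counts the actual bytes used by string objects.
before = df.memory_usage(deep=True).sum()

optimized = df.copy()
# Downcast integers to the smallest integer subtype that holds the values.
optimized["year"] = pd.to_numeric(optimized["year"], downcast="integer")
# Downcast floats to a smaller float subtype where possible.
optimized["price"] = pd.to_numeric(optimized["price"], downcast="float")
# Columns with few distinct values are often cheaper as categoricals.
optimized["category"] = optimized["category"].astype("category")

after = optimized.memory_usage(deep=True).sum()
print(f"{before} bytes -> {after} bytes")
```

Downcasting floats trades precision for space, so it is worth checking that the smaller type is acceptable for the analysis at hand.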

Processing DataFrames in Chunks

Learn how to break a problem down into DataFrame chunks.
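
A minimal sketch of the chunked approach, assuming a CSV with a single hypothetical `amount` column (generated in memory here so the example is self-contained). Each chunk is a small DataFrame; the running totals are combined at the end, so the full dataset never has to be in memory at once.

```python
import io
import pandas as pd

# Stand-in for a large CSV file; "amount" is a made-up column name.
csv_data = io.StringIO("amount\n" + "\n".join(str(i) for i in range(1, 101)))

total = 0
rows = 0
# chunksize makes read_csv return an iterator of DataFrames of up to 25 rows.
for chunk in pd.read_csv(csv_data, chunksize=25):
    total += chunk["amount"].sum()
    rows += len(chunk)

# Combine the per-chunk results into the final answer.
mean_amount = total / rows
print(mean_amount)
```

The same pattern works for any calculation that can be split into per-chunk pieces and then combined, such as sums, counts, and value counts.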

Practice Optimizing DataFrames and Processing in Chunks

Practice optimizing DataFrame types and working in chunks.

Augmenting Pandas with SQLite

Analyzing Startup Fundraising Deals from Crunchbase

Practice analyzing data using the pandas SQLite workflow.
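
One way the pandas SQLite workflow might look is sketched below: rows are read in chunks, appended to a SQLite table, and then queried back into a DataFrame. The table name `deals`, the columns `company` and `raised_usd`, and the values are all invented for illustration and do not come from the Crunchbase data.

```python
import io
import sqlite3
import pandas as pd

# Made-up stand-in for a large CSV; not real fundraising data.
csv_data = io.StringIO(
    "company,raised_usd\nA,100\nB,250\nA,50\nC,300\nB,25\n"
)

# An in-memory database keeps the example self-contained;
# a file path would persist the data on disk instead.
conn = sqlite3.connect(":memory:")

# Load the data chunk by chunk so only one chunk is in memory at a time.
for chunk in pd.read_csv(csv_data, chunksize=2):
    chunk.to_sql("deals", conn, if_exists="append", index=False)

# Let SQLite do the aggregation and return only the small result.
query = """
SELECT company, SUM(raised_usd) AS total_raised
FROM deals
GROUP BY company
ORDER BY total_raised DESC
"""
totals = pd.read_sql_query(query, conn)
conn.close()
print(totals)
```

Pushing the aggregation into SQLite means pandas only receives the summarized rows, which is the main appeal of combining the two tools.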