COURSE

Optimizing Code Performance On Large Datasets

In our Optimizing Code Performance On Large Datasets course, you'll learn how to speed up your code by addressing CPU and I/O bottlenecks and by running work in parallel.

You'll learn concepts such as CPU and I/O bounds and how they limit your code performance. You'll also practice analyzing data in parallel and see how multithreading can help you overcome the limits of CPU and I/O bounds.
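As a rough illustration of the I/O-bound case, here is a minimal sketch in which threads overlap waiting time. The `fake_download` and `fetch_all` functions and the 0.2-second delay are hypothetical stand-ins for real I/O such as a network request; they are not taken from the course material.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_download(page_id):
    """Hypothetical stand-in for an I/O call: the thread just waits."""
    time.sleep(0.2)  # assumed delay, simulating network or disk latency
    return f"page-{page_id}"


def fetch_all(page_ids, max_workers=4):
    """Run the simulated downloads on a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fake_download, page_ids))


if __name__ == "__main__":
    start = time.perf_counter()
    pages = fetch_all(range(4))
    elapsed = time.perf_counter() - start
    # Four sequential calls would wait about 0.8 seconds in total;
    # the threaded version waits on all four at the same time.
    print(pages, f"{elapsed:.2f}s")
```

Because each thread spends its time waiting rather than computing, the waits overlap. The same approach would not be expected to speed up pure computation, which is where CPU bounds come in.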

Then you'll learn the difference between a process and a thread and what the Python GIL is. You'll also be introduced to a multiprocessing library in Python and use parallel processing to analyze a dataset of movie quotes.
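For CPU-bound work, the GIL lets only one thread execute Python bytecode at a time, so separate processes are the usual way to use several cores. The sketch below uses the standard-library `multiprocessing` module; the `count_primes` and `parallel_count` functions are hypothetical examples chosen for illustration, not the course's own exercise.

```python
from multiprocessing import Pool


def count_primes(limit):
    """Count primes below `limit` by trial division (deliberately CPU-heavy)."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def parallel_count(limits, processes=2):
    """Compute count_primes for each limit in a separate worker process."""
    with Pool(processes=processes) as pool:
        return pool.map(count_primes, limits)


if __name__ == "__main__":
    # The guard matters: on platforms that spawn workers, each child
    # re-imports this module and must not start its own pool.
    print(parallel_count([20_000, 30_000, 40_000, 50_000]))
```

Each worker process has its own interpreter and its own GIL, so the computations can run on different cores. The trade-off is that arguments and results have to be pickled and passed between processes.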

At the end of the course, you'll complete a project using threads and processes to optimize code and analyze Wikipedia pages more quickly. This project is a chance for you to combine the skills you learned in this course and use parallel processing to analyze pages on the web. It would also make a great portfolio project to show off your data engineering and parallel processing skills.

By the end of this course, you'll be able to:

  • Understand how CPU and I/O bounds limit your code performance.
  • Understand how multithreading can help you overcome the limits of CPU and I/O bounds.
  • Analyze data in parallel.

Learn how to Optimize Code Performance

CPU Bound Programs

Learn how to process data more quickly by understanding CPU bounds.

I/O Bound Programs

Learn how I/O bounds limit your code performance and how multithreading can help you overcome them.

Overcoming the Limitations of Threads

Learn the difference between processes and threads and when to use processes.

Quickly Analyzing Data with Parallel Processing

Learn how to combine processes and threads to quickly analyze a dataset of movie quotes.

Analyzing Wikipedia Pages

Use threads and processes to analyze Wikipedia pages more quickly.