MISSION 171

Quickly Analyzing Data With Parallel Processing

Learn how to combine processes and threads to quickly analyze a dataset of movie quotes.


Objectives

  • How to choose which library to use for parallel computing.
  • Why and how to use process pools.
  • How to debug parallel processing code.

Mission Outline

1. Movie Quotes Data
2. The Concurrent Futures Package
3. Reading In Files
4. Finding The Longest Lines
5. Finding The Most Commonly Used Word
6. Debugging Errors
7. Debugging Errors
8. Removing Punctuation
9. Finding Word Frequencies
10. Next Steps
11. Takeaways
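
As a rough illustration of the kind of code the outline above points toward, here is a minimal sketch using Python's standard concurrent.futures package. The helper names (longest_line, map_in_pool) and the idea of passing each file's text as a string are assumptions made for this example, not the course's actual code or dataset layout.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def longest_line(text):
    """Return the longest line in a block of text ("" if there are no lines)."""
    return max(text.splitlines(), key=len, default="")


def map_in_pool(func, items, executor_cls=ThreadPoolExecutor, workers=2):
    """Apply func to every item using a pool of workers; results keep input order.

    executor_cls is a hypothetical parameter for this sketch: pass
    ThreadPoolExecutor for I/O-bound work (such as reading files) or
    ProcessPoolExecutor for CPU-bound work (such as scanning text).
    """
    with executor_cls(max_workers=workers) as pool:
        return list(pool.map(func, items))


# Example call with a process pool. Keep it under a main guard when run as a
# script, because worker processes may re-import this module:
#
# if __name__ == "__main__":
#     texts = ["placeholder line\na longer placeholder line", "short\nx"]
#     print(map_in_pool(longest_line, texts, ProcessPoolExecutor))
```

Both executor classes share the same map interface, so switching between threads and processes is a one-argument change. That makes it easy to time each option on a given task before committing to one.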


Course Info:

Optimizing Code Performance On Large Datasets

Intermediate

The average completion time for this course is 10 hours.

This course requires a premium subscription and includes four missions and one guided project. It is the 4th course in the Data Engineer path.
