Working with Missing Data

In our Working with Missing Data mission, you will learn to identify and deal with missing and incorrect data. More specifically, you will use Python and pandas to find missing data, decide how best to correct it, and perform the required cleaning. You’ll identify missing data both with code and with visualizations built using matplotlib and seaborn, two powerful visualization libraries.
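As a rough illustration of what identifying missing data with pandas can look like, here is a minimal sketch. The small dataframe and its column names are made up for this example and are not the actual NYPD Motor Vehicle Collisions columns; only `DataFrame.isnull()` and `sum()` are standard pandas calls.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
collisions = pd.DataFrame({
    "borough": ["BROOKLYN", None, "QUEENS", None],
    "pedestrians_injured": [0, 1, np.nan, 2],
    "total_injured": [0, 1, 1, 2],
})

# isnull() marks each missing cell as True; summing counts them per column.
null_counts = collisions.isnull().sum()

# Express the counts as a percentage of all rows.
null_pct = null_counts / len(collisions) * 100

print(pd.DataFrame({"null_counts": null_counts, "null_pct": null_pct}))
```

Looking at counts and percentages side by side is one simple way to decide which columns need attention first.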

Additionally, you will learn to fill in missing data either through imputation or by using external data. Data analysts and data scientists often need to account for duplicate or missing data when analyzing data drawn from multiple systems.
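To make the idea of imputation concrete, here is a small sketch that fills a missing total from the sum of its component columns. The data and column names are hypothetical, and this is only one possible imputation strategy, not necessarily the exact approach taught in the mission.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
injured = pd.DataFrame({
    "pedestrians_injured": [0, 1, 0, 2],
    "cyclist_injured": [0, 0, 1, 0],
    "motorist_injured": [1, 0, 0, 1],
    "total_injured": [1, np.nan, 1, np.nan],
})

component_cols = ["pedestrians_injured", "cyclist_injured", "motorist_injured"]

# Recompute the total from its parts, row by row.
manual_sum = injured[component_cols].sum(axis=1)

# Fill only the missing totals; existing values are left untouched.
injured["total_injured"] = injured["total_injured"].fillna(manual_sum)

print(injured)
```

Passing a Series to `fillna()` aligns on the index, so each missing total is replaced by the sum computed for the same row.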

In this mission, you will work with NYPD Motor Vehicle Collisions data to get a thorough overview of how to identify and fill in missing data. Because you’ll be working with real-world data, you will get the opportunity to think like a data analyst or data scientist as you explore a dataset. By the end of this mission, you will have a better working knowledge of how to find, visualize, and fill in missing values.


  • Identify missing data using both code and visualization.
  • Replace missing data using imputation.
  • Fill in missing values by using external data.
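The last objective, filling in values from external data, could be sketched as follows. Both dataframes, the `unique_key` column, and the `supplemental` source are assumptions made up for this example; the idea is simply that a second dataset sharing a key can supply values the first one lacks.

```python
import pandas as pd

# Hypothetical main dataset with some missing boroughs.
collisions = pd.DataFrame({
    "unique_key": [101, 102, 103],
    "borough": ["BRONX", None, None],
})

# Hypothetical external dataset covering the same records.
supplemental = pd.DataFrame({
    "unique_key": [101, 102, 103],
    "borough": ["BRONX", "QUEENS", None],
})

# Index both on the shared key so that values line up by record.
collisions = collisions.set_index("unique_key")
supplemental = supplemental.set_index("unique_key")

# Fill gaps in the main data with the external values where available.
collisions["borough"] = collisions["borough"].fillna(supplemental["borough"])

print(collisions)
```

Records that are missing in both sources stay missing, which is a useful reminder to recheck the null counts after any fill.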

Lesson Outline

  1. Introduction
  2. Verifying the Total Columns
  3. Filling and Verifying the Killed and Injured Data
  4. Assigning the Corrected Data Back to the Main Dataframe
  5. Visualizing Missing Data with Plots
  6. Analyzing Correlations in Missing Data
  7. Finding the Most Common Values Across Multiple Columns
  8. Filling Unknown Values with a Placeholder
  9. Missing Data in the “Location” Columns
  10. Imputing Location Data
  11. Next Steps
  12. Takeaways
