Working with Missing Data

In our Working with Missing Data mission, you will learn to identify and deal with missing and incorrect data. More specifically, you will learn how to use R to identify missing data, and you’ll learn to decide how best to correct the data and perform the required cleaning.
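As a rough illustration of what identifying missing data in R can look like, here is a minimal base R sketch. The `collisions` data frame and its column names are hypothetical stand-ins, not the actual NYPD data used in the mission.

```r
# Hypothetical toy data standing in for a collisions table (not the real data set)
collisions <- data.frame(
  borough = c("BRONX", NA, "QUEENS", "BROOKLYN"),
  pedestrians_injured = c(0, 1, NA, 2),
  cyclists_injured = c(NA, 0, 0, 1),
  stringsAsFactors = FALSE
)

# is.na() returns a logical matrix; colSums() counts the TRUE values per column
na_counts <- colSums(is.na(collisions))
print(na_counts)

# complete.cases() flags rows with no missing values; negate it to keep the incomplete rows
incomplete_rows <- collisions[!complete.cases(collisions), ]
print(incomplete_rows)
```

Counting per column shows which fields are affected, while the incomplete rows show whether the gaps cluster in the same records.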

In addition to identifying missing data with R code, you will learn to spot it in visualizations built with ggplot2.
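One possible way to visualize missing values is a tile heatmap of the `is.na()` matrix. The sketch below is an assumption about the general approach, not the mission's own code; the `toy` data frame is hypothetical, and the plot is only built if ggplot2 is installed.

```r
# Hypothetical toy data with a few gaps
toy <- data.frame(a = c(1, NA, 3), b = c(NA, NA, 6))

# Logical matrix: TRUE where a value is missing
miss <- is.na(toy)

# Reshape to long format, one row per cell.
# as.vector() walks the matrix column by column, so the row and column
# labels below are built in the same column-major order.
miss_long <- data.frame(
  row = rep(seq_len(nrow(miss)), times = ncol(miss)),
  column = rep(colnames(miss), each = nrow(miss)),
  missing = as.vector(miss)
)

# Build the heatmap only when ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(
    miss_long,
    ggplot2::aes(x = column, y = row, fill = missing)
  ) +
    ggplot2::geom_tile() +
    ggplot2::scale_y_reverse()  # put the first row at the top, like a table
  # print(p) would draw the plot
}
```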

Once you’ve grasped identifying missing data, you’ll learn different techniques for filling in the holes in your data set. You’ll try out various statistical methods that can be used for imputation, and you’ll also learn to fill in the gaps by supplementing your data set with external data. Accounting for duplicate or missing data is a common task data scientists face when analyzing data drawn from multiple systems, so it’s a crucial data cleaning skill to understand.
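To make the two filling strategies concrete, here is a minimal base R sketch. It assumes mean imputation as the statistical method and a ZIP-code-to-borough lookup as the external data. Both are illustrative choices, and the data frame, column names, and lookup table are hypothetical.

```r
# Hypothetical toy data with one missing borough and one missing count
collisions <- data.frame(
  zip = c("10001", "11201", "10451", "11201"),
  borough = c("MANHATTAN", NA, "BRONX", "BROOKLYN"),
  injured = c(2, NA, 0, 1),
  stringsAsFactors = FALSE
)

# Imputation: replace missing counts with the mean of the observed values
mean_injured <- mean(collisions$injured, na.rm = TRUE)
collisions$injured[is.na(collisions$injured)] <- mean_injured

# External data: a hypothetical lookup table mapping ZIP code to borough
zip_lookup <- c("10001" = "MANHATTAN", "11201" = "BROOKLYN", "10451" = "BRONX")

# Fill only the rows where borough is missing, using the row's ZIP code as the key
missing_borough <- is.na(collisions$borough)
collisions$borough[missing_borough] <- zip_lookup[collisions$zip[missing_borough]]

print(collisions)
```

Mean imputation is simple but can distort count data, so whether it suits a given column is a judgment call rather than a default.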

In this mission, you will be working to identify and fill in the gaps in real NYPD Motor Vehicle Collisions data. Thinking like a data scientist, you’ll explore the data set and practice your new skills and decision-making to turn this messy data into something that’s ready for real analysis.


  • Identify missing data using both code and visualization.
  • Replace missing data using imputation.
  • Fill in missing values by using external data.

Lesson Outline

  1. Introduction
  2. Summing Values over Rows
  3. Verifying the Total Columns
  4. Filling and Verifying the Killed and Injured Data
  5. Preparing Data for Missing Data Visualization
  6. Visualizing Missing Data with Heatmaps
  7. Visualizing Correlation Matrix with Heatmaps
  8. Analyzing Correlations in Missing Data
  9. Finding the Most Common Values Across Multiple Columns
  10. Filling Unknown Values with a Placeholder
  11. Missing Data in the “Location” Columns
  12. Next Steps
  13. Takeaways
