Data Cleaning With R

Learn to simplify data frames, change data types, and create new variables.


  • Learn to identify data cleaning needs prior to analysis.
  • Learn to simplify data frames to contain only necessary variables and observations.
  • Learn to change data types of multiple variables at once.
  • Learn to create new variables by calculating summary statistics from existing variables.
  • Learn to use functionals to check for duplicated observations.
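
The objectives above can be sketched in a few lines of base R. This is only an illustrative sketch: the toy data frame, its column names, and the choice of base R functions (rather than whatever packages the course itself teaches) are all assumptions, not taken from the course material.

```r
# Hypothetical toy data; names and values are illustrative only.
sat <- data.frame(
  DBN = c("01M292", "01M448", "01M448"),
  school_name = c("School A", "School B", "School B"),
  math = c("404", "423", "423"),
  reading = c("355", "383", "383"),
  writing = c("363", "366", "366"),
  stringsAsFactors = FALSE
)

# Simplify the data frame: keep only the variables needed for analysis.
sat <- sat[, c("DBN", "math", "reading", "writing")]

# Change the data type of several variables at once.
score_cols <- c("math", "reading", "writing")
sat[score_cols] <- lapply(sat[score_cols], as.numeric)

# Create a new variable by summarizing existing ones (row-wise mean score).
sat$avg_score <- rowMeans(sat[score_cols])

# Use a functional (vapply) to count duplicated keys in each data frame.
frames <- list(sat = sat)
dup_counts <- vapply(frames, function(df) sum(duplicated(df$DBN)), integer(1))

# Remove duplicate rows.
sat_unique <- sat[!duplicated(sat), ]
```

Putting the data frames in a named list keeps the duplicate check to a single call however many data sets are involved, which is the main reason to reach for a functional here.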

Mission Outline

1. The Importance of Data Cleaning
2. Cleaning the New York City Schools Data
3. SAT Data: Changing Data Types and Creating New Variables
4. AP Exam Data: Changing Data Types and Creating a New Variable
5. Class Size Data: Simplifying the Data Frame
6. Class Size Data: Calculating School Averages
7. Class Size Data: Creating a Key Using String Manipulation
8. Graduation Data: Simplifying the Data Frame
9. Demographics Data: Simplifying the Data Frame
10. High School Directory: Simplifying the Data Frame
11. Confirming That Data Frames Are Prepared for Joining
12. Removing Duplicate Rows
13. Next Steps
14. Takeaways


Course Info:


The median completion time for this course is 8.05 hours.

This course is free and includes four missions and one guided project. It is the fourth course in the Data Analyst in R path.

