Data Cleaning With R

Learn to simplify data frames, change data types, and create new variables.


  • Learn to identify data cleaning needs prior to analysis.
  • Learn to simplify data frames to contain only necessary variables and observations.
  • Learn to change data types of multiple variables at once.
  • Learn to create new variables by calculating summary statistics from existing variables.
  • Learn to use functionals to check for duplicated observations.

Mission Outline

1. The Importance of Data Cleaning
2. Cleaning the New York City Schools Data
3. SAT Data: Changing Data Types and Creating New Variables
4. AP Exam Data: Changing Data Types and Creating a New Variable
5. Class Size Data: Simplifying the Data Frame
6. Class Size Data: Calculating School Averages
7. Class Size Data: Creating a Key Using String Manipulation
8. Graduation Data: Simplifying the Data Frame
9. Demographics Data: Simplifying the Data Frame
10. High School Directory: Simplifying the Data Frame
11. Confirm That Data Frames Are Prepared for Joining
12. Removing Duplicate Rows
13. Next Steps
14. Takeaways

Course Info:

Data Cleaning in R


The average completion time for this course is 10 hours.

This course is free and includes 4 missions and 1 guided project. It is the fourth course in the Data Analyst in R path.


