Guided Project: NYC Schools Perceptions

In this guided project, you will put together everything you've learned in this course to practice your data cleaning and analysis skills, and you will learn to use R Notebooks as you explore survey data. Building on the skills you've learned so far in our R path, you'll create a data science portfolio project that you can show to employers to demonstrate proficiency in both data cleaning and data visualization in R, helping you prepare to land your first job in data!

You will use data visualization and correlation analysis to explore parent, student, and teacher perceptions of New York City schools. You will start by importing, simplifying, and reshaping a large dataset. Once you have combined the data into a single dataframe for your analysis, you will look for interesting correlations, examine relationships using scatter plots, and compare student, parent, and teacher perceptions of the schools to identify factors that may be related to average SAT scores.

Throughout the project, you will continue to work with demographic and test score data from the New York City Department of Education. You'll also incorporate some additional data into your analysis: responses to surveys designed to gauge parent, student, and teacher perceptions of the quality of New York City schools.

To complete this project, we highly recommend downloading R and working in R Notebooks so you can follow along with the instructions.


  • Learn about R Notebooks and how you can use them to showcase your work.
  • Learn to import, simplify, and reshape a large survey data set.
  • Learn to interpret metadata to guide your data cleaning decisions.
  • Use data visualization and correlation analysis to explore parents', students', and teachers' perceptions of NYC schools.

Mission Outline

1. Cleaning and Analyzing Data: Show Off Your Skills and Start Building a Portfolio
2. Introducing R Notebooks: Share Your Projects With the World!
3. New York City Schools Survey Data
4. Simplifying the Data Frames
5. Creating a Single Data Frame for Analysis
6. Look for Interesting Correlations and Examine Relationships Using Scatter Plots
7. Differences in Student, Parent, and Teacher Perceptions: Reshape the Data
8. Next Steps


Course Info:


The median completion time for this course is 8.05 hours.

This course includes four missions and one guided project. It is the fourth course in the Data Analyst in R path.

