In this data cleaning project, you will bring together everything you've learned in this course to practice your data cleaning and analysis skills and learn to use R Notebooks as you explore survey data. Building on the skills you've learned so far in our R path, you'll create a data science portfolio project that shows employers your proficiency in both data cleaning and data visualization in R, helping you prepare to land your first job in data!
You will use data visualization and correlation analysis to explore parent, student, and teacher perceptions of New York City schools. You will start by importing, simplifying, and reshaping a large dataset. Once you have combined the data into a single data frame for analysis, you will look for interesting correlations, examine relationships using scatter plots, and compare student, parent, and teacher perceptions of the schools to identify possible factors that influence average SAT scores.
While completing this data cleaning project, you will continue to work with demographic and test score data from the New York City Department of Education. You'll also incorporate some additional data into your analysis: responses to surveys designed to gauge parent, student, and teacher perceptions of the quality of New York City schools.
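To give a feel for the workflow described above, here is a minimal sketch in R, assuming the dplyr, tidyr, and ggplot2 packages are installed. The data, the column names (`school_id`, `avg_sat_score`, `safety_parent`, and so on), and the values are all invented for illustration; the real project data will look different.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical toy data: every column name and value below is made up
# for illustration and does not come from the actual NYC datasets.
scores <- tibble(
  school_id = c("A", "B", "C", "D"),
  avg_sat_score = c(1100, 1250, 1400, 1180)
)
survey <- tibble(
  school_id = c("A", "B", "C", "D"),
  safety_parent = c(7.5, 8.1, 8.8, 7.9),
  safety_student = c(6.2, 7.0, 7.9, 6.5),
  safety_teacher = c(6.8, 7.6, 8.4, 7.1)
)

# Create a single data frame for analysis by joining on the shared key.
combined <- inner_join(scores, survey, by = "school_id")

# Correlation matrix between the SAT score and each perception score.
cor_mat <- combined %>%
  select(avg_sat_score, starts_with("safety_")) %>%
  cor(use = "pairwise.complete.obs")

# Scatter plot of one perception score against the SAT score.
sat_plot <- ggplot(combined, aes(x = safety_student, y = avg_sat_score)) +
  geom_point()

# Reshape from wide to long so that respondent type becomes a variable,
# which makes it easy to compare parent, student, and teacher responses.
long <- combined %>%
  pivot_longer(
    cols = starts_with("safety_"),
    names_to = "respondent",
    names_prefix = "safety_",
    values_to = "safety_score"
  )

# Average perception score by respondent type.
by_respondent <- long %>%
  group_by(respondent) %>%
  summarize(mean_score = mean(safety_score))
```

Printing `cor_mat`, `sat_plot`, or `by_respondent` in an R Notebook chunk displays each result inline, which is the style of analysis you will build up step by step in this project.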
To complete this project, we highly recommend downloading R and working in R Notebooks so you can follow along with the instructions.
1. Cleaning and Analyzing Data: Show Off Your Skills and Start Building a Portfolio
2. Introducing R Notebooks: Share Your Projects With the World!
3. New York City Schools Survey Data
4. Simplifying the Data Frames
5. Creating a Single Data Frame for Analysis
6. Look for Interesting Correlations and Examine Relationships Using Scatter Plots
7. Differences in Student, Parent, and Teacher Perceptions: Reshape the Data
8. Next Steps