Analyzing NYC High School Data

Over the last three lessons, you explored relationships between SAT scores and demographic factors in New York City public schools by combining multiple datasets into a single, clean pandas dataframe.

In the last lesson, you began performing some analysis, and you’ll extend that analysis in this guided project. You’ll work with the data you’ve used throughout this course and play the role of a data analyst, deciding for yourself how to determine which demographic factors (such as race, income, and gender) influence a student’s performance on the SAT.

Working on guided projects gives you hands-on experience with real-world examples, which also means they’ll be more challenging than lessons. However, keep in mind that you now have more tools for cleaning and transforming data than you did at the beginning of this data cleaning project walkthrough course.

As with all guided projects, we encourage you to experiment and extend your project, taking it in unique directions to make it a more compelling addition to your portfolio!

We also recommend creating a GitHub repository and placing this project there. It will help other people see your work, including employers. As you start to put multiple projects on GitHub, you’ll have the beginnings of a strong portfolio.


  • Learn to generate scatter plots to compare columns.
  • Learn to generate maps using the basemap library.
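
Comparing two columns with a scatter plot is usually paired with a correlation coefficient, which puts a number on the trend the plot shows. The sketch below computes Pearson's r using only the Python standard library. The function name pearson_r, the column names, and all of the values are made up for illustration and are not taken from the NYC datasets.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / sqrt(var_x * var_y)


# Hypothetical values for five schools (illustration only, not real data).
safety_score = [6.1, 6.8, 7.2, 7.9, 8.4]
sat_score = [1120, 1180, 1210, 1300, 1350]

r = pearson_r(safety_score, sat_score)
print(f"Pearson r = {r:.2f}")
```

A value of r near 1 or -1 suggests a strong linear relationship worth plotting, while a value near 0 suggests little linear relationship. With a pandas dataframe, the built-in corr method gives the same kind of measure across columns.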

Lesson Outline

  1. Introduction
  2. Exploring Safety and SAT Scores
  3. Exploring Race and SAT Scores
  4. Exploring Gender and SAT Scores
  5. Exploring AP Scores vs. SAT Scores
  6. Next Steps
