Data Cleaning Walkthrough: Analyzing and Visualizing the Data

In this lesson, we’ll discover correlations, create plots, and then make maps. The first thing we’ll do is look for correlations between each of the columns and sat_score. This will help us determine which columns are worth plotting or investigating further. Afterward, we’ll perform more analysis and make maps using the columns we’ve identified.
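As a rough sketch of that first step, the snippet below computes Pearson's r between every numeric column and sat_score in pandas. The small DataFrame is a hypothetical stand-in for the real data set: only sat_score is named in this lesson, while total_enrollment, ell_percent, and all of the values are assumptions made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the lesson's data; only `sat_score` comes from
# the lesson text. The other column names and every value are invented.
combined = pd.DataFrame({
    "sat_score": [1100, 1250, 1400, 1550, 1700],
    "total_enrollment": [300, 450, 800, 1200, 2000],
    "ell_percent": [40.0, 25.0, 15.0, 8.0, 2.0],
})

# Pairwise Pearson correlations across numeric columns, then keep only
# the column that relates everything to sat_score.
correlations = combined.select_dtypes(include="number").corr()["sat_score"]

# Drop the trivial self-correlation and sort from most negative to most positive.
correlations = correlations.drop("sat_score").sort_values()
print(correlations)
```

Columns whose r value sits close to 1 or -1 are the natural candidates for plotting next; values near 0 suggest little linear relationship with sat_score.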

At many points in your career, you’ll need to be able to build complete, end-to-end data science projects on your own. Data science projects usually consist of one of two things:

  • An exploration and analysis of a set of data. One example might involve analyzing donors to political campaigns, creating a plot, and then sharing an analysis of the plot with others.
  • An operational system that generates predictions based on data that updates continually. For example, an algorithm might pull in daily stock ticker data and predict which stock prices will rise and fall.

For this particular end-to-end data science project, we began investigating possible relationships between SAT scores and demographics. In order to do this, we acquired several data sets containing information about New York City public schools. We cleaned them, then combined them into a single data set named combined that we’re now ready to analyze and visualize.
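To give a feel for the kind of analysis a single combined data set makes possible, here is a minimal sketch of aggregating school-level rows up to the district level. The column names school_dist and ell_percent, along with every value shown, are assumptions for illustration and may not match the real data set.

```python
import pandas as pd

# Hypothetical school-level rows; the column names other than `sat_score`
# and all of the values are invented for this example.
combined = pd.DataFrame({
    "school_dist": ["01", "01", "02", "02", "03"],
    "sat_score": [1200, 1300, 1500, 1600, 1100],
    "ell_percent": [20.0, 30.0, 5.0, 7.0, 45.0],
})

# Average each numeric column within a district, then turn the district
# label back into an ordinary column so the result is easy to plot or map.
districts = (
    combined.groupby("school_dist")[["sat_score", "ell_percent"]]
    .mean()
    .reset_index()
)
print(districts)
```

Averaging by district trades detail for readability: a map or plot with one point per district is easier to interpret than one with a point for every school.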

You’ll find the ability to create data science projects useful in several different contexts:

  • Projects will help you build a portfolio, which is critical to finding a job as a data analyst or scientist.
  • Working on projects will help you learn new skills and reinforce existing concepts.
  • Most “real-world” data science and analysis work consists of developing internal projects.
  • Projects allow you to investigate interesting phenomena and satisfy your curiosity.


Objectives

  • Learn to compute correlations in pandas.
  • Learn to map schools using basemap.

Lesson Outline

  1. Introduction
  2. Finding Correlations With the r Value
  3. Plotting Enrollment With the plot() Accessor
  4. Exploring Schools With Low SAT Scores and Enrollment
  5. Plotting Language Learning Percentage
  6. Mapping the Schools With Basemap
  7. Plotting Out Statistics
  8. Calculating District-Level Statistics
  9. Plotting Percent of English Learners by District
  10. Next Steps
  11. Takeaways
