Exploring Data with Pandas: Fundamentals

In the last mission, you learned the basics of the pandas library. We explored the primary data structure in pandas, the DataFrame, and learned some of the ways pandas makes working with data easier than NumPy:

  • Axis values in DataFrames can have string labels, not just numeric ones, which makes selecting data much easier.
  • DataFrames can contain columns with multiple data types, including integer, float, and string.
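
Both points can be seen in a minimal sketch. The table below is made up for illustration (the company names and figures are hypothetical, not taken from the Global 500 data set):

```python
import pandas as pd

# Hypothetical toy data: labels and numbers are invented for illustration.
df = pd.DataFrame(
    {
        "rank": [1, 2, 3],                     # integer column
        "revenues": [500.5, 320.0, 275.25],    # float column
        "country": ["USA", "China", "Japan"],  # string column
    },
    index=["Company A", "Company B", "Company C"],  # string row labels
)

# Select a single value using a row label and a column label.
revenue_b = df.loc["Company B", "revenues"]

# Each column keeps its own data type.
dtypes = df.dtypes
```

Printing `dtypes` shows one entry per column, so a single DataFrame can hold integers, floats, and strings side by side.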

In this mission, you'll learn another way pandas makes working with data easier. It has many built-in methods and indexers, such as .describe(), .iloc[], and .loc[], as well as functions for common exploration and analysis tasks. As you learn these, you'll also explore how pandas uses many of the concepts we learned in the NumPy missions, including vectorized operations and boolean indexing.
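
As a rough preview of the tools named above, here is a small sketch on invented data. The column names and values are assumptions chosen for illustration and do not come from the mission's data set:

```python
import pandas as pd

# Hypothetical toy data, invented for this example.
df = pd.DataFrame(
    {
        "revenues": [500.0, 300.0, 200.0, 100.0],
        "profits": [50.0, 45.0, -10.0, 20.0],
        "country": ["USA", "China", "USA", "Japan"],
    },
    index=["A", "B", "C", "D"],
)

# .describe() returns summary statistics (count, mean, std, min, quartiles, max).
summary = df["revenues"].describe()

# .loc[] selects by label, .iloc[] selects by integer position.
by_label = df.loc["B", "profits"]
by_position = df.iloc[0, 0]

# Vectorized operation: divide two columns element by element, no loop needed.
margin = df["profits"] / df["revenues"]

# Boolean indexing: keep only the rows where the condition is True.
usa = df[df["country"] == "USA"]
```

The last two lines mirror what NumPy does with arrays, except that the results keep their row labels.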

As in the previous mission, you’ll be working with a data set from Fortune magazine's 2017 Global 500 list, which ranks the top 500 corporations worldwide by revenue. At the end of the mission, you will calculate a specific statistic for each of the three most common countries in the data set.

As with every mission at Dataquest, you'll be given an opportunity to practice each concept using our code editor with built-in answer checking to ensure that you've mastered a concept before moving on to the next.


In this mission, you'll learn:

  • How to use common methods for exploring data.
  • How to assign data in pandas using index labels.
  • How to use pandas to analyze data.
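
The second goal in the list above, assigning data using index labels, might look like the following sketch. The data, labels, and replacement values are hypothetical and only meant to show the pattern:

```python
import pandas as pd

# Hypothetical toy data, invented for this example.
df = pd.DataFrame(
    {
        "sector": ["Retail", "Energy", "Motor Vehicles"],
        "revenues": [485.0, 315.0, 254.0],
        "profits": [13.0, 9.0, 16.0],
    },
    index=["Company A", "Company B", "Company C"],
)

# Assign a single value by row label and column label.
df.loc["Company B", "sector"] = "Oil & Gas"

# Assign to every row that matches a boolean condition.
is_motor = df["sector"] == "Motor Vehicles"
df.loc[is_motor, "sector"] = "Motor Vehicles & Parts"

# Create a new column by assigning to a label that does not exist yet.
df["margin"] = df["profits"] / df["revenues"]
```

Using .loc[] for both the selection and the assignment keeps the operation in a single step, which avoids modifying a temporary copy by accident.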

Mission Outline

1. Introduction to the Data
2. Vectorized Operations
3. Series Data Exploration Methods
4. Series Describe Method
5. Method Chaining
6. DataFrame Exploration Methods
7. DataFrame Describe Method
8. Assignment with pandas
9. Using Boolean Indexing with pandas Objects
10. Using Boolean Arrays to Assign Values
11. Creating New Columns
12. Challenge: Top Performers by Country
13. Next Steps
14. Takeaways


Course Info:


The median completion time for this course is 6.77 hours.

This course includes five missions and one guided project. It is the third course in the Data Analyst in Python path and Data Scientist in Python path.

