Exploring Data with Pandas: Fundamentals
In the last mission, you learned the basics of the pandas library. We explored the primary data structure in pandas, the DataFrame, and learned some of the ways pandas makes working with data easier than NumPy:
- Axis values in DataFrames can have string labels, not just numeric ones, which makes selecting data much easier.
- DataFrames can contain columns with multiple data types, including integer, float, and string.
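To make these two points concrete, here is a minimal sketch. The company names and numbers are hypothetical placeholders, not values from the Fortune data set, and the column names are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical toy data, not taken from the Fortune Global 500 data set.
df = pd.DataFrame(
    {
        "revenues": [100, 250, 175],            # integer column
        "profit_margin": [0.12, 0.08, 0.15],    # float column
        "country": ["USA", "China", "Japan"],   # string column
    },
    index=["Company A", "Company B", "Company C"],  # string row labels
)

# Select a single value using a row label and a column label.
company_b_revenue = df.loc["Company B", "revenues"]
print(company_b_revenue)

# Each column keeps its own data type.
print(df.dtypes)
```

Selecting with labels like "Company B" and "revenues" reads more clearly than selecting by integer position, which is the main convenience over a plain NumPy array.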
In this mission, you'll learn another way pandas makes working with data easier. It has many built-in tools, such as the .loc indexer, as well as methods and functions for common exploration and analysis tasks. As you learn these, you'll also explore how pandas uses many of the concepts we learned in the NumPy missions, including vectorized operations and boolean indexing.
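As a rough preview of those ideas, the sketch below applies a vectorized operation, a built-in method, and boolean indexing to two small Series. The labels and numbers are hypothetical and are not taken from the mission's data set.

```python
import pandas as pd

# Hypothetical values used only for illustration.
labels = ["Company A", "Company B", "Company C"]
revenues = pd.Series([100, 250, 175], index=labels)
profits = pd.Series([12, 20, 35], index=labels)

# Vectorized operation: subtraction is applied element by element,
# with values matched on their index labels.
costs = revenues - profits

# Built-in method: find the largest value without writing a loop.
largest_revenue = revenues.max()

# Boolean indexing: the comparison produces a boolean Series,
# which then selects only the rows where it is True.
high_revenue = revenues[revenues > 150]

print(costs)
print(largest_revenue)
print(high_revenue)
```

The comparison `revenues > 150` works the same way a comparison on a NumPy array does, except that the resulting boolean Series keeps the index labels.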
As in the previous mission, you'll be working with a data set from Fortune magazine's 2017 Global 500 list, which ranks the top 500 corporations worldwide by revenue. At the end of the mission, you will calculate a specific statistic for each of the three most common countries in the data set.
As with every mission at Dataquest, you'll be given an opportunity to practice each concept using our code editor, which has built-in answer checking to ensure that you've mastered a concept before moving on to the next.
1. Introduction to the Data
2. Vectorized Operations
3. Series Data Exploration Methods
4. Series Describe Method
5. Method Chaining
6. DataFrame Exploration Methods
7. DataFrame Describe Method
8. Assignment with pandas
9. Using Boolean Indexing with pandas Objects
10. Using Boolean Arrays to Assign Values
11. Creating New Columns
12. Challenge: Top Performers by Country
13. Next Steps