Introduction to pandas
In the lessons on NumPy, you explored how the NumPy library makes working with data easier. Because you can easily work across multiple dimensions, your code is a lot easier to understand. By using vectorized operations instead of loops, your code also runs faster on larger data sets.
NumPy provides fundamental structures and tools that make working with data easier, but it has limitations that reduce its usefulness for data analysis. For that reason, data scientists often use a different library called pandas, which is, in many ways, an extension of NumPy.
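As a quick refresher, here is a minimal sketch of the difference between a loop and a vectorized operation. The array values and variable names are made up for illustration.

```python
import numpy as np

# Hypothetical sample values, used only for this illustration.
values = np.array([1.0, 2.0, 3.0, 4.0])

# Loop version: handle one element at a time in Python.
doubled_loop = []
for v in values:
    doubled_loop.append(v * 2)

# Vectorized version: one expression applied to the whole array.
doubled_vec = values * 2

print(doubled_vec)  # [2. 4. 6. 8.]
```

Both versions produce the same numbers, but the vectorized form is shorter and hands the element-by-element work to NumPy's compiled code.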
The underlying code for pandas uses the NumPy library extensively, which means the concepts you’ve been learning will come in handy as you begin to learn more about pandas. But pandas has some powerful features that make it particularly great for data analysis.
In this introductory pandas lesson, you’ll learn about and work with a data structure known as a pandas DataFrame. A DataFrame is the pandas equivalent of a NumPy 2D array, with two key differences: a DataFrame can contain columns with different data types, and its axis labels can be strings rather than only integer positions. You will also learn about the pandas Series, how to find the dimensions of a DataFrame using the shape attribute, and how to select items from a Series.
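The sketch below illustrates these ideas on a tiny DataFrame. The company names, column names, and numbers are invented placeholders, not values from the lesson's data set.

```python
import pandas as pd

# Made-up toy data for illustration only.
companies = pd.DataFrame(
    {
        "country": ["USA", "China", "Japan"],  # string column
        "revenues": [100.0, 80.5, 60.25],      # float column
        "employees": [1000, 2500, 400],        # integer column
    },
    index=["Company A", "Company B", "Company C"],  # string row labels
)

# shape is an attribute (no parentheses): (number of rows, number of columns)
print(companies.shape)  # (3, 3)

# Selecting a single column by its label returns a Series.
revenues = companies["revenues"]
print(type(revenues))

# Select one item from the Series by its label.
print(revenues["Company B"])  # 80.5
```

Note that each column keeps its own data type, and both rows and columns are addressed by string labels instead of integer positions.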
As you learn pandas, you’ll work with a data set from Fortune magazine’s 2017 Global 500 list, which ranks the top 500 corporations worldwide by revenue.
- Learn how NumPy and pandas work together.
- Learn how to select and assign data in pandas using index labels.
- Learn how to use pandas to analyze data.
- Understanding pandas and NumPy
- Introducing DataFrames
- Selecting Columns From a DataFrame by Label
- Column Selection Shortcuts
- Selecting Items From a Series by Label
- Selecting Rows From a DataFrame by Label
- Series and DataFrame Describe Methods
- More Data Exploration Methods
- Assignment with pandas
- Using Boolean Indexing with pandas Objects
- Using Boolean Arrays to Assign Values
- Challenge: Top Performers by Country
- Next Steps