Tutorial: Reset Index in Pandas
In this tutorial, we'll discuss the reset_index() pandas method, why we may need to reset the index of a DataFrame in pandas, and how we can apply and tune this method. We'll also consider a small use case of resetting the DataFrame index after dropping missing values.
To practice DataFrame index resetting, we'll use a sample of a Kaggle dataset on Animal Shelter Analytics.
What Is reset_index() in Pandas?
If we read a CSV file using the read_csv() pandas method without specifying any index, the resulting DataFrame will have a default integer-based index starting from 0 for the first row and increasing by 1 for each subsequent row:
import pandas as pd
import numpy as np
df = pd.read_csv('Austin_Animal_Center_Intakes.csv').head()
df
| | Animal ID | Name | DateTime | MonthYear | Found Location | Intake Type | Intake Condition | Animal Type | Sex upon Intake | Age upon Intake | Breed | Color |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | A786884 | *Brock | 01/03/2019 04:19:00 PM | 01/03/2019 04:19:00 PM | 2501 Magin Meadow Dr in Austin (TX) | Stray | Normal | Dog | Neutered Male | 2 years | Beagle Mix | Tricolor |
1 | A706918 | Belle | 07/05/2015 12:59:00 PM | 07/05/2015 12:59:00 PM | 9409 Bluegrass Dr in Austin (TX) | Stray | Normal | Dog | Spayed Female | 8 years | English Springer Spaniel | White/Liver |
2 | A724273 | Runster | 04/14/2016 06:43:00 PM | 04/14/2016 06:43:00 PM | 2818 Palomino Trail in Austin (TX) | Stray | Normal | Dog | Intact Male | 11 months | Basenji Mix | Sable/White |
3 | A665644 | NaN | 10/21/2013 07:59:00 AM | 10/21/2013 07:59:00 AM | Austin (TX) | Stray | Sick | Cat | Intact Female | 4 weeks | Domestic Shorthair Mix | Calico |
4 | A682524 | Rio | 06/29/2014 10:38:00 AM | 06/29/2014 10:38:00 AM | 800 Grove Blvd in Austin (TX) | Stray | Normal | Dog | Neutered Male | 4 years | Doberman Pinsch/Australian Cattle Dog | Tan/Gray |
In some cases, we may want to have more meaningful row labels, so we'll select one of the columns of the DataFrame to be the DataFrame index. We can do it directly when applying the read_csv() pandas method using the index_col parameter:
df = pd.read_csv('Austin_Animal_Center_Intakes.csv', index_col='Animal ID').head()
df
Animal ID | Name | DateTime | MonthYear | Found Location | Intake Type | Intake Condition | Animal Type | Sex upon Intake | Age upon Intake | Breed | Color |
---|---|---|---|---|---|---|---|---|---|---|---|
A786884 | *Brock | 01/03/2019 04:19:00 PM | 01/03/2019 04:19:00 PM | 2501 Magin Meadow Dr in Austin (TX) | Stray | Normal | Dog | Neutered Male | 2 years | Beagle Mix | Tricolor |
A706918 | Belle | 07/05/2015 12:59:00 PM | 07/05/2015 12:59:00 PM | 9409 Bluegrass Dr in Austin (TX) | Stray | Normal | Dog | Spayed Female | 8 years | English Springer Spaniel | White/Liver |
A724273 | Runster | 04/14/2016 06:43:00 PM | 04/14/2016 06:43:00 PM | 2818 Palomino Trail in Austin (TX) | Stray | Normal | Dog | Intact Male | 11 months | Basenji Mix | Sable/White |
A665644 | NaN | 10/21/2013 07:59:00 AM | 10/21/2013 07:59:00 AM | Austin (TX) | Stray | Sick | Cat | Intact Female | 4 weeks | Domestic Shorthair Mix | Calico |
A682524 | Rio | 06/29/2014 10:38:00 AM | 06/29/2014 10:38:00 AM | 800 Grove Blvd in Austin (TX) | Stray | Normal | Dog | Neutered Male | 4 years | Doberman Pinsch/Australian Cattle Dog | Tan/Gray |
Alternatively, we can use the set_index() method to set any column of a DataFrame as a DataFrame index:
df = pd.read_csv('Austin_Animal_Center_Intakes.csv').head()
df.set_index('Animal ID', inplace=True)
df
Animal ID | Name | DateTime | MonthYear | Found Location | Intake Type | Intake Condition | Animal Type | Sex upon Intake | Age upon Intake | Breed | Color |
---|---|---|---|---|---|---|---|---|---|---|---|
A786884 | *Brock | 01/03/2019 04:19:00 PM | 01/03/2019 04:19:00 PM | 2501 Magin Meadow Dr in Austin (TX) | Stray | Normal | Dog | Neutered Male | 2 years | Beagle Mix | Tricolor |
A706918 | Belle | 07/05/2015 12:59:00 PM | 07/05/2015 12:59:00 PM | 9409 Bluegrass Dr in Austin (TX) | Stray | Normal | Dog | Spayed Female | 8 years | English Springer Spaniel | White/Liver |
A724273 | Runster | 04/14/2016 06:43:00 PM | 04/14/2016 06:43:00 PM | 2818 Palomino Trail in Austin (TX) | Stray | Normal | Dog | Intact Male | 11 months | Basenji Mix | Sable/White |
A665644 | NaN | 10/21/2013 07:59:00 AM | 10/21/2013 07:59:00 AM | Austin (TX) | Stray | Sick | Cat | Intact Female | 4 weeks | Domestic Shorthair Mix | Calico |
A682524 | Rio | 06/29/2014 10:38:00 AM | 06/29/2014 10:38:00 AM | 800 Grove Blvd in Austin (TX) | Stray | Normal | Dog | Neutered Male | 4 years | Doberman Pinsch/Australian Cattle Dog | Tan/Gray |
What if, at some point, we need to restore the default numeric index? This is where the reset_index() pandas method comes in:
df.reset_index()
| | Animal ID | Name | DateTime | MonthYear | Found Location | Intake Type | Intake Condition | Animal Type | Sex upon Intake | Age upon Intake | Breed | Color |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | A786884 | *Brock | 01/03/2019 04:19:00 PM | 01/03/2019 04:19:00 PM | 2501 Magin Meadow Dr in Austin (TX) | Stray | Normal | Dog | Neutered Male | 2 years | Beagle Mix | Tricolor |
1 | A706918 | Belle | 07/05/2015 12:59:00 PM | 07/05/2015 12:59:00 PM | 9409 Bluegrass Dr in Austin (TX) | Stray | Normal | Dog | Spayed Female | 8 years | English Springer Spaniel | White/Liver |
2 | A724273 | Runster | 04/14/2016 06:43:00 PM | 04/14/2016 06:43:00 PM | 2818 Palomino Trail in Austin (TX) | Stray | Normal | Dog | Intact Male | 11 months | Basenji Mix | Sable/White |
3 | A665644 | NaN | 10/21/2013 07:59:00 AM | 10/21/2013 07:59:00 AM | Austin (TX) | Stray | Sick | Cat | Intact Female | 4 weeks | Domestic Shorthair Mix | Calico |
4 | A682524 | Rio | 06/29/2014 10:38:00 AM | 06/29/2014 10:38:00 AM | 800 Grove Blvd in Austin (TX) | Stray | Normal | Dog | Neutered Male | 4 years | Doberman Pinsch/Australian Cattle Dog | Tan/Gray |
The default behavior of this method includes replacing the existing DataFrame index with the default integer-based one and converting the old index into a new column with the same name as the old index (or with the name index, if it didn't have any name). Also, by default, the reset_index() method removes all levels from a MultiIndex (as we will see later) and doesn't affect the original DataFrame; instead, it creates a new one.
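The naming rule described above can be checked on a tiny hand-built DataFrame. This is only a sketch: the variable names and the two rows are illustrative stand-ins, not a slice loaded from the CSV file.

```python
import pandas as pd

# Hypothetical miniature of the shelter data (illustrative values only).
animals = pd.DataFrame(
    {"Name": ["Belle", "Rio"], "Animal Type": ["Dog", "Dog"]},
    index=["A706918", "A682524"],  # the index has no name yet
)

# An unnamed index is moved into a column called "index".
unnamed_reset = animals.reset_index()
print(unnamed_reset.columns.tolist())  # ['index', 'Name', 'Animal Type']

# Once the index has a name, the new column takes that name instead.
animals.index.name = "Animal ID"
named_reset = animals.reset_index()
print(named_reset.columns.tolist())  # ['Animal ID', 'Name', 'Animal Type']

# The original DataFrame keeps its labelled index in both cases.
print(animals.index.tolist())  # ['A706918', 'A682524']
```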
When to Use the reset_index() Method
The reset_index() pandas method resets the DataFrame index to the default numeric index, and it is particularly helpful in the following cases:
- When performing data wrangling, in particular preprocessing operations such as filtering data or dropping missing values, which result in a smaller DataFrame whose numeric index is no longer continuous (we'll explore a use case at the end of this tutorial)
- When the index is supposed to be treated as a common DataFrame column
- When the index labels don't provide any valuable information about the data
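As a rough illustration of the first case, filtering rows with a boolean mask leaves gaps in the default index. The sketch below uses a made-up four-row DataFrame rather than the real dataset, and it passes drop=True (covered in the parameter section of this tutorial) so that the old labels are discarded.

```python
import pandas as pd

# Hypothetical intake records (illustrative values only).
intakes = pd.DataFrame(
    {
        "Animal Type": ["Dog", "Cat", "Dog", "Cat"],
        "Intake Condition": ["Normal", "Sick", "Normal", "Normal"],
    }
)

# Keeping only the dogs preserves the original row labels, so gaps appear.
dogs = intakes[intakes["Animal Type"] == "Dog"]
print(dogs.index.tolist())  # [0, 2]

# Resetting the index renumbers the remaining rows from 0.
dogs_renumbered = dogs.reset_index(drop=True)
print(dogs_renumbered.index.tolist())  # [0, 1]
```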
How to Tune the reset_index() Method
Earlier, we saw how the reset_index() pandas method works when we don't pass any arguments to it. If necessary, we can change this default behavior by tuning various parameters of the method. Let's look at the most useful ones: level, drop, and inplace.
level
This parameter takes an integer, string, tuple, or list, and is mainly useful for DataFrames with a MultiIndex, like this one:
df_multiindex = pd.read_csv('Austin_Animal_Center_Intakes.csv', index_col=['Animal ID', 'Name']).head()
df_multiindex
Animal ID | Name | DateTime | MonthYear | Found Location | Intake Type | Intake Condition | Animal Type | Sex upon Intake | Age upon Intake | Breed | Color |
---|---|---|---|---|---|---|---|---|---|---|---|
A786884 | *Brock | 01/03/2019 04:19:00 PM | 01/03/2019 04:19:00 PM | 2501 Magin Meadow Dr in Austin (TX) | Stray | Normal | Dog | Neutered Male | 2 years | Beagle Mix | Tricolor |
A706918 | Belle | 07/05/2015 12:59:00 PM | 07/05/2015 12:59:00 PM | 9409 Bluegrass Dr in Austin (TX) | Stray | Normal | Dog | Spayed Female | 8 years | English Springer Spaniel | White/Liver |
A724273 | Runster | 04/14/2016 06:43:00 PM | 04/14/2016 06:43:00 PM | 2818 Palomino Trail in Austin (TX) | Stray | Normal | Dog | Intact Male | 11 months | Basenji Mix | Sable/White |
A665644 | NaN | 10/21/2013 07:59:00 AM | 10/21/2013 07:59:00 AM | Austin (TX) | Stray | Sick | Cat | Intact Female | 4 weeks | Domestic Shorthair Mix | Calico |
A682524 | Rio | 06/29/2014 10:38:00 AM | 06/29/2014 10:38:00 AM | 800 Grove Blvd in Austin (TX) | Stray | Normal | Dog | Neutered Male | 4 years | Doberman Pinsch/Australian Cattle Dog | Tan/Gray |
Indeed, if we now check the index of the DataFrame above, we'll see that it isn't a common DataFrame index but a MultiIndex object:
df_multiindex.index
MultiIndex([('A786884', '*Brock'),
('A706918', 'Belle'),
('A724273', 'Runster'),
('A665644', nan),
('A682524', 'Rio')],
names=['Animal ID', 'Name'])
By default, the level parameter of the reset_index() pandas method (level=None) removes all levels of a MultiIndex:
df_multiindex.reset_index()
| | Animal ID | Name | DateTime | MonthYear | Found Location | Intake Type | Intake Condition | Animal Type | Sex upon Intake | Age upon Intake | Breed | Color |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | A786884 | *Brock | 01/03/2019 04:19:00 PM | 01/03/2019 04:19:00 PM | 2501 Magin Meadow Dr in Austin (TX) | Stray | Normal | Dog | Neutered Male | 2 years | Beagle Mix | Tricolor |
1 | A706918 | Belle | 07/05/2015 12:59:00 PM | 07/05/2015 12:59:00 PM | 9409 Bluegrass Dr in Austin (TX) | Stray | Normal | Dog | Spayed Female | 8 years | English Springer Spaniel | White/Liver |
2 | A724273 | Runster | 04/14/2016 06:43:00 PM | 04/14/2016 06:43:00 PM | 2818 Palomino Trail in Austin (TX) | Stray | Normal | Dog | Intact Male | 11 months | Basenji Mix | Sable/White |
3 | A665644 | NaN | 10/21/2013 07:59:00 AM | 10/21/2013 07:59:00 AM | Austin (TX) | Stray | Sick | Cat | Intact Female | 4 weeks | Domestic Shorthair Mix | Calico |
4 | A682524 | Rio | 06/29/2014 10:38:00 AM | 06/29/2014 10:38:00 AM | 800 Grove Blvd in Austin (TX) | Stray | Normal | Dog | Neutered Male | 4 years | Doberman Pinsch/Australian Cattle Dog | Tan/Gray |
We see that both index levels of our DataFrame were converted into common DataFrame columns, while the index was reset to the default integer-based one.
If, instead, we explicitly pass the value for level, this parameter removes the selected levels from the DataFrame index and returns them as common DataFrame columns (unless we opt for dropping this information completely from the DataFrame using the drop parameter). Compare the following operations:
df_multiindex.reset_index(level='Animal ID')
Name | Animal ID | DateTime | MonthYear | Found Location | Intake Type | Intake Condition | Animal Type | Sex upon Intake | Age upon Intake | Breed | Color |
---|---|---|---|---|---|---|---|---|---|---|---|
*Brock | A786884 | 01/03/2019 04:19:00 PM | 01/03/2019 04:19:00 PM | 2501 Magin Meadow Dr in Austin (TX) | Stray | Normal | Dog | Neutered Male | 2 years | Beagle Mix | Tricolor |
Belle | A706918 | 07/05/2015 12:59:00 PM | 07/05/2015 12:59:00 PM | 9409 Bluegrass Dr in Austin (TX) | Stray | Normal | Dog | Spayed Female | 8 years | English Springer Spaniel | White/Liver |
Runster | A724273 | 04/14/2016 06:43:00 PM | 04/14/2016 06:43:00 PM | 2818 Palomino Trail in Austin (TX) | Stray | Normal | Dog | Intact Male | 11 months | Basenji Mix | Sable/White |
NaN | A665644 | 10/21/2013 07:59:00 AM | 10/21/2013 07:59:00 AM | Austin (TX) | Stray | Sick | Cat | Intact Female | 4 weeks | Domestic Shorthair Mix | Calico |
Rio | A682524 | 06/29/2014 10:38:00 AM | 06/29/2014 10:38:00 AM | 800 Grove Blvd in Austin (TX) | Stray | Normal | Dog | Neutered Male | 4 years | Doberman Pinsch/Australian Cattle Dog | Tan/Gray |
Initially, Animal ID was one of the indices of the DataFrame. After setting the level parameter, it was removed from the index and inserted as a common column called Animal ID.
df_multiindex.reset_index(level='Name')
Animal ID | Name | DateTime | MonthYear | Found Location | Intake Type | Intake Condition | Animal Type | Sex upon Intake | Age upon Intake | Breed | Color |
---|---|---|---|---|---|---|---|---|---|---|---|
A786884 | *Brock | 01/03/2019 04:19:00 PM | 01/03/2019 04:19:00 PM | 2501 Magin Meadow Dr in Austin (TX) | Stray | Normal | Dog | Neutered Male | 2 years | Beagle Mix | Tricolor |
A706918 | Belle | 07/05/2015 12:59:00 PM | 07/05/2015 12:59:00 PM | 9409 Bluegrass Dr in Austin (TX) | Stray | Normal | Dog | Spayed Female | 8 years | English Springer Spaniel | White/Liver |
A724273 | Runster | 04/14/2016 06:43:00 PM | 04/14/2016 06:43:00 PM | 2818 Palomino Trail in Austin (TX) | Stray | Normal | Dog | Intact Male | 11 months | Basenji Mix | Sable/White |
A665644 | NaN | 10/21/2013 07:59:00 AM | 10/21/2013 07:59:00 AM | Austin (TX) | Stray | Sick | Cat | Intact Female | 4 weeks | Domestic Shorthair Mix | Calico |
A682524 | Rio | 06/29/2014 10:38:00 AM | 06/29/2014 10:38:00 AM | 800 Grove Blvd in Austin (TX) | Stray | Normal | Dog | Neutered Male | 4 years | Doberman Pinsch/Australian Cattle Dog | Tan/Gray |
Here, Name was initially one of the indices of the DataFrame. After setting the level parameter, it became a common column called Name.
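Besides a level name, the level parameter also accepts an integer position or a list of levels. The following sketch shows both forms on a small hand-built MultiIndex DataFrame; the rows and variable names are illustrative assumptions, not part of the dataset sample above.

```python
import pandas as pd

# Hypothetical two-level index built from illustrative values.
records = pd.DataFrame(
    {
        "Animal ID": ["A706918", "A682524"],
        "Name": ["Belle", "Rio"],
        "Animal Type": ["Dog", "Dog"],
    }
).set_index(["Animal ID", "Name"])

# Levels can be addressed by position: level 1 is "Name" here.
by_position = records.reset_index(level=1)
print(by_position.index.name)        # Animal ID
print(by_position.columns.tolist())  # ['Name', 'Animal Type']

# A list selects several levels at once; listing every level
# leaves the default integer index behind.
by_list = records.reset_index(level=["Animal ID", "Name"])
print(by_list.columns.tolist())  # ['Animal ID', 'Name', 'Animal Type']
print(by_list.index.tolist())    # [0, 1]
```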
drop
This parameter determines whether to keep the old index as a common DataFrame column after index resetting or drop it completely from the DataFrame. By default (drop=False), the method keeps it, as we have seen in all the previous examples. Otherwise, if we don't want to keep the old index as a column, we can remove it entirely from the DataFrame after index resetting (drop=True):
df
Animal ID | Name | DateTime | MonthYear | Found Location | Intake Type | Intake Condition | Animal Type | Sex upon Intake | Age upon Intake | Breed | Color |
---|---|---|---|---|---|---|---|---|---|---|---|
A786884 | *Brock | 01/03/2019 04:19:00 PM | 01/03/2019 04:19:00 PM | 2501 Magin Meadow Dr in Austin (TX) | Stray | Normal | Dog | Neutered Male | 2 years | Beagle Mix | Tricolor |
A706918 | Belle | 07/05/2015 12:59:00 PM | 07/05/2015 12:59:00 PM | 9409 Bluegrass Dr in Austin (TX) | Stray | Normal | Dog | Spayed Female | 8 years | English Springer Spaniel | White/Liver |
A724273 | Runster | 04/14/2016 06:43:00 PM | 04/14/2016 06:43:00 PM | 2818 Palomino Trail in Austin (TX) | Stray | Normal | Dog | Intact Male | 11 months | Basenji Mix | Sable/White |
A665644 | NaN | 10/21/2013 07:59:00 AM | 10/21/2013 07:59:00 AM | Austin (TX) | Stray | Sick | Cat | Intact Female | 4 weeks | Domestic Shorthair Mix | Calico |
A682524 | Rio | 06/29/2014 10:38:00 AM | 06/29/2014 10:38:00 AM | 800 Grove Blvd in Austin (TX) | Stray | Normal | Dog | Neutered Male | 4 years | Doberman Pinsch/Australian Cattle Dog | Tan/Gray |
df.reset_index(drop=True)
| | Name | DateTime | MonthYear | Found Location | Intake Type | Intake Condition | Animal Type | Sex upon Intake | Age upon Intake | Breed | Color |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | *Brock | 01/03/2019 04:19:00 PM | 01/03/2019 04:19:00 PM | 2501 Magin Meadow Dr in Austin (TX) | Stray | Normal | Dog | Neutered Male | 2 years | Beagle Mix | Tricolor |
1 | Belle | 07/05/2015 12:59:00 PM | 07/05/2015 12:59:00 PM | 9409 Bluegrass Dr in Austin (TX) | Stray | Normal | Dog | Spayed Female | 8 years | English Springer Spaniel | White/Liver |
2 | Runster | 04/14/2016 06:43:00 PM | 04/14/2016 06:43:00 PM | 2818 Palomino Trail in Austin (TX) | Stray | Normal | Dog | Intact Male | 11 months | Basenji Mix | Sable/White |
3 | NaN | 10/21/2013 07:59:00 AM | 10/21/2013 07:59:00 AM | Austin (TX) | Stray | Sick | Cat | Intact Female | 4 weeks | Domestic Shorthair Mix | Calico |
4 | Rio | 06/29/2014 10:38:00 AM | 06/29/2014 10:38:00 AM | 800 Grove Blvd in Austin (TX) | Stray | Normal | Dog | Neutered Male | 4 years | Doberman Pinsch/Australian Cattle Dog | Tan/Gray |
In the DataFrame above, the information contained in the old index was completely removed from the DataFrame.
The drop parameter also works for DataFrames with a MultiIndex, like the one we created earlier:
df_multiindex
Animal ID | Name | DateTime | MonthYear | Found Location | Intake Type | Intake Condition | Animal Type | Sex upon Intake | Age upon Intake | Breed | Color |
---|---|---|---|---|---|---|---|---|---|---|---|
A786884 | *Brock | 01/03/2019 04:19:00 PM | 01/03/2019 04:19:00 PM | 2501 Magin Meadow Dr in Austin (TX) | Stray | Normal | Dog | Neutered Male | 2 years | Beagle Mix | Tricolor |
A706918 | Belle | 07/05/2015 12:59:00 PM | 07/05/2015 12:59:00 PM | 9409 Bluegrass Dr in Austin (TX) | Stray | Normal | Dog | Spayed Female | 8 years | English Springer Spaniel | White/Liver |
A724273 | Runster | 04/14/2016 06:43:00 PM | 04/14/2016 06:43:00 PM | 2818 Palomino Trail in Austin (TX) | Stray | Normal | Dog | Intact Male | 11 months | Basenji Mix | Sable/White |
A665644 | NaN | 10/21/2013 07:59:00 AM | 10/21/2013 07:59:00 AM | Austin (TX) | Stray | Sick | Cat | Intact Female | 4 weeks | Domestic Shorthair Mix | Calico |
A682524 | Rio | 06/29/2014 10:38:00 AM | 06/29/2014 10:38:00 AM | 800 Grove Blvd in Austin (TX) | Stray | Normal | Dog | Neutered Male | 4 years | Doberman Pinsch/Australian Cattle Dog | Tan/Gray |
df_multiindex.reset_index(drop=True)
| | DateTime | MonthYear | Found Location | Intake Type | Intake Condition | Animal Type | Sex upon Intake | Age upon Intake | Breed | Color |
---|---|---|---|---|---|---|---|---|---|---|
0 | 01/03/2019 04:19:00 PM | 01/03/2019 04:19:00 PM | 2501 Magin Meadow Dr in Austin (TX) | Stray | Normal | Dog | Neutered Male | 2 years | Beagle Mix | Tricolor |
1 | 07/05/2015 12:59:00 PM | 07/05/2015 12:59:00 PM | 9409 Bluegrass Dr in Austin (TX) | Stray | Normal | Dog | Spayed Female | 8 years | English Springer Spaniel | White/Liver |
2 | 04/14/2016 06:43:00 PM | 04/14/2016 06:43:00 PM | 2818 Palomino Trail in Austin (TX) | Stray | Normal | Dog | Intact Male | 11 months | Basenji Mix | Sable/White |
3 | 10/21/2013 07:59:00 AM | 10/21/2013 07:59:00 AM | Austin (TX) | Stray | Sick | Cat | Intact Female | 4 weeks | Domestic Shorthair Mix | Calico |
4 | 06/29/2014 10:38:00 AM | 06/29/2014 10:38:00 AM | 800 Grove Blvd in Austin (TX) | Stray | Normal | Dog | Neutered Male | 4 years | Doberman Pinsch/Australian Cattle Dog | Tan/Gray |
Both of the old indices were completely removed from the DataFrame, and the index was reset to default.
Of course, we can combine the drop and level parameters, specifying which of the old indices to remove entirely from the DataFrame:
df_multiindex.reset_index(level='Animal ID', drop=True)
Name | DateTime | MonthYear | Found Location | Intake Type | Intake Condition | Animal Type | Sex upon Intake | Age upon Intake | Breed | Color |
---|---|---|---|---|---|---|---|---|---|---|
*Brock | 01/03/2019 04:19:00 PM | 01/03/2019 04:19:00 PM | 2501 Magin Meadow Dr in Austin (TX) | Stray | Normal | Dog | Neutered Male | 2 years | Beagle Mix | Tricolor |
Belle | 07/05/2015 12:59:00 PM | 07/05/2015 12:59:00 PM | 9409 Bluegrass Dr in Austin (TX) | Stray | Normal | Dog | Spayed Female | 8 years | English Springer Spaniel | White/Liver |
Runster | 04/14/2016 06:43:00 PM | 04/14/2016 06:43:00 PM | 2818 Palomino Trail in Austin (TX) | Stray | Normal | Dog | Intact Male | 11 months | Basenji Mix | Sable/White |
NaN | 10/21/2013 07:59:00 AM | 10/21/2013 07:59:00 AM | Austin (TX) | Stray | Sick | Cat | Intact Female | 4 weeks | Domestic Shorthair Mix | Calico |
Rio | 06/29/2014 10:38:00 AM | 06/29/2014 10:38:00 AM | 800 Grove Blvd in Austin (TX) | Stray | Normal | Dog | Neutered Male | 4 years | Doberman Pinsch/Australian Cattle Dog | Tan/Gray |
The old index Animal ID was removed both from the index and from the DataFrame itself. The other index, Name, was kept as the current index of the DataFrame.
inplace
This parameter determines whether to modify the original DataFrame directly or create a new DataFrame object. By default, it creates a new DataFrame with the new index (inplace=False) and leaves the original DataFrame unchanged. To check this, let's run the reset_index() method again with the default parameters and then compare the result with the original DataFrame:
df.reset_index()
| | Animal ID | Name | DateTime | MonthYear | Found Location | Intake Type | Intake Condition | Animal Type | Sex upon Intake | Age upon Intake | Breed | Color |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | A786884 | *Brock | 01/03/2019 04:19:00 PM | 01/03/2019 04:19:00 PM | 2501 Magin Meadow Dr in Austin (TX) | Stray | Normal | Dog | Neutered Male | 2 years | Beagle Mix | Tricolor |
1 | A706918 | Belle | 07/05/2015 12:59:00 PM | 07/05/2015 12:59:00 PM | 9409 Bluegrass Dr in Austin (TX) | Stray | Normal | Dog | Spayed Female | 8 years | English Springer Spaniel | White/Liver |
2 | A724273 | Runster | 04/14/2016 06:43:00 PM | 04/14/2016 06:43:00 PM | 2818 Palomino Trail in Austin (TX) | Stray | Normal | Dog | Intact Male | 11 months | Basenji Mix | Sable/White |
3 | A665644 | NaN | 10/21/2013 07:59:00 AM | 10/21/2013 07:59:00 AM | Austin (TX) | Stray | Sick | Cat | Intact Female | 4 weeks | Domestic Shorthair Mix | Calico |
4 | A682524 | Rio | 06/29/2014 10:38:00 AM | 06/29/2014 10:38:00 AM | 800 Grove Blvd in Austin (TX) | Stray | Normal | Dog | Neutered Male | 4 years | Doberman Pinsch/Australian Cattle Dog | Tan/Gray |
df
Animal ID | Name | DateTime | MonthYear | Found Location | Intake Type | Intake Condition | Animal Type | Sex upon Intake | Age upon Intake | Breed | Color |
---|---|---|---|---|---|---|---|---|---|---|---|
A786884 | *Brock | 01/03/2019 04:19:00 PM | 01/03/2019 04:19:00 PM | 2501 Magin Meadow Dr in Austin (TX) | Stray | Normal | Dog | Neutered Male | 2 years | Beagle Mix | Tricolor |
A706918 | Belle | 07/05/2015 12:59:00 PM | 07/05/2015 12:59:00 PM | 9409 Bluegrass Dr in Austin (TX) | Stray | Normal | Dog | Spayed Female | 8 years | English Springer Spaniel | White/Liver |
A724273 | Runster | 04/14/2016 06:43:00 PM | 04/14/2016 06:43:00 PM | 2818 Palomino Trail in Austin (TX) | Stray | Normal | Dog | Intact Male | 11 months | Basenji Mix | Sable/White |
A665644 | NaN | 10/21/2013 07:59:00 AM | 10/21/2013 07:59:00 AM | Austin (TX) | Stray | Sick | Cat | Intact Female | 4 weeks | Domestic Shorthair Mix | Calico |
A682524 | Rio | 06/29/2014 10:38:00 AM | 06/29/2014 10:38:00 AM | 800 Grove Blvd in Austin (TX) | Stray | Normal | Dog | Neutered Male | 4 years | Doberman Pinsch/Australian Cattle Dog | Tan/Gray |
Even though we reset the index to the default numeric one by running the first piece of code, the original DataFrame df remained the same. If we want df to hold the result of applying the reset_index() method, we can either re-assign it directly (df = df.reset_index()) or pass the parameter inplace=True to this method:
df.reset_index(inplace=True)
df
| | Animal ID | Name | DateTime | MonthYear | Found Location | Intake Type | Intake Condition | Animal Type | Sex upon Intake | Age upon Intake | Breed | Color |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | A786884 | *Brock | 01/03/2019 04:19:00 PM | 01/03/2019 04:19:00 PM | 2501 Magin Meadow Dr in Austin (TX) | Stray | Normal | Dog | Neutered Male | 2 years | Beagle Mix | Tricolor |
1 | A706918 | Belle | 07/05/2015 12:59:00 PM | 07/05/2015 12:59:00 PM | 9409 Bluegrass Dr in Austin (TX) | Stray | Normal | Dog | Spayed Female | 8 years | English Springer Spaniel | White/Liver |
2 | A724273 | Runster | 04/14/2016 06:43:00 PM | 04/14/2016 06:43:00 PM | 2818 Palomino Trail in Austin (TX) | Stray | Normal | Dog | Intact Male | 11 months | Basenji Mix | Sable/White |
3 | A665644 | NaN | 10/21/2013 07:59:00 AM | 10/21/2013 07:59:00 AM | Austin (TX) | Stray | Sick | Cat | Intact Female | 4 weeks | Domestic Shorthair Mix | Calico |
4 | A682524 | Rio | 06/29/2014 10:38:00 AM | 06/29/2014 10:38:00 AM | 800 Grove Blvd in Austin (TX) | Stray | Normal | Dog | Neutered Male | 4 years | Doberman Pinsch/Australian Cattle Dog | Tan/Gray |
We see that now the changes have been applied directly to the original DataFrame.
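One detail worth keeping in mind is that the two approaches shouldn't be mixed: with inplace=True, the method modifies the DataFrame and returns None, so assigning its result back to the variable would lose the data. The sketch below shows this on a small hand-built DataFrame (the name shelter and its rows are illustrative, and the behavior is as observed in pandas versions available at the time of writing).

```python
import pandas as pd

# Hypothetical DataFrame with a named index (illustrative values only).
shelter = pd.DataFrame(
    {"Name": ["Belle", "Rio"]},
    index=pd.Index(["A706918", "A682524"], name="Animal ID"),
)

# With inplace=True the DataFrame is changed and nothing is returned.
result = shelter.reset_index(inplace=True)
print(result)                    # None
print(shelter.columns.tolist())  # ['Animal ID', 'Name']
print(shelter.index.tolist())    # [0, 1]
```

For this reason, writing something like df = df.reset_index(inplace=True) would leave df set to None.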
Use Case: Resetting Index after Dropping Missing Values
Let's put everything we have discussed so far into practice and see how resetting the DataFrame index can be useful when we drop missing values from the DataFrame.
First, let's restore the very first DataFrame that we created at the beginning of this tutorial, the one with the default numeric index:
df = pd.read_csv('Austin_Animal_Center_Intakes.csv').head()
df
| | Animal ID | Name | DateTime | MonthYear | Found Location | Intake Type | Intake Condition | Animal Type | Sex upon Intake | Age upon Intake | Breed | Color |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | A786884 | *Brock | 01/03/2019 04:19:00 PM | 01/03/2019 04:19:00 PM | 2501 Magin Meadow Dr in Austin (TX) | Stray | Normal | Dog | Neutered Male | 2 years | Beagle Mix | Tricolor |
1 | A706918 | Belle | 07/05/2015 12:59:00 PM | 07/05/2015 12:59:00 PM | 9409 Bluegrass Dr in Austin (TX) | Stray | Normal | Dog | Spayed Female | 8 years | English Springer Spaniel | White/Liver |
2 | A724273 | Runster | 04/14/2016 06:43:00 PM | 04/14/2016 06:43:00 PM | 2818 Palomino Trail in Austin (TX) | Stray | Normal | Dog | Intact Male | 11 months | Basenji Mix | Sable/White |
3 | A665644 | NaN | 10/21/2013 07:59:00 AM | 10/21/2013 07:59:00 AM | Austin (TX) | Stray | Sick | Cat | Intact Female | 4 weeks | Domestic Shorthair Mix | Calico |
4 | A682524 | Rio | 06/29/2014 10:38:00 AM | 06/29/2014 10:38:00 AM | 800 Grove Blvd in Austin (TX) | Stray | Normal | Dog | Neutered Male | 4 years | Doberman Pinsch/Australian Cattle Dog | Tan/Gray |
We see that there is a missing value in the DataFrame. Let's drop the entire row with the missing value using the dropna() method:
df.dropna(inplace=True)
df
| | Animal ID | Name | DateTime | MonthYear | Found Location | Intake Type | Intake Condition | Animal Type | Sex upon Intake | Age upon Intake | Breed | Color |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | A786884 | *Brock | 01/03/2019 04:19:00 PM | 01/03/2019 04:19:00 PM | 2501 Magin Meadow Dr in Austin (TX) | Stray | Normal | Dog | Neutered Male | 2 years | Beagle Mix | Tricolor |
1 | A706918 | Belle | 07/05/2015 12:59:00 PM | 07/05/2015 12:59:00 PM | 9409 Bluegrass Dr in Austin (TX) | Stray | Normal | Dog | Spayed Female | 8 years | English Springer Spaniel | White/Liver |
2 | A724273 | Runster | 04/14/2016 06:43:00 PM | 04/14/2016 06:43:00 PM | 2818 Palomino Trail in Austin (TX) | Stray | Normal | Dog | Intact Male | 11 months | Basenji Mix | Sable/White |
4 | A682524 | Rio | 06/29/2014 10:38:00 AM | 06/29/2014 10:38:00 AM | 800 Grove Blvd in Austin (TX) | Stray | Normal | Dog | Neutered Male | 4 years | Doberman Pinsch/Australian Cattle Dog | Tan/Gray |
The row was removed from the DataFrame. However, the index is no longer continuous: 0, 1, 2, 4. Let's reset it:
df.reset_index()
| | index | Animal ID | Name | DateTime | MonthYear | Found Location | Intake Type | Intake Condition | Animal Type | Sex upon Intake | Age upon Intake | Breed | Color |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0 | A786884 | *Brock | 01/03/2019 04:19:00 PM | 01/03/2019 04:19:00 PM | 2501 Magin Meadow Dr in Austin (TX) | Stray | Normal | Dog | Neutered Male | 2 years | Beagle Mix | Tricolor |
1 | 1 | A706918 | Belle | 07/05/2015 12:59:00 PM | 07/05/2015 12:59:00 PM | 9409 Bluegrass Dr in Austin (TX) | Stray | Normal | Dog | Spayed Female | 8 years | English Springer Spaniel | White/Liver |
2 | 2 | A724273 | Runster | 04/14/2016 06:43:00 PM | 04/14/2016 06:43:00 PM | 2818 Palomino Trail in Austin (TX) | Stray | Normal | Dog | Intact Male | 11 months | Basenji Mix | Sable/White |
3 | 4 | A682524 | Rio | 06/29/2014 10:38:00 AM | 06/29/2014 10:38:00 AM | 800 Grove Blvd in Austin (TX) | Stray | Normal | Dog | Neutered Male | 4 years | Doberman Pinsch/Australian Cattle Dog | Tan/Gray |
Now the index is continuous; however, since we didn't explicitly pass the drop parameter, the old index was converted into a column, with the default name index. Let's drop the old index completely from the DataFrame:
df.reset_index(drop=True)
| | Animal ID | Name | DateTime | MonthYear | Found Location | Intake Type | Intake Condition | Animal Type | Sex upon Intake | Age upon Intake | Breed | Color |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | A786884 | *Brock | 01/03/2019 04:19:00 PM | 01/03/2019 04:19:00 PM | 2501 Magin Meadow Dr in Austin (TX) | Stray | Normal | Dog | Neutered Male | 2 years | Beagle Mix | Tricolor |
1 | A706918 | Belle | 07/05/2015 12:59:00 PM | 07/05/2015 12:59:00 PM | 9409 Bluegrass Dr in Austin (TX) | Stray | Normal | Dog | Spayed Female | 8 years | English Springer Spaniel | White/Liver |
2 | A724273 | Runster | 04/14/2016 06:43:00 PM | 04/14/2016 06:43:00 PM | 2818 Palomino Trail in Austin (TX) | Stray | Normal | Dog | Intact Male | 11 months | Basenji Mix | Sable/White |
3 | A682524 | Rio | 06/29/2014 10:38:00 AM | 06/29/2014 10:38:00 AM | 800 Grove Blvd in Austin (TX) | Stray | Normal | Dog | Neutered Male | 4 years | Doberman Pinsch/Australian Cattle Dog | Tan/Gray |
We got rid of the meaningless old index completely, and the current index is continuous now. The last step is to save these modifications to our original DataFrame using the inplace parameter:
df.reset_index(drop=True, inplace=True)
df
| | Animal ID | Name | DateTime | MonthYear | Found Location | Intake Type | Intake Condition | Animal Type | Sex upon Intake | Age upon Intake | Breed | Color |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | A786884 | *Brock | 01/03/2019 04:19:00 PM | 01/03/2019 04:19:00 PM | 2501 Magin Meadow Dr in Austin (TX) | Stray | Normal | Dog | Neutered Male | 2 years | Beagle Mix | Tricolor |
1 | A706918 | Belle | 07/05/2015 12:59:00 PM | 07/05/2015 12:59:00 PM | 9409 Bluegrass Dr in Austin (TX) | Stray | Normal | Dog | Spayed Female | 8 years | English Springer Spaniel | White/Liver |
2 | A724273 | Runster | 04/14/2016 06:43:00 PM | 04/14/2016 06:43:00 PM | 2818 Palomino Trail in Austin (TX) | Stray | Normal | Dog | Intact Male | 11 months | Basenji Mix | Sable/White |
3 | A682524 | Rio | 06/29/2014 10:38:00 AM | 06/29/2014 10:38:00 AM | 800 Grove Blvd in Austin (TX) | Stray | Normal | Dog | Neutered Male | 4 years | Doberman Pinsch/Australian Cattle Dog | Tan/Gray |
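The same cleanup can also be written without inplace by chaining the two methods and assigning the result to a new variable. This is a sketch of an alternative style, not the approach used above; the three-row DataFrame and the names raw and clean are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical records with one missing name (illustrative values only).
raw = pd.DataFrame(
    {
        "Animal ID": ["A786884", "A665644", "A682524"],
        "Name": ["*Brock", np.nan, "Rio"],
    }
)

# Drop the incomplete row and renumber the index in a single expression.
clean = raw.dropna().reset_index(drop=True)
print(clean.index.tolist())    # [0, 1]
print(clean["Name"].tolist())  # ['*Brock', 'Rio']

# The source DataFrame is left untouched.
print(len(raw))  # 3
```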
Conclusion
All in all, we considered the reset_index() pandas method from many sides. We learned the following:
- The default behavior of the reset_index() pandas method
- How to restore the default numeric index of a DataFrame
- When to use the reset_index() pandas method
- The most important parameters of the method
- How to work with MultiIndex
- How to drop the old index completely from a DataFrame
- How to save the modifications directly to the original DataFrame
In addition, we explored a use case of resetting the DataFrame index after dropping missing values.