February 25, 2022

Tutorial: Reset Index in Pandas

In this tutorial, we'll discuss the reset_index() pandas method, why we may need to reset the index of a DataFrame in pandas, and how we can apply and tune this method. We'll also consider a small use case of resetting the DataFrame index after dropping missing values.

To practice DataFrame index resetting, we'll use a sample of a Kaggle dataset on Animal Shelter Analytics.

What Is reset_index() in Pandas?

If we read a CSV file using the read_csv() pandas method without specifying an index, the resulting DataFrame has a default integer-based index that starts at 0 for the first row and increases by 1 for each subsequent row:

import pandas as pd
import numpy as np

df = pd.read_csv('Austin_Animal_Center_Intakes.csv').head()
df
Animal ID Name DateTime MonthYear Found Location Intake Type Intake Condition Animal Type Sex upon Intake Age upon Intake Breed Color
0 A786884 *Brock 01/03/2019 04:19:00 PM 01/03/2019 04:19:00 PM 2501 Magin Meadow Dr in Austin (TX) Stray Normal Dog Neutered Male 2 years Beagle Mix Tricolor
1 A706918 Belle 07/05/2015 12:59:00 PM 07/05/2015 12:59:00 PM 9409 Bluegrass Dr in Austin (TX) Stray Normal Dog Spayed Female 8 years English Springer Spaniel White/Liver
2 A724273 Runster 04/14/2016 06:43:00 PM 04/14/2016 06:43:00 PM 2818 Palomino Trail in Austin (TX) Stray Normal Dog Intact Male 11 months Basenji Mix Sable/White
3 A665644 NaN 10/21/2013 07:59:00 AM 10/21/2013 07:59:00 AM Austin (TX) Stray Sick Cat Intact Female 4 weeks Domestic Shorthair Mix Calico
4 A682524 Rio 06/29/2014 10:38:00 AM 06/29/2014 10:38:00 AM 800 Grove Blvd in Austin (TX) Stray Normal Dog Neutered Male 4 years Doberman Pinsch/Australian Cattle Dog Tan/Gray

In some cases, we may want to have more meaningful row labels, so we'll select one of the columns of the DataFrame to be the DataFrame index. We can do it directly when applying the read_csv() pandas method using the index_col parameter:

df = pd.read_csv('Austin_Animal_Center_Intakes.csv', index_col='Animal ID').head()
df
Animal ID Name DateTime MonthYear Found Location Intake Type Intake Condition Animal Type Sex upon Intake Age upon Intake Breed Color
A786884 *Brock 01/03/2019 04:19:00 PM 01/03/2019 04:19:00 PM 2501 Magin Meadow Dr in Austin (TX) Stray Normal Dog Neutered Male 2 years Beagle Mix Tricolor
A706918 Belle 07/05/2015 12:59:00 PM 07/05/2015 12:59:00 PM 9409 Bluegrass Dr in Austin (TX) Stray Normal Dog Spayed Female 8 years English Springer Spaniel White/Liver
A724273 Runster 04/14/2016 06:43:00 PM 04/14/2016 06:43:00 PM 2818 Palomino Trail in Austin (TX) Stray Normal Dog Intact Male 11 months Basenji Mix Sable/White
A665644 NaN 10/21/2013 07:59:00 AM 10/21/2013 07:59:00 AM Austin (TX) Stray Sick Cat Intact Female 4 weeks Domestic Shorthair Mix Calico
A682524 Rio 06/29/2014 10:38:00 AM 06/29/2014 10:38:00 AM 800 Grove Blvd in Austin (TX) Stray Normal Dog Neutered Male 4 years Doberman Pinsch/Australian Cattle Dog Tan/Gray

Alternatively, we can use the set_index() method to set any column of a DataFrame as a DataFrame index:

df = pd.read_csv('Austin_Animal_Center_Intakes.csv').head()
df.set_index('Animal ID', inplace=True)
df
Animal ID Name DateTime MonthYear Found Location Intake Type Intake Condition Animal Type Sex upon Intake Age upon Intake Breed Color
A786884 *Brock 01/03/2019 04:19:00 PM 01/03/2019 04:19:00 PM 2501 Magin Meadow Dr in Austin (TX) Stray Normal Dog Neutered Male 2 years Beagle Mix Tricolor
A706918 Belle 07/05/2015 12:59:00 PM 07/05/2015 12:59:00 PM 9409 Bluegrass Dr in Austin (TX) Stray Normal Dog Spayed Female 8 years English Springer Spaniel White/Liver
A724273 Runster 04/14/2016 06:43:00 PM 04/14/2016 06:43:00 PM 2818 Palomino Trail in Austin (TX) Stray Normal Dog Intact Male 11 months Basenji Mix Sable/White
A665644 NaN 10/21/2013 07:59:00 AM 10/21/2013 07:59:00 AM Austin (TX) Stray Sick Cat Intact Female 4 weeks Domestic Shorthair Mix Calico
A682524 Rio 06/29/2014 10:38:00 AM 06/29/2014 10:38:00 AM 800 Grove Blvd in Austin (TX) Stray Normal Dog Neutered Male 4 years Doberman Pinsch/Australian Cattle Dog Tan/Gray

What if, at some point, we need to restore the default numeric index? This is where the reset_index() pandas method comes in:

df.reset_index()
Animal ID Name DateTime MonthYear Found Location Intake Type Intake Condition Animal Type Sex upon Intake Age upon Intake Breed Color
0 A786884 *Brock 01/03/2019 04:19:00 PM 01/03/2019 04:19:00 PM 2501 Magin Meadow Dr in Austin (TX) Stray Normal Dog Neutered Male 2 years Beagle Mix Tricolor
1 A706918 Belle 07/05/2015 12:59:00 PM 07/05/2015 12:59:00 PM 9409 Bluegrass Dr in Austin (TX) Stray Normal Dog Spayed Female 8 years English Springer Spaniel White/Liver
2 A724273 Runster 04/14/2016 06:43:00 PM 04/14/2016 06:43:00 PM 2818 Palomino Trail in Austin (TX) Stray Normal Dog Intact Male 11 months Basenji Mix Sable/White
3 A665644 NaN 10/21/2013 07:59:00 AM 10/21/2013 07:59:00 AM Austin (TX) Stray Sick Cat Intact Female 4 weeks Domestic Shorthair Mix Calico
4 A682524 Rio 06/29/2014 10:38:00 AM 06/29/2014 10:38:00 AM 800 Grove Blvd in Austin (TX) Stray Normal Dog Neutered Male 4 years Doberman Pinsch/Australian Cattle Dog Tan/Gray

By default, this method replaces the existing DataFrame index with the default integer-based one and converts the old index into a new column with the same name as the old index (or with the name index, if the old index had no name). Also by default, reset_index() removes all levels of a MultiIndex (as we'll see later) and leaves the original DataFrame unchanged, returning a new one instead.
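The default behavior described above can be checked on a small in-memory DataFrame. This is a minimal sketch: the three-row sample below is a hypothetical stand-in modeled on the rows shown earlier, not the real CSV file, and the variable names are ours.

```python
import pandas as pd

# Hypothetical three-row sample modeled on the tutorial's data (not the real CSV).
sample = pd.DataFrame(
    {
        "Animal ID": ["A786884", "A706918", "A724273"],
        "Name": ["*Brock", "Belle", "Runster"],
        "Animal Type": ["Dog", "Dog", "Dog"],
    }
)

# Named index: after resetting, the old index returns as a column named 'Animal ID'.
named = sample.set_index("Animal ID")
restored = named.reset_index()
print(list(restored.columns))

# Unnamed index: the old labels land in a column named 'index'.
unnamed = sample.copy()
unnamed.index = ["a", "b", "c"]
print(list(unnamed.reset_index().columns))

# reset_index() returned new objects; 'named' still has its original index.
print(named.index.name)
```

Because the method returns a new DataFrame by default, the result has to be assigned to a name (as with restored above) if we want to keep it.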

When to Use the reset_index() Method

The reset_index() pandas method resets the DataFrame index to the default numeric index, and it is particularly helpful in the following cases:

  • When performing data wrangling, in particular preprocessing operations such as filtering data or dropping missing values, which leave a smaller DataFrame whose numeric index is no longer continuous (we'll explore a use case at the end of this tutorial)
  • When the index is supposed to be treated as a common DataFrame column
  • When the index labels don't provide any valuable information about the data
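To illustrate the first case in the list, here is a small sketch of how filtering leaves gaps in a numeric index. The DataFrame is a hypothetical five-row sample modeled on the tutorial's data, and the example uses drop=True, a parameter covered later in this tutorial.

```python
import pandas as pd

# Hypothetical five-row sample modeled on the tutorial's data (not the real CSV).
animals = pd.DataFrame(
    {
        "Name": ["*Brock", "Belle", "Runster", None, "Rio"],
        "Animal Type": ["Dog", "Dog", "Dog", "Cat", "Dog"],
        "Intake Condition": ["Normal", "Normal", "Normal", "Sick", "Normal"],
    }
)

# Filtering keeps the original row labels, so label 3 (the cat) is missing.
dogs = animals[animals["Animal Type"] == "Dog"]
print(dogs.index.tolist())

# Label-based and position-based access now disagree:
# dogs.loc[3] would raise a KeyError, while dogs.iloc[3] is the row for Rio.
print(3 in dogs.index)
print(dogs.iloc[3]["Name"])

# Resetting the index makes labels and positions line up again.
dogs_clean = dogs.reset_index(drop=True)
print(dogs_clean.index.tolist())
```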

How to Tune the reset_index() Method

Earlier, we saw how the reset_index() pandas method works when we don't pass any arguments to it. If necessary, we can change this default behavior by tuning various parameters of the method. Let's look at the most useful ones: level, drop, and inplace.

level

This parameter accepts an integer, a string, a tuple, or a list, and is mainly useful for DataFrames with a MultiIndex, like this one:

df_multiindex = pd.read_csv('Austin_Animal_Center_Intakes.csv', index_col=['Animal ID', 'Name']).head()
df_multiindex
Animal ID Name DateTime MonthYear Found Location Intake Type Intake Condition Animal Type Sex upon Intake Age upon Intake Breed Color
A786884 *Brock 01/03/2019 04:19:00 PM 01/03/2019 04:19:00 PM 2501 Magin Meadow Dr in Austin (TX) Stray Normal Dog Neutered Male 2 years Beagle Mix Tricolor
A706918 Belle 07/05/2015 12:59:00 PM 07/05/2015 12:59:00 PM 9409 Bluegrass Dr in Austin (TX) Stray Normal Dog Spayed Female 8 years English Springer Spaniel White/Liver
A724273 Runster 04/14/2016 06:43:00 PM 04/14/2016 06:43:00 PM 2818 Palomino Trail in Austin (TX) Stray Normal Dog Intact Male 11 months Basenji Mix Sable/White
A665644 NaN 10/21/2013 07:59:00 AM 10/21/2013 07:59:00 AM Austin (TX) Stray Sick Cat Intact Female 4 weeks Domestic Shorthair Mix Calico
A682524 Rio 06/29/2014 10:38:00 AM 06/29/2014 10:38:00 AM 800 Grove Blvd in Austin (TX) Stray Normal Dog Neutered Male 4 years Doberman Pinsch/Australian Cattle Dog Tan/Gray

Indeed, if we now check the index of the DataFrame above, we'll see that it isn't a regular DataFrame index but a MultiIndex object:

df_multiindex.index
MultiIndex([('A786884',  '*Brock'),
            ('A706918',   'Belle'),
            ('A724273', 'Runster'),
            ('A665644',       nan),
            ('A682524',     'Rio')],
           names=['Animal ID', 'Name'])
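Besides printing the index, we can inspect a MultiIndex programmatically. The following is a minimal sketch on a hypothetical three-row sample (not the real CSV). It builds the same two-level index with set_index() and reads back its level count, level names, and the values of one level.

```python
import pandas as pd

# Hypothetical three-row sample modeled on the tutorial's data (not the real CSV).
sample = pd.DataFrame(
    {
        "Animal ID": ["A786884", "A706918", "A724273"],
        "Name": ["*Brock", "Belle", "Runster"],
        "Intake Type": ["Stray", "Stray", "Stray"],
    }
)

# Passing a list of columns to set_index() produces a MultiIndex.
multi = sample.set_index(["Animal ID", "Name"])

print(isinstance(multi.index, pd.MultiIndex))
print(multi.index.nlevels)                             # number of levels
print(list(multi.index.names))                         # level names
print(multi.index.get_level_values("Name").tolist())   # values of one level
```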

By default (level=None), the reset_index() pandas method removes all levels of a MultiIndex:

df_multiindex.reset_index()
Animal ID Name DateTime MonthYear Found Location Intake Type Intake Condition Animal Type Sex upon Intake Age upon Intake Breed Color
0 A786884 *Brock 01/03/2019 04:19:00 PM 01/03/2019 04:19:00 PM 2501 Magin Meadow Dr in Austin (TX) Stray Normal Dog Neutered Male 2 years Beagle Mix Tricolor
1 A706918 Belle 07/05/2015 12:59:00 PM 07/05/2015 12:59:00 PM 9409 Bluegrass Dr in Austin (TX) Stray Normal Dog Spayed Female 8 years English Springer Spaniel White/Liver
2 A724273 Runster 04/14/2016 06:43:00 PM 04/14/2016 06:43:00 PM 2818 Palomino Trail in Austin (TX) Stray Normal Dog Intact Male 11 months Basenji Mix Sable/White
3 A665644 NaN 10/21/2013 07:59:00 AM 10/21/2013 07:59:00 AM Austin (TX) Stray Sick Cat Intact Female 4 weeks Domestic Shorthair Mix Calico
4 A682524 Rio 06/29/2014 10:38:00 AM 06/29/2014 10:38:00 AM 800 Grove Blvd in Austin (TX) Stray Normal Dog Neutered Male 4 years Doberman Pinsch/Australian Cattle Dog Tan/Gray

We see that both index levels of our DataFrame were converted into regular DataFrame columns, while the index was reset to the default integer-based one.

If, instead, we explicitly pass a value for level, the method removes only the selected levels from the DataFrame index and returns them as regular DataFrame columns (unless we opt to drop this information from the DataFrame completely using the drop parameter). Compare the following operations:

df_multiindex.reset_index(level='Animal ID')
Name Animal ID DateTime MonthYear Found Location Intake Type Intake Condition Animal Type Sex upon Intake Age upon Intake Breed Color
*Brock A786884 01/03/2019 04:19:00 PM 01/03/2019 04:19:00 PM 2501 Magin Meadow Dr in Austin (TX) Stray Normal Dog Neutered Male 2 years Beagle Mix Tricolor
Belle A706918 07/05/2015 12:59:00 PM 07/05/2015 12:59:00 PM 9409 Bluegrass Dr in Austin (TX) Stray Normal Dog Spayed Female 8 years English Springer Spaniel White/Liver
Runster A724273 04/14/2016 06:43:00 PM 04/14/2016 06:43:00 PM 2818 Palomino Trail in Austin (TX) Stray Normal Dog Intact Male 11 months Basenji Mix Sable/White
NaN A665644 10/21/2013 07:59:00 AM 10/21/2013 07:59:00 AM Austin (TX) Stray Sick Cat Intact Female 4 weeks Domestic Shorthair Mix Calico
Rio A682524 06/29/2014 10:38:00 AM 06/29/2014 10:38:00 AM 800 Grove Blvd in Austin (TX) Stray Normal Dog Neutered Male 4 years Doberman Pinsch/Australian Cattle Dog Tan/Gray

Initially, Animal ID was one of the index levels of the DataFrame. After we set the level parameter, it was removed from the index and inserted as a regular column called Animal ID, while Name remained as the index.

df_multiindex.reset_index(level='Name')
Animal ID Name DateTime MonthYear Found Location Intake Type Intake Condition Animal Type Sex upon Intake Age upon Intake Breed Color
A786884 *Brock 01/03/2019 04:19:00 PM 01/03/2019 04:19:00 PM 2501 Magin Meadow Dr in Austin (TX) Stray Normal Dog Neutered Male 2 years Beagle Mix Tricolor
A706918 Belle 07/05/2015 12:59:00 PM 07/05/2015 12:59:00 PM 9409 Bluegrass Dr in Austin (TX) Stray Normal Dog Spayed Female 8 years English Springer Spaniel White/Liver
A724273 Runster 04/14/2016 06:43:00 PM 04/14/2016 06:43:00 PM 2818 Palomino Trail in Austin (TX) Stray Normal Dog Intact Male 11 months Basenji Mix Sable/White
A665644 NaN 10/21/2013 07:59:00 AM 10/21/2013 07:59:00 AM Austin (TX) Stray Sick Cat Intact Female 4 weeks Domestic Shorthair Mix Calico
A682524 Rio 06/29/2014 10:38:00 AM 06/29/2014 10:38:00 AM 800 Grove Blvd in Austin (TX) Stray Normal Dog Neutered Male 4 years Doberman Pinsch/Australian Cattle Dog Tan/Gray

Here, Name was initially one of the index levels of the DataFrame. After we set the level parameter, it became a regular column called Name, while Animal ID remained as the index.
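Since level also accepts an integer position or a list of levels, the calls above have equivalent forms. The sketch below uses a hypothetical three-row sample (not the real CSV) to compare them.

```python
import pandas as pd

# Hypothetical three-row sample modeled on the tutorial's data (not the real CSV).
sample = pd.DataFrame(
    {
        "Animal ID": ["A786884", "A706918", "A724273"],
        "Name": ["*Brock", "Belle", "Runster"],
        "Intake Type": ["Stray", "Stray", "Stray"],
    }
)
multi = sample.set_index(["Animal ID", "Name"])

# A level can be selected by name or by position (0 is the outermost level).
by_name = multi.reset_index(level="Animal ID")
by_position = multi.reset_index(level=0)
print(by_name.equals(by_position))

# Listing every level gives the same result as the default level=None.
both = multi.reset_index(level=["Animal ID", "Name"])
print(both.equals(multi.reset_index()))
print(list(both.columns))
```

Selecting levels by name tends to be more readable and keeps working if the order of the levels changes.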

drop

This parameter determines whether to keep the old index as a regular DataFrame column after resetting the index or to drop it from the DataFrame completely. By default (drop=False), the method keeps it, as we have seen in all the previous examples. If we don't want to keep the old index as a column, we can remove it entirely from the DataFrame when resetting the index (drop=True):

df
Animal ID Name DateTime MonthYear Found Location Intake Type Intake Condition Animal Type Sex upon Intake Age upon Intake Breed Color
A786884 *Brock 01/03/2019 04:19:00 PM 01/03/2019 04:19:00 PM 2501 Magin Meadow Dr in Austin (TX) Stray Normal Dog Neutered Male 2 years Beagle Mix Tricolor
A706918 Belle 07/05/2015 12:59:00 PM 07/05/2015 12:59:00 PM 9409 Bluegrass Dr in Austin (TX) Stray Normal Dog Spayed Female 8 years English Springer Spaniel White/Liver
A724273 Runster 04/14/2016 06:43:00 PM 04/14/2016 06:43:00 PM 2818 Palomino Trail in Austin (TX) Stray Normal Dog Intact Male 11 months Basenji Mix Sable/White
A665644 NaN 10/21/2013 07:59:00 AM 10/21/2013 07:59:00 AM Austin (TX) Stray Sick Cat Intact Female 4 weeks Domestic Shorthair Mix Calico
A682524 Rio 06/29/2014 10:38:00 AM 06/29/2014 10:38:00 AM 800 Grove Blvd in Austin (TX) Stray Normal Dog Neutered Male 4 years Doberman Pinsch/Australian Cattle Dog Tan/Gray
df.reset_index(drop=True)
Name DateTime MonthYear Found Location Intake Type Intake Condition Animal Type Sex upon Intake Age upon Intake Breed Color
0 *Brock 01/03/2019 04:19:00 PM 01/03/2019 04:19:00 PM 2501 Magin Meadow Dr in Austin (TX) Stray Normal Dog Neutered Male 2 years Beagle Mix Tricolor
1 Belle 07/05/2015 12:59:00 PM 07/05/2015 12:59:00 PM 9409 Bluegrass Dr in Austin (TX) Stray Normal Dog Spayed Female 8 years English Springer Spaniel White/Liver
2 Runster 04/14/2016 06:43:00 PM 04/14/2016 06:43:00 PM 2818 Palomino Trail in Austin (TX) Stray Normal Dog Intact Male 11 months Basenji Mix Sable/White
3 NaN 10/21/2013 07:59:00 AM 10/21/2013 07:59:00 AM Austin (TX) Stray Sick Cat Intact Female 4 weeks Domestic Shorthair Mix Calico
4 Rio 06/29/2014 10:38:00 AM 06/29/2014 10:38:00 AM 800 Grove Blvd in Austin (TX) Stray Normal Dog Neutered Male 4 years Doberman Pinsch/Australian Cattle Dog Tan/Gray

In the DataFrame above, the information contained in the old index was completely removed from the DataFrame.

The drop parameter also works for DataFrames with a MultiIndex, like the one we created earlier:

df_multiindex
Animal ID Name DateTime MonthYear Found Location Intake Type Intake Condition Animal Type Sex upon Intake Age upon Intake Breed Color
A786884 *Brock 01/03/2019 04:19:00 PM 01/03/2019 04:19:00 PM 2501 Magin Meadow Dr in Austin (TX) Stray Normal Dog Neutered Male 2 years Beagle Mix Tricolor
A706918 Belle 07/05/2015 12:59:00 PM 07/05/2015 12:59:00 PM 9409 Bluegrass Dr in Austin (TX) Stray Normal Dog Spayed Female 8 years English Springer Spaniel White/Liver
A724273 Runster 04/14/2016 06:43:00 PM 04/14/2016 06:43:00 PM 2818 Palomino Trail in Austin (TX) Stray Normal Dog Intact Male 11 months Basenji Mix Sable/White
A665644 NaN 10/21/2013 07:59:00 AM 10/21/2013 07:59:00 AM Austin (TX) Stray Sick Cat Intact Female 4 weeks Domestic Shorthair Mix Calico
A682524 Rio 06/29/2014 10:38:00 AM 06/29/2014 10:38:00 AM 800 Grove Blvd in Austin (TX) Stray Normal Dog Neutered Male 4 years Doberman Pinsch/Australian Cattle Dog Tan/Gray
df_multiindex.reset_index(drop=True)
DateTime MonthYear Found Location Intake Type Intake Condition Animal Type Sex upon Intake Age upon Intake Breed Color
0 01/03/2019 04:19:00 PM 01/03/2019 04:19:00 PM 2501 Magin Meadow Dr in Austin (TX) Stray Normal Dog Neutered Male 2 years Beagle Mix Tricolor
1 07/05/2015 12:59:00 PM 07/05/2015 12:59:00 PM 9409 Bluegrass Dr in Austin (TX) Stray Normal Dog Spayed Female 8 years English Springer Spaniel White/Liver
2 04/14/2016 06:43:00 PM 04/14/2016 06:43:00 PM 2818 Palomino Trail in Austin (TX) Stray Normal Dog Intact Male 11 months Basenji Mix Sable/White
3 10/21/2013 07:59:00 AM 10/21/2013 07:59:00 AM Austin (TX) Stray Sick Cat Intact Female 4 weeks Domestic Shorthair Mix Calico
4 06/29/2014 10:38:00 AM 06/29/2014 10:38:00 AM 800 Grove Blvd in Austin (TX) Stray Normal Dog Neutered Male 4 years Doberman Pinsch/Australian Cattle Dog Tan/Gray

Both of the old index levels were completely removed from the DataFrame, and the index was reset to the default.

Of course, we can combine the drop and level parameters, specifying which of the old indices to remove entirely from the DataFrame:

df_multiindex.reset_index(level='Animal ID', drop=True)
DateTime MonthYear Found Location Intake Type Intake Condition Animal Type Sex upon Intake Age upon Intake Breed Color
Name
*Brock 01/03/2019 04:19:00 PM 01/03/2019 04:19:00 PM 2501 Magin Meadow Dr in Austin (TX) Stray Normal Dog Neutered Male 2 years Beagle Mix Tricolor
Belle 07/05/2015 12:59:00 PM 07/05/2015 12:59:00 PM 9409 Bluegrass Dr in Austin (TX) Stray Normal Dog Spayed Female 8 years English Springer Spaniel White/Liver
Runster 04/14/2016 06:43:00 PM 04/14/2016 06:43:00 PM 2818 Palomino Trail in Austin (TX) Stray Normal Dog Intact Male 11 months Basenji Mix Sable/White
NaN 10/21/2013 07:59:00 AM 10/21/2013 07:59:00 AM Austin (TX) Stray Sick Cat Intact Female 4 weeks Domestic Shorthair Mix Calico
Rio 06/29/2014 10:38:00 AM 06/29/2014 10:38:00 AM 800 Grove Blvd in Austin (TX) Stray Normal Dog Neutered Male 4 years Doberman Pinsch/Australian Cattle Dog Tan/Gray

The old index level Animal ID was removed from both the index and the DataFrame itself. The other level, Name, was kept as the current index of the DataFrame.
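The combination of level and drop can be verified with a short sketch on a hypothetical three-row sample (not the real CSV). For comparison only, it also calls DataFrame.droplevel(), a separate pandas method not covered in this tutorial that, in this case, should give the same result.

```python
import pandas as pd

# Hypothetical three-row sample modeled on the tutorial's data (not the real CSV).
sample = pd.DataFrame(
    {
        "Animal ID": ["A786884", "A706918", "A724273"],
        "Name": ["*Brock", "Belle", "Runster"],
        "Intake Type": ["Stray", "Stray", "Stray"],
    }
)
multi = sample.set_index(["Animal ID", "Name"])

# Remove the 'Animal ID' level and discard its values entirely.
kept_name = multi.reset_index(level="Animal ID", drop=True)
print(kept_name.index.name)                # the remaining index level
print("Animal ID" in kept_name.columns)    # the dropped level is not a column

# droplevel() removes an index level without turning it into a column.
print(kept_name.equals(multi.droplevel("Animal ID")))
```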

inplace

This parameter determines whether to modify the original DataFrame directly or to create a new DataFrame object. By default (inplace=False), the method creates a new DataFrame with the new index and leaves the original DataFrame unchanged. To confirm this, let's run the reset_index() method again with the default parameters and then compare the result with the original DataFrame:

df.reset_index()
Animal ID Name DateTime MonthYear Found Location Intake Type Intake Condition Animal Type Sex upon Intake Age upon Intake Breed Color
0 A786884 *Brock 01/03/2019 04:19:00 PM 01/03/2019 04:19:00 PM 2501 Magin Meadow Dr in Austin (TX) Stray Normal Dog Neutered Male 2 years Beagle Mix Tricolor
1 A706918 Belle 07/05/2015 12:59:00 PM 07/05/2015 12:59:00 PM 9409 Bluegrass Dr in Austin (TX) Stray Normal Dog Spayed Female 8 years English Springer Spaniel White/Liver
2 A724273 Runster 04/14/2016 06:43:00 PM 04/14/2016 06:43:00 PM 2818 Palomino Trail in Austin (TX) Stray Normal Dog Intact Male 11 months Basenji Mix Sable/White
3 A665644 NaN 10/21/2013 07:59:00 AM 10/21/2013 07:59:00 AM Austin (TX) Stray Sick Cat Intact Female 4 weeks Domestic Shorthair Mix Calico
4 A682524 Rio 06/29/2014 10:38:00 AM 06/29/2014 10:38:00 AM 800 Grove Blvd in Austin (TX) Stray Normal Dog Neutered Male 4 years Doberman Pinsch/Australian Cattle Dog Tan/Gray
df
Animal ID Name DateTime MonthYear Found Location Intake Type Intake Condition Animal Type Sex upon Intake Age upon Intake Breed Color
A786884 *Brock 01/03/2019 04:19:00 PM 01/03/2019 04:19:00 PM 2501 Magin Meadow Dr in Austin (TX) Stray Normal Dog Neutered Male 2 years Beagle Mix Tricolor
A706918 Belle 07/05/2015 12:59:00 PM 07/05/2015 12:59:00 PM 9409 Bluegrass Dr in Austin (TX) Stray Normal Dog Spayed Female 8 years English Springer Spaniel White/Liver
A724273 Runster 04/14/2016 06:43:00 PM 04/14/2016 06:43:00 PM 2818 Palomino Trail in Austin (TX) Stray Normal Dog Intact Male 11 months Basenji Mix Sable/White
A665644 NaN 10/21/2013 07:59:00 AM 10/21/2013 07:59:00 AM Austin (TX) Stray Sick Cat Intact Female 4 weeks Domestic Shorthair Mix Calico
A682524 Rio 06/29/2014 10:38:00 AM 06/29/2014 10:38:00 AM 800 Grove Blvd in Austin (TX) Stray Normal Dog Neutered Male 4 years Doberman Pinsch/Australian Cattle Dog Tan/Gray

Even though the first piece of code reset the index to the default numeric one, the original DataFrame df remained the same. If we want df itself to hold the result of reset_index(), we can either re-assign it directly (df = df.reset_index()) or pass inplace=True to the method:

df.reset_index(inplace=True)
df
Animal ID Name DateTime MonthYear Found Location Intake Type Intake Condition Animal Type Sex upon Intake Age upon Intake Breed Color
0 A786884 *Brock 01/03/2019 04:19:00 PM 01/03/2019 04:19:00 PM 2501 Magin Meadow Dr in Austin (TX) Stray Normal Dog Neutered Male 2 years Beagle Mix Tricolor
1 A706918 Belle 07/05/2015 12:59:00 PM 07/05/2015 12:59:00 PM 9409 Bluegrass Dr in Austin (TX) Stray Normal Dog Spayed Female 8 years English Springer Spaniel White/Liver
2 A724273 Runster 04/14/2016 06:43:00 PM 04/14/2016 06:43:00 PM 2818 Palomino Trail in Austin (TX) Stray Normal Dog Intact Male 11 months Basenji Mix Sable/White
3 A665644 NaN 10/21/2013 07:59:00 AM 10/21/2013 07:59:00 AM Austin (TX) Stray Sick Cat Intact Female 4 weeks Domestic Shorthair Mix Calico
4 A682524 Rio 06/29/2014 10:38:00 AM 06/29/2014 10:38:00 AM 800 Grove Blvd in Austin (TX) Stray Normal Dog Neutered Male 4 years Doberman Pinsch/Australian Cattle Dog Tan/Gray

We see that now the changes have been applied directly to the original DataFrame.
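One detail worth keeping in mind is that a call with inplace=True returns None, so the two approaches should not be mixed. The sketch below demonstrates this on a hypothetical three-row sample (not the real CSV); the variable names result and lost are ours.

```python
import pandas as pd

# Hypothetical three-row sample modeled on the tutorial's data (not the real CSV).
sample = pd.DataFrame(
    {
        "Animal ID": ["A786884", "A706918", "A724273"],
        "Name": ["*Brock", "Belle", "Runster"],
    }
)
df = sample.set_index("Animal ID")

# With inplace=True the method modifies df and returns None.
result = df.reset_index(inplace=True)
print(result)
print(df.index.tolist())

# Pitfall: assigning the return value of an in-place call loses the DataFrame.
lost = sample.set_index("Animal ID")
lost = lost.reset_index(inplace=True)
print(lost is None)
```

In other words, use either df = df.reset_index() or df.reset_index(inplace=True), but not df = df.reset_index(inplace=True).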

Use Case: Resetting Index after Dropping Missing Values

Let's put everything we have discussed so far into practice and see how resetting the DataFrame index can be useful when we drop missing values from the DataFrame.

First, let's restore the very first DataFrame that we created at the beginning of this tutorial, the one with the default numeric index:

df = pd.read_csv('Austin_Animal_Center_Intakes.csv').head()
df
Animal ID Name DateTime MonthYear Found Location Intake Type Intake Condition Animal Type Sex upon Intake Age upon Intake Breed Color
0 A786884 *Brock 01/03/2019 04:19:00 PM 01/03/2019 04:19:00 PM 2501 Magin Meadow Dr in Austin (TX) Stray Normal Dog Neutered Male 2 years Beagle Mix Tricolor
1 A706918 Belle 07/05/2015 12:59:00 PM 07/05/2015 12:59:00 PM 9409 Bluegrass Dr in Austin (TX) Stray Normal Dog Spayed Female 8 years English Springer Spaniel White/Liver
2 A724273 Runster 04/14/2016 06:43:00 PM 04/14/2016 06:43:00 PM 2818 Palomino Trail in Austin (TX) Stray Normal Dog Intact Male 11 months Basenji Mix Sable/White
3 A665644 NaN 10/21/2013 07:59:00 AM 10/21/2013 07:59:00 AM Austin (TX) Stray Sick Cat Intact Female 4 weeks Domestic Shorthair Mix Calico
4 A682524 Rio 06/29/2014 10:38:00 AM 06/29/2014 10:38:00 AM 800 Grove Blvd in Austin (TX) Stray Normal Dog Neutered Male 4 years Doberman Pinsch/Australian Cattle Dog Tan/Gray

We see that there is a missing value in the DataFrame. Let's drop the entire row with the missing value using the dropna() method:

df.dropna(inplace=True)
df
Animal ID Name DateTime MonthYear Found Location Intake Type Intake Condition Animal Type Sex upon Intake Age upon Intake Breed Color
0 A786884 *Brock 01/03/2019 04:19:00 PM 01/03/2019 04:19:00 PM 2501 Magin Meadow Dr in Austin (TX) Stray Normal Dog Neutered Male 2 years Beagle Mix Tricolor
1 A706918 Belle 07/05/2015 12:59:00 PM 07/05/2015 12:59:00 PM 9409 Bluegrass Dr in Austin (TX) Stray Normal Dog Spayed Female 8 years English Springer Spaniel White/Liver
2 A724273 Runster 04/14/2016 06:43:00 PM 04/14/2016 06:43:00 PM 2818 Palomino Trail in Austin (TX) Stray Normal Dog Intact Male 11 months Basenji Mix Sable/White
4 A682524 Rio 06/29/2014 10:38:00 AM 06/29/2014 10:38:00 AM 800 Grove Blvd in Austin (TX) Stray Normal Dog Neutered Male 4 years Doberman Pinsch/Australian Cattle Dog Tan/Gray

The row was removed from the DataFrame. However, the index is no longer continuous: 0, 1, 2, 4. Let's reset it:

df.reset_index()
index Animal ID Name DateTime MonthYear Found Location Intake Type Intake Condition Animal Type Sex upon Intake Age upon Intake Breed Color
0 0 A786884 *Brock 01/03/2019 04:19:00 PM 01/03/2019 04:19:00 PM 2501 Magin Meadow Dr in Austin (TX) Stray Normal Dog Neutered Male 2 years Beagle Mix Tricolor
1 1 A706918 Belle 07/05/2015 12:59:00 PM 07/05/2015 12:59:00 PM 9409 Bluegrass Dr in Austin (TX) Stray Normal Dog Spayed Female 8 years English Springer Spaniel White/Liver
2 2 A724273 Runster 04/14/2016 06:43:00 PM 04/14/2016 06:43:00 PM 2818 Palomino Trail in Austin (TX) Stray Normal Dog Intact Male 11 months Basenji Mix Sable/White
3 4 A682524 Rio 06/29/2014 10:38:00 AM 06/29/2014 10:38:00 AM 800 Grove Blvd in Austin (TX) Stray Normal Dog Neutered Male 4 years Doberman Pinsch/Australian Cattle Dog Tan/Gray

Now the index is continuous; however, since we didn't explicitly pass the drop parameter, the old index was converted into a column, with the default name index. Let's drop the old index completely from the DataFrame:

df.reset_index(drop=True)
Animal ID Name DateTime MonthYear Found Location Intake Type Intake Condition Animal Type Sex upon Intake Age upon Intake Breed Color
0 A786884 *Brock 01/03/2019 04:19:00 PM 01/03/2019 04:19:00 PM 2501 Magin Meadow Dr in Austin (TX) Stray Normal Dog Neutered Male 2 years Beagle Mix Tricolor
1 A706918 Belle 07/05/2015 12:59:00 PM 07/05/2015 12:59:00 PM 9409 Bluegrass Dr in Austin (TX) Stray Normal Dog Spayed Female 8 years English Springer Spaniel White/Liver
2 A724273 Runster 04/14/2016 06:43:00 PM 04/14/2016 06:43:00 PM 2818 Palomino Trail in Austin (TX) Stray Normal Dog Intact Male 11 months Basenji Mix Sable/White
3 A682524 Rio 06/29/2014 10:38:00 AM 06/29/2014 10:38:00 AM 800 Grove Blvd in Austin (TX) Stray Normal Dog Neutered Male 4 years Doberman Pinsch/Australian Cattle Dog Tan/Gray

We got rid of the meaningless old index completely, and the current index is continuous now. The last step is to save these modifications to our original DataFrame using the inplace parameter:

df.reset_index(drop=True, inplace=True)
df
Animal ID Name DateTime MonthYear Found Location Intake Type Intake Condition Animal Type Sex upon Intake Age upon Intake Breed Color
0 A786884 *Brock 01/03/2019 04:19:00 PM 01/03/2019 04:19:00 PM 2501 Magin Meadow Dr in Austin (TX) Stray Normal Dog Neutered Male 2 years Beagle Mix Tricolor
1 A706918 Belle 07/05/2015 12:59:00 PM 07/05/2015 12:59:00 PM 9409 Bluegrass Dr in Austin (TX) Stray Normal Dog Spayed Female 8 years English Springer Spaniel White/Liver
2 A724273 Runster 04/14/2016 06:43:00 PM 04/14/2016 06:43:00 PM 2818 Palomino Trail in Austin (TX) Stray Normal Dog Intact Male 11 months Basenji Mix Sable/White
3 A682524 Rio 06/29/2014 10:38:00 AM 06/29/2014 10:38:00 AM 800 Grove Blvd in Austin (TX) Stray Normal Dog Neutered Male 4 years Doberman Pinsch/Australian Cattle Dog Tan/Gray
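The steps of this use case can also be written as a single chained expression that returns a new DataFrame instead of modifying the original in place. This is a sketch under the assumption of a hypothetical five-row sample (not the real CSV) with one missing name.

```python
import pandas as pd

# Hypothetical five-row sample modeled on the tutorial's data (not the real CSV).
animals = pd.DataFrame(
    {
        "Animal ID": ["A786884", "A706918", "A724273", "A665644", "A682524"],
        "Name": ["*Brock", "Belle", "Runster", None, "Rio"],
        "Animal Type": ["Dog", "Dog", "Dog", "Cat", "Dog"],
    }
)

# Drop rows with missing values, then renumber the rows from 0
# without keeping the old index as a column.
cleaned = animals.dropna().reset_index(drop=True)

print(cleaned.index.tolist())
print(cleaned["Name"].tolist())

# The original DataFrame still has all five rows.
print(len(animals))
```

Chaining keeps the original data available for later steps, which can be convenient when we want to compare the cleaned and uncleaned versions.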

Conclusion

All in all, we considered the reset_index() pandas method from many sides. We learned the following:

  • The default behavior of the reset_index() pandas method
  • How to restore the default numeric index of a DataFrame
  • When to use the reset_index() pandas method
  • The most important parameters of the method
  • How to work with MultiIndex
  • How to drop the old index completely from a DataFrame
  • How to save the modifications directly to the original DataFrame

In addition, we explored a use case of resetting the DataFrame index after dropping missing values.

Elena Kosourova

About the author

Elena is a petroleum geologist and community manager at Dataquest. You can find her chatting online with data enthusiasts and writing tutorials on data science topics. Find her on LinkedIn.
