# Tutorial: Why Functions Modify Lists and Dictionaries in Python

March 14, 2019

Python’s functions (both the built-in ones and custom functions we write ourselves) are crucial tools for working with data. But what they do with our data can be a little confusing, and if we’re not aware of what’s going on, it could cause serious errors in our analysis.

In this tutorial, we’re going to take a close look at how Python treats different data types when they’re being manipulated inside of functions, and learn how to ensure that our data is being changed only when we want it to be changed.

## Memory Isolation in Functions

To understand how Python handles global variables inside functions, let’s do a little experiment. We’ll create two global variables, `number_1` and `number_2`, and assign them to the integers `5` and `10`. Then, we’ll use those global variables as the arguments in a function that performs some simple math. We’ll also use the variable names as the function’s parameter names. Then, we’ll see whether all of the variable usage inside our function has affected the global value of these variables.

``````
number_1 = 5
number_2 = 10

def multiply_and_add(number_1, number_2):
    number_1 = number_1 * 10
    number_2 = number_2 * 10
    return number_1 + number_2

a_sum = multiply_and_add(number_1, number_2)
print(a_sum)
print(number_1)
print(number_2)
``````

``````
150
5
10
``````

As we can see above, the function worked correctly, and the values of the global variables `number_1` and `number_2` did not change, even though we used them as arguments and parameter names in our function. This is because the names `number_1` and `number_2` inside the function are local variables that belong to the function’s own scope, separate from the global variables of the same name. They are isolated. Reassigning a name inside the function only changes what that local name refers to. Thus, the variable `number_1` can have one value (5) globally, and a different value (50) inside the function, where it is isolated.

(Incidentally, if you’re confused about the difference between parameters and arguments, Python’s documentation on the subject is quite helpful.)

## What About Lists and Dictionaries?

### Lists

We’ve seen that what we do to a variable like `number_1` above inside a function doesn’t affect its global value. But `number_1` is an integer, which is a pretty basic data type. What happens if we try the same experiment with a different data type, like a list? Below, we’ll create a function called `duplicate_last()` that will duplicate the final entry in any list we pass it as an argument.

``````
initial_list = [1, 2, 3]

def duplicate_last(a_list):
    last_element = a_list[-1]
    a_list.append(last_element)
    return a_list

new_list = duplicate_last(a_list = initial_list)
print(new_list)
print(initial_list)
``````

``````
[1, 2, 3, 3]
[1, 2, 3, 3]
``````

As we can see, here the global value of `initial_list` was updated, even though its value was only changed inside the function!
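One way to confirm what happened is to ask Python whether the two names refer to the same object. The short sketch below reuses `duplicate_last()` from above and adds a check with the `is` operator, which compares object identity rather than contents. The check itself is an addition for illustration.

```python
initial_list = [1, 2, 3]

def duplicate_last(a_list):
    last_element = a_list[-1]
    a_list.append(last_element)
    return a_list

new_list = duplicate_last(a_list=initial_list)

# `is` compares identity: True means both names refer to one list object
print(new_list is initial_list)  # True
```

Because the function returns the very list it was given, `new_list` and `initial_list` are two names for one list, which is why both print the same contents.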

### Dictionaries

Now, let’s write a function that takes a dictionary as an argument to see if a global dictionary variable will be modified when it’s manipulated inside a function as well.

To make this a bit more realistic, we’ll be using data from the `AppleStore.csv` data set that’s used in our Python Fundamentals course (the data is available for download here).

In the snippet below, we’re starting with a dictionary that contains counts for the number of apps with each age rating in the dataset (so there are 4,433 apps rated “4+”, 987 apps rated “9+”, etc.). Let’s imagine we want to calculate a percentage for each age rating, so we can get a picture of which age ratings are the most common among apps in the App Store.

To do this, we’ll write a function called `make_percentages()` that will take a dictionary as an argument and convert the counts to percentages. We’ll need to start a count at zero and then iterate over each value in the dictionary, adding them to the count so we get the total number of ratings. Then we’ll need to iterate over the dictionary again and do some math to each value to calculate the percentage.

``````
content_ratings = {'4+': 4433, '9+': 987, '12+': 1155, '17+': 622}

def make_percentages(a_dictionary):
    total = 0
    for key in a_dictionary:
        count = a_dictionary[key]
        total += count

    for key in a_dictionary:
        a_dictionary[key] = (a_dictionary[key] / total) * 100

    return a_dictionary
``````

Before we look at the output, let’s quickly review what’s happening above. After assigning our dictionary of app age ratings to the variable `content_ratings`, we create a new function called `make_percentages()` that takes a single argument: `a_dictionary`.

To figure out what percentage of apps fall into each age rating, we’ll need to know the total number of apps, so we first set a new variable called `total` to `0` and then loop through each key in `a_dictionary`, adding that key’s count to `total`.

Once that’s finished, all we need to do is loop through `a_dictionary` again, dividing each entry by the total and then multiplying the result by 100. This will give us a dictionary with percentages.

But what happens when we use our global `content_ratings` as the argument for this new function?

``````
c_ratings_percentages = make_percentages(content_ratings)
print(c_ratings_percentages)
print(content_ratings)
``````

``````
{'4+': 61.595109073224954, '9+': 13.714047519799916, '12+': 16.04835348061692, '17+': 8.642489926358204}
{'4+': 61.595109073224954, '9+': 13.714047519799916, '12+': 16.04835348061692, '17+': 8.642489926358204}
``````

Just as we saw with lists, our global `content_ratings` variable has been changed, even though it was only modified inside of the `make_percentages()` function we created.

So what’s actually happening here? We’ve bumped up against the difference between mutable and immutable data types.

## Mutable and Immutable Data Types

In Python, data types can be either mutable (changeable) or immutable (unchangeable). And while most of the data types we’ve worked with in introductory Python are immutable (including integers, floats, strings, Booleans, and tuples), lists and dictionaries are mutable. That means a global list or dictionary can be changed even when it’s used inside of a function, just like we saw in the examples above.

To understand the difference between mutable (changeable) and immutable (unchangeable), it’s helpful to look at how Python actually treats these variables.

Let’s start by considering a simple variable assignment:

``a = 5``

The variable name `a` acts like a pointer toward `5`, and it helps us retrieve `5` whenever we want. `5` is an integer, and integers are immutable data types. If a data type is immutable, it means it can’t be updated once it’s been created. If we do `a += 1`, we’re not actually updating `5` to `6`. Instead:

• `a` initially points toward `5`.
• `a += 1` is run, and this moves the pointer from `5` to `6`; it doesn’t actually change the number `5`.

Mutable data types like lists and dictionaries behave differently. They can be updated. So, for example, let’s make a very simple list:

``list_1 = [1, 2]``

If we append a `3` to the end of this list, we’re not simply pointing `list_1` toward a different list; we’re directly updating the existing list.

Even if we create multiple list variables, as long as they point to the same list, they’ll all be updated when that list is changed, as we can see in the code below:

``````
list_1 = [1, 2]
list_2 = list_1
list_1.append(3)
print(list_1)
print(list_2)
``````

``````
[1, 2, 3]
[1, 2, 3]
``````

This explains why our global variables were changed when we were experimenting with lists and dictionaries earlier. Because lists and dictionaries are mutable, changing them (even inside a function) changes the list or dictionary itself, which isn’t the case for immutable data types.
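The difference between moving a pointer and updating an object can also be checked directly. The following is a minimal sketch: the extra names `before` and `same_list` are introduced here only to hold on to the original objects, and `is` is used to test whether two names refer to the same object.

```python
# Immutable: `a += 1` points `a` at a different integer object
a = 5
before = a            # keep a second name on the original object
a += 1
print(a is before)    # False: `a` now refers to 6, `before` still to 5
print(before)         # 5

# Mutable: `.append()` changes the one list that both names refer to
list_1 = [1, 2]
same_list = list_1
list_1.append(3)
print(list_1 is same_list)  # True: still a single list object
print(same_list)            # [1, 2, 3]
```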

## Keeping Mutable Data Types Unchanged

Generally speaking, we don’t want our functions to be changing global variables, even when they contain mutable data types like lists or dictionaries. That’s because in more complex analyses and programs, we might be using many different functions frequently. If all of them are changing the lists and dictionaries they’re working on, it can become quite difficult to keep track of what’s changing what.

Thankfully there’s an easy way to get around this: we can make a copy of the list or dictionary using a built-in Python method called `.copy()`.

If you haven’t learned about methods yet, don’t worry. They’re covered in our intermediate Python course, but for this tutorial, all you need to know is that `.copy()` works like `.append()`:

``````
list.append() # adds something to a list
list.copy() # makes a copy of a list
``````

Let’s take another look at that function we wrote for lists, and update it so that what happens inside our function doesn’t change `initial_list`. All we need to do is change the argument we pass to our function from `initial_list` to `initial_list.copy()`:

``````
initial_list = [1, 2, 3]

def duplicate_last(a_list):
    last_element = a_list[-1]
    a_list.append(last_element)
    return a_list

new_list = duplicate_last(a_list = initial_list.copy()) # making a copy of the list
print(new_list)
print(initial_list)
``````

``````
[1, 2, 3, 3]
[1, 2, 3]
``````

As we can see, this has fixed our problem. Here’s why: using `.copy()` creates a separate copy of the list, so that instead of pointing to `initial_list` itself, `a_list` points to a new list that starts as a copy of `initial_list`. Any changes that are made to `a_list` after that point are made to that separate list, not `initial_list` itself, so the global value of `initial_list` is unchanged.

This solution still isn’t perfect, though, because we’ll have to remember to add `.copy()` every time we pass an argument to our function or risk accidentally changing the global value of `initial_list`. If we don’t want to have to worry about that, we can actually create that list copy inside the function itself:

``````
initial_list = [1, 2, 3]

def duplicate_last(a_list):
    copy_list = a_list.copy() # making a copy of the list
    last_element = copy_list[-1]
    copy_list.append(last_element)
    return copy_list

new_list = duplicate_last(a_list = initial_list)
print(new_list)
print(initial_list)
``````

``````
[1, 2, 3, 3]
[1, 2, 3]
``````

With this approach, we can safely pass a mutable global variable like `initial_list` to our function, and the global value won’t be changed because the function itself makes a copy and then performs its operations on that copy.
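As an aside, copying isn’t the only way to leave the original untouched. The sketch below is an alternative version of `duplicate_last()`, not the approach used in the rest of this tutorial: it relies on the fact that joining two lists with `+` produces a new list instead of changing either operand.

```python
initial_list = [1, 2, 3]

def duplicate_last(a_list):
    # `+` builds a brand-new list; `a_list` itself is never modified
    return a_list + [a_list[-1]]

new_list = duplicate_last(initial_list)
print(new_list)      # [1, 2, 3, 3]
print(initial_list)  # [1, 2, 3]
```

Either style works for this example; the explicit `.copy()` version is arguably easier to read when the function goes on to make several changes to the list.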

The `.copy()` method works for dictionaries, too. As with lists, we can simply add `.copy()` to the argument that we pass our function to create a copy that’ll be used for the function without changing the original variable:

``````
content_ratings = {'4+': 4433, '9+': 987, '12+': 1155, '17+': 622}

def make_percentages(a_dictionary):
    total = 0
    for key in a_dictionary:
        count = a_dictionary[key]
        total += count

    for key in a_dictionary:
        a_dictionary[key] = (a_dictionary[key] / total) * 100

    return a_dictionary

c_ratings_percentages = make_percentages(content_ratings.copy()) # making a copy of the dictionary
print(c_ratings_percentages)
print(content_ratings)
``````

``````
{'4+': 61.595109073224954, '9+': 13.714047519799916, '12+': 16.04835348061692, '17+': 8.642489926358204}
{'4+': 4433, '9+': 987, '12+': 1155, '17+': 622}
``````

But again, using that method means we need to remember to add `.copy()` every time we pass a dictionary into our `make_percentages()` function. If we’re going to be using this function frequently, it might be better to implement the copying into the function itself so that we don’t have to remember to do this.

Below, we’ll use `.copy()` inside the function itself. This will ensure that we can use it without changing the global variables we pass to it as arguments, and we don’t need to remember to add `.copy()` to each argument we pass.

``````
content_ratings = {'4+': 4433, '9+': 987, '12+': 1155, '17+': 622}

def make_percentages(a_dictionary):
    copy_dict = a_dictionary.copy() # create a copy of the dictionary
    total = 0
    for key in a_dictionary:
        count = a_dictionary[key]
        total += count

    for key in copy_dict: # use the copied dictionary so the original isn't changed
        copy_dict[key] = (copy_dict[key] / total) * 100

    return copy_dict

c_ratings_percentages = make_percentages(content_ratings)
print(c_ratings_percentages)
print(content_ratings)
``````

``````
{'4+': 61.595109073224954, '9+': 13.714047519799916, '12+': 16.04835348061692, '17+': 8.642489926358204}
{'4+': 4433, '9+': 987, '12+': 1155, '17+': 622}
``````

As we can see, modifying our function to create a copy of our dictionary and then change the counts to percentages only in that copy has allowed us to perform the operation we wanted without actually changing `content_ratings`.
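One caveat is worth keeping in mind: `.copy()` makes a shallow copy, meaning it copies the outer dictionary or list but not any mutable objects stored inside it. That’s fine for `content_ratings`, whose values are plain integers. The sketch below uses a hypothetical nested dictionary, `ratings_by_store` (made-up data, not from `AppleStore.csv`), to show where a shallow copy falls short and how `copy.deepcopy()` from Python’s standard library handles that case.

```python
import copy

# Hypothetical nested data for illustration only
ratings_by_store = {'ios': {'4+': 10, '9+': 5}}

# A shallow copy shares the inner dictionary with the original
shallow = ratings_by_store.copy()
shallow['ios']['4+'] = 0
print(ratings_by_store)  # {'ios': {'4+': 0, '9+': 5}}

# A deep copy duplicates the inner dictionary as well
deep = copy.deepcopy(ratings_by_store)
deep['ios']['9+'] = 0
print(ratings_by_store)  # {'ios': {'4+': 0, '9+': 5}}
print(deep)              # {'ios': {'4+': 0, '9+': 0}}
```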

## Conclusions

In this tutorial, we looked at the difference between mutable data types, which can change, and immutable data types, which cannot. We learned how we can use the method `.copy()` to make copies of mutable data types like lists and dictionaries so that we can work with them in functions without changing their global values.
