Python Dictionaries: A Comprehensive Tutorial (with 52 Code Examples)
What Is a Dictionary in Python?
A Python dictionary is a data structure that lets us write efficient lookup code with little effort. In many other languages, the same structure is called a hash map or an associative array; under the hood, Python implements it as a hash table, which is why its keys must be hashable. We'll see what this means in a bit.
A Python dictionary is a collection of key:value
pairs. You can think of them as words and their meanings in an ordinary dictionary. Values are said to be mapped to keys. For example, in a physical dictionary, the definition "science that searches for patterns in complex data using computer methods" is mapped to the key "Data Science".
In this Python tutorial, you'll learn how to create a Python dictionary, how to use its methods and dictionary comprehension, and when to choose a dictionary over a list. To get the most out of this tutorial, you should already be familiar with Python lists, for loops, conditional statements, and reading datasets with the reader()
function from the csv module. If you aren't, you can learn more at Dataquest.
What Are Python Dictionaries Used for?
Python dictionaries allow us to associate a value with a unique key and then quickly access this value. It's a good idea to use them whenever we want to look up a certain Python object. We can also use lists for this purpose, but lookups in a list are much slower than in a dictionary.
This speed is due to the fact that dictionary keys are hashable. Most of Python's built-in immutable objects (such as integers, strings, and tuples of hashable elements) are hashable, so we can pass them to the hash()
function, which returns the hash value of the object. Python then uses this hash value to look up the value associated with the key. See the example of the hash()
function below:
print(hash("b"))
2132352943288137677
The string b
has a hash value of 2132352943288137677
. This value will most likely be different on your machine, because Python randomizes string hashes for each interpreter process by default.
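To make the link between hashing and lookup more concrete, here is a minimal sketch (the dictionary and variable names are illustrative and not part of the original example). It relies on the rule that objects that compare equal must have the same hash value, which is why a value stored under an integer key can be retrieved with an equal float.

```python
# Equal objects must have equal hashes, so 1 and 1.0 refer to the same key.
numbers = {1: "one"}

same_hash = hash(1) == hash(1.0)  # 1 == 1.0, so their hashes match
found = numbers[1.0]              # finds the entry stored under the key 1

print(same_hash)
print(found)
```

The lookup works in two steps: the hash narrows the search to a slot in the table, and an equality check confirms that the stored key really matches.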
How to Create a Dictionary?
Enough theory; let's go straight to creating a dictionary. There are two main ways to define one: with curly braces {}
or with the dict()
constructor. We'll create two empty dictionaries:
# Create a dictionary
dictionary = {} # Curly braces method
another_dictionary = dict() # Dict method
# Are the above dictionaries equivalent?
print(type(dictionary))
print(type(another_dictionary))
print(dictionary == another_dictionary)
<class 'dict'>
<class 'dict'>
True
We can see that both dictionaries have the same data type and are equivalent. Now let's populate a dictionary with keys and values. We can do it with square brackets, like this: dictionary[key] = value
. We can then access the value with the same bracket notation, placing the key between the brackets: dictionary[key]
.
# Populate the dictionary
dictionary["key1"] = "value1"
# Access key1
print(dictionary["key1"])
value1
The value of key1
is returned. We can also create a prepopulated dictionary using the syntax below:
# Create a dictionary with preinserted keys/values
dictionary = {"key1": "value1"}
# Access key1
print(dictionary["key1"])
value1
Finally, another option is the dict()
constructor, to which we supply keys and values either as keyword arguments or as a list of tuples:
# Keyword argument list
dictionary = dict(key1="value1", key2="value2")
# Display the dictionary
print(dictionary)
{'key1': 'value1', 'key2': 'value2'}
# List of tuples
dictionary = dict([("key1", "value1"), ("key2", "value2")])
# Display the dictionary
print(dictionary)
{'key1': 'value1', 'key2': 'value2'}
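Two further ways to build a dictionary are worth a quick sketch, assuming the keys and values already sit in separate lists (the list names below are made up for illustration): pairing them with zip(), and giving every key the same starting value with dict.fromkeys().

```python
keys = ["key1", "key2", "key3"]
values = ["value1", "value2", "value3"]

# Pair each key with the value at the same position
paired = dict(zip(keys, values))

# Give every key the same starting value
placeholders = dict.fromkeys(keys, "unknown")

print(paired)
print(placeholders)
```

Note that fromkeys() reuses the same object for every key, so it is best suited to immutable starting values such as strings or numbers.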
We used the string data type for the key and the value, but what other data types are allowed? In Python dictionaries, keys must be hashable objects (as a rule of thumb, immutable built-in types are hashable, although the two terms aren't strictly equivalent). Thus, mutable data types, like lists, aren't allowed as keys. Let's try to hash()
different data types and see what happens:
# Hashing various data types
print(hash(1)) # Integer
print(hash(1.2)) # Float
print(hash("dataquest")) # String
print(hash((1, 2))) # Tuple
print(hash([1, 2, 3]))
1
461168601842738689
-3975257749889514375
-3550055125485641917
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_10024/758405818.py in <module>
4 print(hash("dataquest")) # String
5 print(hash((1, 2))) # Tuple
----> 6 print(hash([1, 2, 3]))
TypeError: unhashable type: 'list'
Integers, floats, strings, and tuples are hashable data types (and they are also immutable) while lists are an unhashable data type (and they are mutable). Python uses hash values to quickly access a dictionary's values.
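Hashable and immutable usually go together, but not always. The sketch below uses a small helper, is_hashable(), which is a hypothetical name introduced only for this illustration, to show that a tuple stops being hashable as soon as it contains a list, while a frozenset is a hashable alternative to a set.

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


candidates = {
    "tuple of ints": (1, 2),
    "tuple holding a list": (1, [2, 3]),
    "frozenset": frozenset({1, 2}),
    "list": [1, 2],
    "dict": {"a": 1},
}

# Map each label to whether the object could be used as a dictionary key
results = {label: is_hashable(obj) for label, obj in candidates.items()}
print(results)
```

In practice, this means a tuple is only a safe dictionary key when all of its elements are hashable too.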
Values, on the other hand, can be of any type. Let's add more elements to the dictionary using different data types:
# Add more elements to the dictionary
dictionary[42] = "the answer to the ultimate question of life, the universe, and everything."
dictionary[1.2] = ["one point two"]
dictionary["list"] = ["just", "a", "list", "with", "an", "integer", 3]
# Display the dictionary
print(dictionary)
{'key1': 'value1', 'key2': 'value2', 42: 'the answer to the ultimate question of life, the universe, and everything.', 1.2: ['one point two'], 'list': ['just', 'a', 'list', 'with', 'an', 'integer', 3]}
Additionally, we can modify the value of a key with the same bracket notation we used to populate the dictionary:
# Modify a value
dictionary["list"] = ["it's another", "list"]
# Display the dictionary
print(dictionary)
print()
# Access the value of "list"
print(dictionary["list"])
{'key1': 'value1', 'key2': 'value2', 42: 'the answer to the ultimate question of life, the universe, and everything.', 1.2: ['one point two'], 'list': ["it's another", 'list']}
["it's another", 'list']
Finally, dictionary keys should be unique. Let's try to create a dictionary with duplicate keys:
# Dictionary with duplicate keys
duplicated_keys = {"key1": "value1", "key1": "value2", "key1": "value3"}
# Access key1
print(duplicated_keys["key1"])
value3
Python keeps only the last value assigned to a repeated key: each later "key1" entry silently overwrites the previous one, so the resulting dictionary contains a single key. Python doesn't raise an error for duplicate keys in a dictionary literal, but we should avoid them because the earlier values are lost and we may retrieve a value we didn't expect. Imagine looking up the word "data" in a dictionary and finding 10 different entries for it; it would be confusing.
Python Dictionary Methods
Now let's see what methods we can use to work with dictionaries.
update()
The update()
method is useful whenever we want to merge dictionaries or add new key:value
pairs from an iterable of pairs (for instance, a list of tuples). Let's build an example using characters from the Harry Potter universe and the houses they belong to (spoiler: we'll use Harry Potter datasets later on!):
# Create a Harry Potter dictionary
harry_potter_dict = {
"Harry Potter": "Gryffindor",
"Ron Weasley": "Gryffindor",
"Hermione Granger": "Gryffindor"
}
# Display the dictionary
print(harry_potter_dict)
{'Harry Potter': 'Gryffindor', 'Ron Weasley': 'Gryffindor', 'Hermione Granger': 'Gryffindor'}
Let's now add other characters and their houses using different options we have available for the update()
method:
# Characters to add to the Harry Potter dictionary
add_characters_1 = {
"Albus Dumbledore": "Gryffindor",
"Luna Lovegood": "Ravenclaw"
}
# Merge dictionaries
harry_potter_dict.update(add_characters_1)
# Display the dictionary
print(harry_potter_dict)
{'Harry Potter': 'Gryffindor', 'Ron Weasley': 'Gryffindor', 'Hermione Granger': 'Gryffindor', 'Albus Dumbledore': 'Gryffindor', 'Luna Lovegood': 'Ravenclaw'}
We can see that the dictionary now contains Albus Dumbledore and Luna Lovegood. We can also use an iterable to add new elements to the dictionary:
# Use iterables to update a dictionary
add_characters_2 = [
["Draco Malfoy", "Slytherin"],
["Cedric Diggory", "Hufflepuff"]
]
harry_potter_dict.update(add_characters_2)
print(harry_potter_dict)
{'Harry Potter': 'Gryffindor', 'Ron Weasley': 'Gryffindor', 'Hermione Granger': 'Gryffindor', 'Albus Dumbledore': 'Gryffindor', 'Luna Lovegood': 'Ravenclaw', 'Draco Malfoy': 'Slytherin', 'Cedric Diggory': 'Hufflepuff'}
We used a list of lists where the first element of each list is the character name and the second element is their house. The update()
method automatically associates the first element (key) with the second element (value). As an experiment, try updating the dictionary with a list of lists that have three elements each; Python will raise a ValueError because every nested sequence must contain exactly two elements.
We can also use a list of tuples:
# Use iterables to update a dictionary
add_characters_3 = [
("Rubeus Hagrid", "Gryffindor"),
("Minerva McGonagall", "Gryffindor")
]
harry_potter_dict.update(add_characters_3)
print(harry_potter_dict)
{'Harry Potter': 'Gryffindor', 'Ron Weasley': 'Gryffindor', 'Hermione Granger': 'Gryffindor', 'Albus Dumbledore': 'Gryffindor', 'Luna Lovegood': 'Ravenclaw', 'Draco Malfoy': 'Slytherin', 'Cedric Diggory': 'Hufflepuff', 'Rubeus Hagrid': 'Gryffindor', 'Minerva McGonagall': 'Gryffindor'}
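Two details of update() are easy to miss, so here is a short sketch on a small stand-alone dictionary (the entries, including the deliberately wrong house and the "Dobby" key, are invented for illustration): existing keys are overwritten, and keyword arguments are accepted when the keys are valid identifiers. The last lines show one way to merge without modifying the original.

```python
# Draco's house is deliberately wrong so that update() has something to fix
houses = {"Harry Potter": "Gryffindor", "Draco Malfoy": "Gryffindor"}

# update() overwrites the value of a key that already exists
houses.update({"Draco Malfoy": "Slytherin"})

# Keyword arguments work when the keys are valid Python identifiers
houses.update(Dobby="No house")

# Unpacking builds a new merged dictionary and leaves the originals unchanged
# (on Python 3.9+, `houses | extra` gives the same result)
extra = {"Luna Lovegood": "Ravenclaw"}
merged = {**houses, **extra}

print(houses)
print(merged)
```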
del
What if we want to delete a key:value
pair from a dictionary? We can use the del
statement. Note that del
isn't a dictionary method but a Python keyword that we can use in many situations to delete names and items (a variable, a function, a class, a list element, etc.).
# Delete a key:value pair
del harry_potter_dict["Minerva McGonagall"]
print(harry_potter_dict)
{'Harry Potter': 'Gryffindor', 'Ron Weasley': 'Gryffindor', 'Hermione Granger': 'Gryffindor', 'Albus Dumbledore': 'Gryffindor', 'Luna Lovegood': 'Ravenclaw', 'Draco Malfoy': 'Slytherin', 'Cedric Diggory': 'Hufflepuff', 'Rubeus Hagrid': 'Gryffindor'}
If we try to delete a key that isn't present in the dictionary, Python raises a KeyError
:
# Delete a key:value pair that doesn't exist in the dictionary
del harry_potter_dict["Voldemort"]
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_10024/3860415884.py in <module>
1 # Delete a key:value pair that doesn't exist in the dictionary
----> 2 del harry_potter_dict["Voldemort"]
KeyError: 'Voldemort'
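If a missing key shouldn't stop the program, there are two common ways to guard a deletion. The sketch below shows both on a small stand-alone dictionary; the variable deleted is just an illustrative flag.

```python
houses = {"Harry Potter": "Gryffindor", "Luna Lovegood": "Ravenclaw"}

# Option 1: check membership before deleting
if "Voldemort" in houses:
    del houses["Voldemort"]

# Option 2: attempt the deletion and handle the failure
try:
    del houses["Voldemort"]
    deleted = True
except KeyError:
    deleted = False

print(deleted)
print(houses)
```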
popitem()
and pop()
Sometimes, we need to delete the last item that was inserted in a dictionary. The popitem()
method does exactly that and returns the removed pair. Note that in Python versions before 3.7, this method removes an arbitrary item instead, because dictionaries didn't guarantee insertion order:
# Insert Voldemort
harry_potter_dict["Voldemort"] = "Slytherin"
print("Dictionary with Voldemort:")
print(harry_potter_dict)
print()
# Remove the last inserted item (Voldemort)
harry_potter_dict.popitem()
print("Dictionary after popping the last inserted item (Voldemort):")
print(harry_potter_dict)
Dictionary with Voldemort:
{'Harry Potter': 'Gryffindor', 'Ron Weasley': 'Gryffindor', 'Hermione Granger': 'Gryffindor', 'Albus Dumbledore': 'Gryffindor', 'Luna Lovegood': 'Ravenclaw', 'Draco Malfoy': 'Slytherin', 'Cedric Diggory': 'Hufflepuff', 'Rubeus Hagrid': 'Gryffindor', 'Voldemort': 'Slytherin'}
Dictionary after popping the last inserted item (Voldemort):
{'Harry Potter': 'Gryffindor', 'Ron Weasley': 'Gryffindor', 'Hermione Granger': 'Gryffindor', 'Albus Dumbledore': 'Gryffindor', 'Luna Lovegood': 'Ravenclaw', 'Draco Malfoy': 'Slytherin', 'Cedric Diggory': 'Hufflepuff', 'Rubeus Hagrid': 'Gryffindor'}
We can also remove a specific key:value
pair and return the value using the pop()
method:
# Insert Voldemort
harry_potter_dict["Voldemort"] = "Slytherin"
print("Dictionary with Voldemort:")
print(harry_potter_dict)
print()
# Remove Voldemort by key and return his house
print("Remove Voldemort and return his house:")
print(harry_potter_dict.pop("Voldemort"))
print()
print("Dictionary after popping the last inserted item (Voldemort):")
print(harry_potter_dict)
Dictionary with Voldemort:
{'Harry Potter': 'Gryffindor', 'Ron Weasley': 'Gryffindor', 'Hermione Granger': 'Gryffindor', 'Albus Dumbledore': 'Gryffindor', 'Luna Lovegood': 'Ravenclaw', 'Draco Malfoy': 'Slytherin', 'Cedric Diggory': 'Hufflepuff', 'Rubeus Hagrid': 'Gryffindor', 'Voldemort': 'Slytherin'}
Remove Voldemort and return his house:
Slytherin
Dictionary after popping the last inserted item (Voldemort):
{'Harry Potter': 'Gryffindor', 'Ron Weasley': 'Gryffindor', 'Hermione Granger': 'Gryffindor', 'Albus Dumbledore': 'Gryffindor', 'Luna Lovegood': 'Ravenclaw', 'Draco Malfoy': 'Slytherin', 'Cedric Diggory': 'Hufflepuff', 'Rubeus Hagrid': 'Gryffindor'}
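Both methods return what they remove, and pop() accepts an optional default. A minimal sketch on a stand-alone dictionary (variable names are illustrative):

```python
houses = {"Harry Potter": "Gryffindor", "Voldemort": "Slytherin"}

# popitem() returns the removed pair as a (key, value) tuple
last_pair = houses.popitem()

# pop() with a second argument returns that default instead of raising KeyError
missing = houses.pop("Voldemort", "not found")

print(last_pair)
print(missing)
print(houses)
```

Without the second argument, pop() raises a KeyError for a missing key, just like del.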
get()
If we try to access the value of a key that doesn't exist in the dictionary, Python will raise a KeyError
. To get around this problem, we can use the get()
method, which returns the value if its key is in the dictionary and otherwise returns a default value that we set (or None if we don't set one):
# Return an existing value
print(harry_potter_dict.get("Harry Potter", "Key not found"))
# Return a default value if no key found in the dictionary
print(harry_potter_dict.get("Voldemort", "Key not found"))
# Try to retrieve a value of a non-existing key without get
print(harry_potter_dict["Voldemort"])
Gryffindor
Key not found
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_10024/3808675782.py in <module>
6
7 # Try to retrieve a value of a non-existing key without get
----> 8 print(harry_potter_dict["Voldemort"])
KeyError: 'Voldemort'
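When the second argument is omitted, get() falls back to None. The short sketch below, on a stand-alone dictionary, also confirms that get() never adds the missing key.

```python
houses = {"Harry Potter": "Gryffindor"}

# No default supplied, so a missing key yields None
result = houses.get("Voldemort")

# get() only reads; the dictionary is left untouched
still_missing = "Voldemort" not in houses

print(result)
print(still_missing)
```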
setdefault()
The setdefault()
method is often confused with the get()
method because both return a default value when a key is missing from the dictionary. However, in contrast to get()
, this method also inserts the key with that default value into the dictionary:
print("Dictionary without Voldemort.")
print(harry_potter_dict)
print()
print("Return the default value of Voldemort.")
print(harry_potter_dict.setdefault("Voldemort", "Slytherin"))
print()
print("Voldemort is now in the dictionary!")
print(harry_potter_dict)
Dictionary without Voldemort.
{'Harry Potter': 'Gryffindor', 'Ron Weasley': 'Gryffindor', 'Hermione Granger': 'Gryffindor', 'Albus Dumbledore': 'Gryffindor', 'Luna Lovegood': 'Ravenclaw', 'Draco Malfoy': 'Slytherin', 'Cedric Diggory': 'Hufflepuff', 'Rubeus Hagrid': 'Gryffindor'}
Return the default value of Voldemort.
Slytherin
Voldemort is now in the dictionary!
{'Harry Potter': 'Gryffindor', 'Ron Weasley': 'Gryffindor', 'Hermione Granger': 'Gryffindor', 'Albus Dumbledore': 'Gryffindor', 'Luna Lovegood': 'Ravenclaw', 'Draco Malfoy': 'Slytherin', 'Cedric Diggory': 'Hufflepuff', 'Rubeus Hagrid': 'Gryffindor', 'Voldemort': 'Slytherin'}
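A typical use of setdefault() is grouping. The sketch below inverts a small character-to-house dictionary into a house-to-members dictionary; the name members is an assumption made for this illustration.

```python
houses = {
    "Harry Potter": "Gryffindor",
    "Luna Lovegood": "Ravenclaw",
    "Ron Weasley": "Gryffindor",
}

members = {}
for name, house in houses.items():
    # Create an empty list the first time a house appears, then append to it
    members.setdefault(house, []).append(name)

print(members)
```

Because setdefault() returns the stored value, we can append to the list in the same expression.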
items()
, keys()
, and values()
What if we want to return all the key:value
pairs? Or just the keys? What about the values?
The answer to the first question is the items()
method. When used on a dictionary, it will return a dict_items
object, a view that behaves much like a list of tuples, each containing a key and a value. This method is useful when we loop through a dictionary, as we'll see later.
print(harry_potter_dict.items())
dict_items([('Harry Potter', 'Gryffindor'), ('Ron Weasley', 'Gryffindor'), ('Hermione Granger', 'Gryffindor'), ('Albus Dumbledore', 'Gryffindor'), ('Luna Lovegood', 'Ravenclaw'), ('Draco Malfoy', 'Slytherin'), ('Cedric Diggory', 'Hufflepuff'), ('Rubeus Hagrid', 'Gryffindor'), ('Voldemort', 'Slytherin')])
If we want to get just the keys, we should use the keys()
method. It will return a dict_keys
object:
print(harry_potter_dict.keys())
dict_keys(['Harry Potter', 'Ron Weasley', 'Hermione Granger', 'Albus Dumbledore', 'Luna Lovegood', 'Draco Malfoy', 'Cedric Diggory', 'Rubeus Hagrid', 'Voldemort'])
Finally, we have the values()
method that will return the values as a dict_values
object:
print(harry_potter_dict.values())
dict_values(['Gryffindor', 'Gryffindor', 'Gryffindor', 'Gryffindor', 'Ravenclaw', 'Slytherin', 'Hufflepuff', 'Gryffindor', 'Slytherin'])
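The objects returned by these three methods are views: they stay connected to the dictionary and reflect later changes. A minimal sketch (variable names are illustrative) that contrasts a view with a list copy:

```python
houses = {"Harry Potter": "Gryffindor"}

keys_view = houses.keys()   # a live view of the keys
snapshot = list(keys_view)  # an independent copy taken right now

houses["Luna Lovegood"] = "Ravenclaw"

# The view sees the new key; the copy does not
live = list(keys_view)

print(snapshot)
print(live)
```

Converting a view with list() is also the way to get indexing, since views themselves don't support it.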
When Do I Use All These Methods?
After this overview, you may feel overwhelmed by the amount of information. It's also not easy to determine when you should use the Python dictionary methods. No worries — that's absolutely okay. You shouldn't try to remember every single method and its use cases. When you have a real-world problem in front of you (Dataquest guided projects can be a good start), and you have to use dictionaries, just head back to this Python tutorial and see if you can solve your problems with one of these methods. This is the only way you can gain valuable experience and become much faster at using dictionary methods in your future projects!
Looping Through a Dictionary
As we're able to loop through lists, we're also able to loop through dictionaries. They hold two different types of elements, keys and values, so we can either loop through both types of elements simultaneously or just one of them.
First of all, we'll use the items()
method, which yields both keys and values:
for key, value in harry_potter_dict.items():
    print((key, value))
('Harry Potter', 'Gryffindor')
('Ron Weasley', 'Gryffindor')
('Hermione Granger', 'Gryffindor')
('Albus Dumbledore', 'Gryffindor')
('Luna Lovegood', 'Ravenclaw')
('Draco Malfoy', 'Slytherin')
('Cedric Diggory', 'Hufflepuff')
('Rubeus Hagrid', 'Gryffindor')
('Voldemort', 'Slytherin')
# Alternatively
for key_value in harry_potter_dict.items():
    print(key_value)
('Harry Potter', 'Gryffindor')
('Ron Weasley', 'Gryffindor')
('Hermione Granger', 'Gryffindor')
('Albus Dumbledore', 'Gryffindor')
('Luna Lovegood', 'Ravenclaw')
('Draco Malfoy', 'Slytherin')
('Cedric Diggory', 'Hufflepuff')
('Rubeus Hagrid', 'Gryffindor')
('Voldemort', 'Slytherin')
for key, value in harry_potter_dict.items():
    print(f"The current key is {key} and its value is {value}.")
The current key is Harry Potter and its value is Gryffindor.
The current key is Ron Weasley and its value is Gryffindor.
The current key is Hermione Granger and its value is Gryffindor.
The current key is Albus Dumbledore and its value is Gryffindor.
The current key is Luna Lovegood and its value is Ravenclaw.
The current key is Draco Malfoy and its value is Slytherin.
The current key is Cedric Diggory and its value is Hufflepuff.
The current key is Rubeus Hagrid and its value is Gryffindor.
The current key is Voldemort and its value is Slytherin.
We can see that this method allows us to access both keys and values. What if we're only interested in keys? Or only in values?
# Loop only through the keys
for key in harry_potter_dict.keys():
    print(key)
Harry Potter
Ron Weasley
Hermione Granger
Albus Dumbledore
Luna Lovegood
Draco Malfoy
Cedric Diggory
Rubeus Hagrid
Voldemort
# Loop only through the values
for value in harry_potter_dict.values():
    print(value)
Gryffindor
Gryffindor
Gryffindor
Gryffindor
Ravenclaw
Slytherin
Hufflepuff
Gryffindor
Slytherin
Let's get more practical and learn a slightly more advanced tool. Sometimes, we need to compute the frequency of each value in a dictionary. We can use the Counter
class from collections
, a Python module with plenty of useful containers that make our coding lives easier.
from collections import Counter
# Frequency of values
counter = Counter(harry_potter_dict.values())
print(counter)
Counter({'Gryffindor': 5, 'Slytherin': 2, 'Ravenclaw': 1, 'Hufflepuff': 1})
The returned Counter
object is a subclass of dict, so it behaves like a dictionary. We can use the keys()
, values()
, and items()
methods on it!
# Items of Counter
for k, v in counter.items():
    print((k, v))
('Gryffindor', 5)
('Ravenclaw', 1)
('Slytherin', 2)
('Hufflepuff', 1)
# Keys of Counter
for k in counter.keys():
    print(k)
Gryffindor
Ravenclaw
Slytherin
Hufflepuff
# Values of Counter
for f in counter.values():
    print(f)
5
1
2
1
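Counter also adds a few conveniences on top of a plain dictionary. The sketch below builds a counter from a short made-up list of houses and shows most_common(), plus the fact that a missing key counts as 0 instead of raising a KeyError.

```python
from collections import Counter

counter = Counter(
    ["Gryffindor", "Ravenclaw", "Gryffindor", "Slytherin", "Gryffindor", "Slytherin"]
)

# The two most frequent houses, as (element, count) pairs
top_two = counter.most_common(2)

# A missing element has a count of 0 and is not added to the counter
absent = counter["Hufflepuff"]

print(top_two)
print(absent)
```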
Frequency Tables
Python dictionaries are immensely handy when we have to create so-called frequency tables. Simply put, the keys are the objects whose frequency we want to count, and the values are the frequencies. As an example, we'll use the Harry Potter Movies Dataset from Kaggle (the Characters.csv
dataset). Let's say that we want to count the frequency of each house present in the dataset. To do so, we first have to create an empty dictionary that will contain the frequency table. Then, we have to loop through the list of houses, and if the key for a house is already present in the frequency table, we add 1 to its value. Otherwise, we create a key for the current house and map it to value 1 (it's one because we encounter this element for the first time). We also have to account for missing data in our dataset.
from csv import reader
# Open and read the dataset
opened_file_char = open("Characters.csv", encoding="utf-8-sig")
read_file_char = reader(opened_file_char)
hp_characters = list(read_file_char)
# Initialize an empty dictionary that will hold a frequency table
houses = {}
# Create a frequency table
for character in hp_characters[1:]: # Note that we should not include the header in the looping; therefore, we start from index 1
    house = character[4]
    if house in houses:
        houses[house] += 1
    elif house == "":
        continue
    else:
        houses[house] = 1
print(houses)
{'Gryffindor': 31, 'Slytherin': 20, 'Ravenclaw': 12, 'Hufflepuff': 8, 'Beauxbatons Academy of Magic': 2, 'Durmstrang Institute': 2}
Most of the characters from the dataset are from Gryffindor. To practice, try to create frequency tables of the other columns.
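The same frequency table can be written more compactly with get(). The sketch below doesn't read the CSV file; it assumes the house column has already been extracted into a small list (house_column, with made-up contents) so that it runs on its own.

```python
# Hypothetical extract of the "House" column; empty strings are missing data
house_column = ["Gryffindor", "", "Slytherin", "Gryffindor", "Ravenclaw", ""]

houses = {}
for house in house_column:
    if house == "":
        continue  # skip missing data
    # Start from 0 when the house is new, then add 1
    houses[house] = houses.get(house, 0) + 1

print(houses)
```

Using get() with a default of 0 folds the "already present" and "first time" branches into a single line.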
Nested Dictionaries
Similar to lists, there are also nested dictionaries. In other words, a dictionary can contain another dictionary! Let's use the Movies.csv
dataset from the same set of Harry Potter datasets. It may happen that in your career, you work with multiple datasets at the same time. One way to organize them is by using dictionaries:
opened_file_movies = open("Movies.csv", encoding="utf-8-sig")
read_file_movies = reader(opened_file_movies)
movies = list(read_file_movies)
# characters key contains Harry Potter characters dataset
# movies key contains movies dataset
hp_datasets = dict(characters=hp_characters, movies=movies)
Now we can easily access each dataset or a specific entry. To illustrate this, let's access the columns of the characters
dataset:
# Columns of the characters dataset
print(hp_datasets["characters"][0])
['Character ID', 'Character Name', 'Species', 'Gender', 'House', 'Patronus', 'Wand (Wood)', 'Wand (Core)']
We can also access the columns of both datasets with a for loop:
# Columns of both datasets
for v in hp_datasets.values():
    print(v[0])
['Character ID', 'Character Name', 'Species', 'Gender', 'House', 'Patronus', 'Wand (Wood)', 'Wand (Core)']
['Movie ID', 'Movie Title', 'Release Year', 'Runtime', 'Budget', 'Box Office']
An alternative to this approach (especially when we don't have dozens of datasets) is to reorganize each dataset in a dictionary. It will simplify our work when we have to access different entries:
# Create a dictionary from the characters dataset
characters_dict = dict(
columns = hp_characters[0],
data = hp_characters[1:]
)
# Create a dictionary from the movies dataset
movies_dict = dict(
columns = movies[0],
data = movies[1:]
)
# Access movies columns and their first entry
print("Movies columns:")
print(movies_dict["columns"])
print()
print("The first entry of movies:")
print(movies_dict["data"][0])
Movies columns:
['Movie ID', 'Movie Title', 'Release Year', 'Runtime', 'Budget', 'Box Office']
The first entry of movies:
['1', "Harry Potter and the Philosopher's Stone", '2001', '152', '$125,000,000 ', '$1,002,000,000 ']
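We can go one step further and nest dictionaries inside a dictionary, so that every entry is addressed by name and not by position. The sketch below is one possible layout, not the tutorial's own: it hard-codes the column names and the first movie row shown above, and the name movies_by_id is an assumption made for this illustration.

```python
columns = ["Movie ID", "Movie Title", "Release Year", "Runtime", "Budget", "Box Office"]
rows = [
    ["1", "Harry Potter and the Philosopher's Stone", "2001", "152", "$125,000,000 ", "$1,002,000,000 "],
]

# Outer key: the movie ID; inner dictionary: column name -> cell value
movies_by_id = {}
for row in rows:
    movies_by_id[row[0]] = dict(zip(columns, row))

# Chain two keys to reach a single cell
title = movies_by_id["1"]["Movie Title"]
print(title)
```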
Dictionary Comprehension
Dictionary comprehension in Python is an elegant and efficient method to create new dictionaries. You have probably already learned something about list comprehension. Just a quick reminder: comprehension in Python means applying the same operation on each element of an iterable (like a list). Let's illustrate how this technique works. For example, we want a dictionary that holds the runtimes of each of the Harry Potter movies. Let's create it from the dataset's dictionary movies_dict
:
# Dictionary of movies' runtimes
runtimes = {}
for movie in movies_dict["data"]:
    name = movie[1]
    runtime = int(movie[3])
    runtimes[name] = runtime
# Display the runtimes
print(runtimes)
{"Harry Potter and the Philosopher's Stone": 152, 'Harry Potter and the Chamber of Secrets': 161, 'Harry Potter and the Prisoner of Azkaban': 142, 'Harry Potter and the Goblet of Fire': 157, 'Harry Potter and the Order of the Phoenix': 138, 'Harry Potter and the Half-Blood Prince': 153, 'Harry Potter and the Deathly Hallows Part 1': 146, 'Harry Potter and the Deathly Hallows Part 2': 130}
Now we want to convert each runtime from minutes to hours. First of all, we can do it with a regular for loop:
# Dictionary to hold the runtimes in hours
runtimes_hours = {}
# Transform the runtimes into hours
for k, v in runtimes.items():
    runtimes_hours[k] = round(v / 60, 2)
# Display the runtimes in hours
print(runtimes_hours)
{"Harry Potter and the Philosopher's Stone": 2.53, 'Harry Potter and the Chamber of Secrets': 2.68, 'Harry Potter and the Prisoner of Azkaban': 2.37, 'Harry Potter and the Goblet of Fire': 2.62, 'Harry Potter and the Order of the Phoenix': 2.3, 'Harry Potter and the Half-Blood Prince': 2.55, 'Harry Potter and the Deathly Hallows Part 1': 2.43, 'Harry Potter and the Deathly Hallows Part 2': 2.17}
However, we can simplify the above code by creating the runtime dictionary in just one line:
print({k:round(v / 60, 2) for k, v in runtimes.items()})
{"Harry Potter and the Philosopher's Stone": 2.53, 'Harry Potter and the Chamber of Secrets': 2.68, 'Harry Potter and the Prisoner of Azkaban': 2.37, 'Harry Potter and the Goblet of Fire': 2.62, 'Harry Potter and the Order of the Phoenix': 2.3, 'Harry Potter and the Half-Blood Prince': 2.55, 'Harry Potter and the Deathly Hallows Part 1': 2.43, 'Harry Potter and the Deathly Hallows Part 2': 2.17}
Let's dissect the code above. First, look at where the for loop is now: we're still looping through the items of the runtimes
dictionary. Now notice the curly braces: we're writing the code inside a dictionary! k
is the current key of our for loop, and after the colon (:
) we perform the operation of rounding and division by 60 directly on v
, which is the value of the for loop.
This code performs exactly the same operations as before, but it does it in 1 line instead of 3 lines.
Moreover, we can also add conditional statements. Let's say that we want to exclude the movies that are shorter than 2.5 hours:
print({k:round(v / 60, 2) for k, v in runtimes.items() if (v / 60) >= 2.5})
{"Harry Potter and the Philosopher's Stone": 2.53, 'Harry Potter and the Chamber of Secrets': 2.68, 'Harry Potter and the Goblet of Fire': 2.62, 'Harry Potter and the Half-Blood Prince': 2.55}
We just add an if-statement, and that's it.
Dictionary comprehension also works with the keys in a similar manner. Try it yourself!
Note that if we have multiple conditional statements or complicated operations, it's better to use a regular for loop because dictionary comprehension may become an incomprehensible coding jungle, which undermines the benefits of Python readability.
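As one possible take on transforming keys, the sketch below shortens the movie titles by dropping a common prefix and also swaps keys and values. It uses two of the runtimes shown earlier so that it runs on its own; the names prefix, short_titles, and by_runtime are illustrative.

```python
runtimes = {
    "Harry Potter and the Philosopher's Stone": 152,
    "Harry Potter and the Chamber of Secrets": 161,
}

prefix = "Harry Potter and the "

# Transform the keys: slice off the shared prefix
short_titles = {k[len(prefix):]: v for k, v in runtimes.items()}

# Swap keys and values (safe here because the runtimes are unique)
by_runtime = {v: k for k, v in runtimes.items()}

print(short_titles)
print(by_runtime)
```

Swapping only works cleanly when the values are unique and hashable; otherwise later pairs overwrite earlier ones.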
Python Dictionary vs List: Which Is Better?
Now that we know more about Python dictionaries, it's time to compare dictionaries and lists. Which is better? Neither is better than the other; each is helpful in different coding tasks.
The rules to choose one of these data structures are actually pretty simple:
- When you just need a sequence of elements that you can access with indexing, choose a list.
- If you need to quickly access an element mapped to a specific unique key, choose a dictionary.
There is a bit more to it than that. Dictionaries are much faster when we want to access a specific element because lookups by key take constant time on average, which means that the runtime doesn't depend on the size of the dictionary. In contrast, when we check whether an element exists in a list, the runtime depends on the size of the list (in the worst case, Python scans the entire list). Look at the examples:
import time
# Create a list
lst = [ele for ele in range(10**7)]
now = time.time()
if 3 in lst:
    print(True)
list_runtime = time.time() - now
print(f"\nList runtime: {list_runtime} seconds.")
True
List runtime: 0.00010442733764648438 seconds.
# Create a dictionary
d = {i:i*2 for i in range(10**7)}
now = time.time()
if 3 in d.keys():
    print(True)
dict_runtime = time.time() - now
print(f"\nDictionary runtime: {dict_runtime} seconds.")
True
Dictionary runtime: 9.512901306152344e-05 seconds.
print(f"Runtime difference between dictionary and list: {list_runtime - dict_runtime} seconds.")
Runtime difference between dictionary and list: 9.298324584960938e-06 seconds.
The difference is negligible here because 3 sits near the start of the list, so Python finds it after checking only a few elements. When the element is far from the start or missing altogether, the difference grows with the size of the list.
Let's look at a more concrete example. Often, we'll want to access a certain element in either a list or a dictionary. To find this element in a list, we have to loop through the list until we reach it, while in a dictionary, we can access the same element directly by its unique key. Let's find 9000000 in the list and the dictionary defined above.
# Find 9000000 in the list
now = time.time()
for i in lst:
    if i == 9000000:
        break
list_runtime = time.time() - now
print(f"\nList runtime: {list_runtime} seconds.")
List runtime: 0.27323484420776367 seconds.
# Find the value of 9000000 in the dictionary
now = time.time()
num = d[9000000]
dict_runtime = time.time() - now
print(f"\nDictionary runtime: {dict_runtime} seconds.")
Dictionary runtime: 3.62396240234375e-05 seconds.
print(f"Runtime difference between dictionary and list: {list_runtime - dict_runtime} seconds.")
print(f"\nDictionary is faster by {list_runtime / dict_runtime:.0f} times!")
Runtime difference between dictionary and list: 0.27319860458374023 seconds.
Dictionary is faster by 7540 times!
It took the dictionary almost no time to locate the number, while the list took around 0.27 seconds to perform the same operation. In this run, the dictionary is roughly 7,500 times faster!
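Single measurements with time.time() are noisy, so a steadier comparison can be sketched with the standard timeit module, which repeats each operation many times. The sizes and repeat count below are arbitrary choices for illustration, and the exact timings will differ from machine to machine.

```python
import timeit

n = 200_000
lst = list(range(n))
d = {i: i * 2 for i in range(n)}
target = n - 1  # worst case for the list: the last element

# Run each membership test 50 times and report the total time
list_time = timeit.timeit(lambda: target in lst, number=50)
dict_time = timeit.timeit(lambda: target in d, number=50)

print(f"List: {list_time:.6f} s, dictionary: {dict_time:.6f} s")
```

Doubling n should roughly double the list time while leaving the dictionary time about the same, which is the practical meaning of linear versus constant runtime.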
Bonus: Using defaultdict()
to Handle Missing Keys
Recall that we used the setdefault()
method to insert a default key and its value in a dictionary. We also used the get()
method to return a default value of a non-existing key. A more Pythonic way to perform similar operations is by using defaultdict()
from the collections
module. We can initialize a dictionary with a default value type by passing the type we want (or any other callable) to the constructor. Now if we try to access a missing key, the dictionary creates this key and maps a default value to it:
# Import defaultdict
from collections import defaultdict
# Initialize a default dictionary with the list data type
default_d = defaultdict(list)
# Call a missing key
print(default_d["missing_key"])
[]
Here the dictionary created the key missing_key
and assigned an empty list to it because that's the default value of our dictionary. We can now append some values to this list:
# Append values to missing_key
for i in range(1, 6):
    default_d["missing_key"].append(f"value{i}")
# Call "missing_key"
print(default_d["missing_key"])
print()
# Display default_d
print(default_d)
['value1', 'value2', 'value3', 'value4', 'value5']
defaultdict(<class 'list'>, {'missing_key': ['value1', 'value2', 'value3', 'value4', 'value5']})
The first argument we pass to defaultdict()
must be callable (or None). If we pass a non-callable object such as 0 to defaultdict()
, we'll get a TypeError
:
default_d = defaultdict(0)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_10024/3911435701.py in <module>
----> 1 default_d = defaultdict(0)
TypeError: first argument must be callable or None
In contrast, we can pass any object as the default value to the setdefault()
method:
d = {}
d.setdefault(0, 0)
print(d)
{0: 0}
Let's also look at the get()
method. We use it when we want to return a value of a key we suspect doesn't exist. This method will return only the value but won't change the dictionary in any way:
# Return the default value of a missing key
print(d.get(3, "three"))
print()
# Display d
print(d)
three
{0: 0}
Now we should be able to understand the difference between these three methods.
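To round off the comparison, here is a sketch of two more defaultdict patterns: int as the default factory for counting, and a lambda for an arbitrary default value. The input list and the names counts and unknown are made up for this illustration.

```python
from collections import defaultdict

house_column = ["Gryffindor", "Slytherin", "Gryffindor"]

# int() returns 0, so every new key starts at 0 before we add 1
counts = defaultdict(int)
for house in house_column:
    counts[house] += 1

# A lambda lets us use any value as the default
unknown = defaultdict(lambda: "Unknown house")
value = unknown["Voldemort"]  # creates the key and returns the default

print(dict(counts))
print(value)
```

Unlike get(), merely reading a missing key from a defaultdict inserts it, so "Voldemort" is now a key of unknown.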
Conclusion
Here's what we've covered in this tutorial:
- Dictionaries in Python
- How dictionaries allow us to quickly access a certain Python object
- Creating a dictionary with the
dict()
constructor or curly braces
- Python dictionary methods
- Looping through a dictionary
- Creating frequency tables
- Nested dictionaries
- Dictionary comprehension
- When to use a list or a dictionary
- Using
defaultdict()
to handle missing keys
Feel free to connect with me on LinkedIn or GitHub. Happy coding!