Python Cheat Sheet for Data Science: Intermediate

The tough thing about learning data science is remembering all the syntax. While at Dataquest we advocate getting used to consulting the Python documentation, sometimes it’s nice to have a handy reference, so we’ve put together this cheat sheet to help you out!

This cheat sheet is the companion to our Python Basics Data Science Cheat Sheet.

If you’re interested in learning Python, we have a free Python Programming: Beginner course which can start you on your data science journey.

Key Basics, Printing and Getting Help

This cheat sheet assumes you are familiar with the content of our Python Basics Cheat Sheet.

s A Python string variable
i A Python integer variable
f A Python float variable
l A Python list variable
d A Python dictionary variable


l.pop(3) Returns the fourth item from l and deletes it from the list
l.remove(x) Removes the first item in l that is equal to x
l.reverse() Reverses the order of the items in l
l[1::2] Returns every second item from l, starting from the second item (index 1)
l[-5:] Returns the last 5 items from l
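
To see these list operations side by side, here is a minimal sketch. The sample list is made up purely for illustration and is not part of the original cheat sheet.

```python
# Hypothetical sample list, chosen only for illustration
l = ["a", "b", "c", "d", "e", "f", "g"]

popped = l.pop(3)        # removes and returns the item at index 3 ("d")
l.remove("b")            # deletes the first item equal to "b"
l.reverse()              # reverses l in place; the method itself returns None

every_second = l[1::2]   # items at indexes 1, 3, 5, ...
last_five = l[-5:]       # the last five items of l

print(popped, l, every_second, last_five)
```

Note that pop, remove and reverse all change l itself, while slicing returns a new list and leaves l untouched.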


s.lower() Returns a lowercase version of s
s.title() Returns s with the first letter of every word capitalized
"23".zfill(4) Returns "0023" by left-filling the string with zeros to make its length 4
s.splitlines() Returns a list by splitting the string on any newline characters
Python strings share some common methods with lists:
s[:5] Returns the first 5 characters of s
"fri" + "end" Returns "friend"
"end" in s Returns True if the substring "end" is found in s

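The string methods above can be tried out with a short sketch like the one below. The sample string s is an assumption made up for this example.

```python
# Hypothetical sample string containing a newline
s = "My Best Friend\nlives nearby"

lower = s.lower()          # every character lowercased
title = s.title()          # first letter of every word capitalized
padded = "23".zfill(4)     # left-filled with zeros to length 4
lines = s.splitlines()     # one list item per line of text

# Operations that strings share with lists
first_five = s[:5]         # slicing works on characters
joined = "fri" + "end"     # concatenation
has_end = "end" in s       # membership test on a substring

print(lower, title, padded, lines, first_five, joined, has_end, sep="\n")
```

Strings are immutable, so each of these calls returns a new value and s itself never changes.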

Range objects are useful for creating sequences of integers for looping.

range(5) Returns a sequence from 0 to 4
range(2000,2018) Returns a sequence from 2000 to 2017
range(0,11,2) Returns a sequence from 0 to 10, with each item incrementing by 2
range(0,-10,-1) Returns a sequence from 0 to -9
list(range(5)) Returns a list from 0 to 4
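
A quick way to check what a range contains is to convert it to a list, as in this small sketch. The variable names are ours and only illustrative.

```python
first_five = list(range(5))            # 0 up to, but not including, 5
years = list(range(2000, 2018))        # 2000 through 2017
evens = list(range(0, 11, 2))          # step of 2
countdown = list(range(0, -10, -1))    # negative step counts down

# The most common use: driving a for loop
total = 0
for i in range(5):
    total += i                         # adds 0 + 1 + 2 + 3 + 4

print(first_five, evens, countdown, total)
```

In every case the stop value is excluded, which is why range(0,11,2) is needed to reach 10.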


max(d, key=d.get) Returns the key that corresponds to the largest value in d
min(d, key=d.get) Returns the key that corresponds to the smallest value in d
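
Passing key=d.get tells max() and min() to compare the dictionary's values rather than its keys. A minimal sketch, using a made-up dictionary of fruit counts:

```python
# Hypothetical dictionary, for illustration only
d = {"apples": 3, "pears": 7, "plums": 1}

most = max(d, key=d.get)     # key whose value is largest
least = min(d, key=d.get)    # key whose value is smallest
plain_max = max(d)           # without key=, the keys themselves are compared

print(most, least, plain_max)
```

Without the key argument, max(d) compares the keys alphabetically, which is rarely what you want here.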


my_set = set(l) Assigns a set object containing the unique values from l to my_set
len(my_set) Returns the number of objects in my_set (or, the number of unique values from l)
a in my_set Returns True if the value a exists in my_set
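
Here is a short sketch of how a set removes duplicates. The list values are invented for the example.

```python
# Hypothetical list with repeated values
l = [1, 2, 2, 3, 3, 3]

my_set = set(l)            # duplicates are dropped
n_unique = len(my_set)     # number of unique values in l
has_two = 2 in my_set      # membership test
has_five = 5 in my_set

print(my_set, n_unique, has_two, has_five)
```

Sets are unordered, so don't rely on the items coming back in the same order as in l.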

Regular expressions

import re Imports the Regular Expressions module
re.search("abc",s) Returns a match object if the regex "abc" is found in s, otherwise None
re.sub("abc","xyz",s) Returns a string where all instances matching regex "abc" are replaced by "xyz"
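
The sketch below shows searching and substituting with the re module. It uses re.search as the call that returns a match object or None, and the sample string is an assumption made up for the example.

```python
import re

s = "abc123abc"   # hypothetical sample string

match = re.search("abc", s)           # match object for the first occurrence
no_match = re.search("xyz", s)        # None, because "xyz" is not in s
replaced = re.sub("abc", "xyz", s)    # every match replaced

# Because a failed search gives None, test the result before using it
if match:
    print(match.group(), match.start())
print(no_match, replaced)
```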

List comprehension

A one-line expression of a for loop

[i ** 2 for i in range(10)] Returns a list of the squares of values from 0 to 9
[s.lower() for s in l_strings] Returns the list l_strings, with each item having had the .lower() method applied
[i for i in l_floats if i < 0.5] Returns the items from l_floats that are less than 0.5
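
To make the link with for loops concrete, this sketch builds the same list both ways. The contents of l_strings and l_floats are made up for illustration.

```python
# List comprehension version
squares = [i ** 2 for i in range(10)]

# Equivalent for loop
squares_loop = []
for i in range(10):
    squares_loop.append(i ** 2)

# Hypothetical sample data
l_strings = ["Alpha", "BETA", "gamma"]
l_floats = [0.1, 0.75, 0.3, 0.5]

lowered = [s.lower() for s in l_strings]      # apply a method to each item
small = [i for i in l_floats if i < 0.5]      # keep only items passing the test

print(squares == squares_loop, lowered, small)
```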

Functions for looping

for i, value in enumerate(l):
    print("The value of item {} is {}".format(i,value))
Iterates over the list l, printing the index location of each item and its value
for one, two in zip(l_one,l_two):
    print("one: {}, two: {}".format(one,two))
Iterates over two lists, l_one and l_two, and prints each pair of values
while x < 10:
    x += 1
Runs the code in the body of the loop until the value of x is no longer less than 10
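
The three loops above rely on l, l_one, l_two and x already existing. Here is a self-contained sketch with made-up values, collecting results in lists instead of printing so they are easy to inspect.

```python
# Hypothetical sample data
l_one = ["a", "b", "c"]
l_two = [1, 2, 3]

indexed = []
for i, value in enumerate(l_one):
    indexed.append((i, value))        # index and item together

paired = []
for one, two in zip(l_one, l_two):
    paired.append("one: {}, two: {}".format(one, two))

x = 7
steps = 0
while x < 10:
    x += 1
    steps += 1                        # count how many times the body ran

print(indexed, paired, x, steps)
```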


import datetime as dt Imports the datetime module
now = dt.datetime.now() Assigns a datetime object representing the current time to now
wks4 = dt.timedelta(weeks=4) Assigns a timedelta object representing a timespan of 4 weeks to wks4
now - wks4 Returns a datetime object representing the time 4 weeks prior to now
newyear_2020 = dt.datetime(year=2020, month=12, day=31) Assigns a datetime object representing December 31, 2020 to newyear_2020
newyear_2020.strftime("%A, %b %d, %Y") Returns "Thursday, Dec 31, 2020"
dt.datetime.strptime('Dec 31, 2020',"%b %d, %Y") Returns a datetime object representing December 31, 2020
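
Putting the date operations together, a minimal sketch might look like this. Note that with import datetime as dt, the timedelta class is reached as dt.timedelta, and the weekday and month names from strftime assume the default English locale.

```python
import datetime as dt

now = dt.datetime.now()              # current local date and time
wks4 = dt.timedelta(weeks=4)         # a span of 28 days
earlier = now - wks4                 # datetime minus timedelta gives a datetime

newyear_2020 = dt.datetime(year=2020, month=12, day=31)

# datetime -> string
formatted = newyear_2020.strftime("%A, %b %d, %Y")

# string -> datetime, using a format that matches the text exactly
parsed = dt.datetime.strptime("Dec 31, 2020", "%b %d, %Y")

print(earlier, formatted, parsed)
```

strftime and strptime are mirror images: the same format codes describe the string in both directions.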


import random Imports the random module
random.random() Returns a random float from 0.0 up to, but not including, 1.0
random.randint(0,10) Returns a random integer between 0 and 10, inclusive
random.choice(l) Returns a random item from the list l
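
Here is a small sketch of the random functions in use. The call to random.seed() is our addition, included only so that repeated runs give the same values, and the list l is made up for the example.

```python
import random

random.seed(0)   # assumption: seeding only to make the run repeatable

l = ["red", "green", "blue"]    # hypothetical sample list

r = random.random()             # float in the range 0.0 <= r < 1.0
n = random.randint(0, 10)       # integer from 0 to 10, both ends included
pick = random.choice(l)         # one item of l, chosen at random

print(r, n, pick)
```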


from collections import Counter Imports the Counter class
c = Counter(l) Assigns a Counter (dict-like) object with the counts of each unique item from l, to c
c.most_common(3) Returns a list of the 3 most common items from l, as (item, count) tuples
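
A short sketch of Counter in action, with a made-up list of letters:

```python
from collections import Counter

# Hypothetical sample list
l = ["a", "b", "a", "c", "a", "b"]

c = Counter(l)               # counts of each unique item
top3 = c.most_common(3)      # list of (item, count) tuples, most frequent first
count_a = c["a"]             # look up a count like a dictionary value
count_z = c["z"]             # items never seen count as 0 instead of raising KeyError

print(c, top3, count_a, count_z)
```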


Catch and deal with errors

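l_ints = [1, 2, 3, "", 5]
Assigns a list of integers with one missing value to l_ints
l_floats = []
for i in l_ints:
    try:
        l_floats.append(float(i))
    except ValueError:
        l_floats.append(i)  # assumption: keep the original value when conversion fails
Converts each value of l_ints to a float, catching the ValueError ("could not convert string to float") raised where a value is missing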
