Python Cheat Sheet for Data Science: Intermediate
The printable version of this cheat sheet
The tough thing about learning data is remembering all the syntax. While at Dataquest we advocate getting used to consulting the Python documentation, sometimes it's nice to have a handy reference, so we've put together this cheat sheet to help you out!
This cheat sheet is the companion to our Python Basics Data Science Cheat Sheet
If you'd like to learn Python, we have a Python Programming: Beginner course which can start you on your data science journey.
Download a Printable PDF of this Cheat Sheet
Key Basics, Printing and Getting Help
This cheat sheet assumes you are familiar with the content of our Python Basics Cheat Sheet.
s
| A Python string variable
i
| A Python integer variable
f
| A Python float variable
l
| A Python list variable
d
| A Python dictionary variable
Lists
l.pop(3)
| Returns the fourth item from l
and deletes it from the list
l.remove(x)
| Removes the first item in l
that is equal to x
l.reverse()
| Reverses the order of the items in l
l[1::2]
| Returns every second item from l
, commencing from the 1st item
l[-5:]
| Returns the last 5
items from l
Strings
s.lower()
| Returns a lowercase version of s
s.title()
| Returns s
with the first letter of every word capitalized
"23".zfill(4)
| Returns "0023"
by left-filling the string with 0
's to make it's length 4
.
s.splitlines()
| Returns a list by splitting the string on any newline characters.
Python strings share some common methods with lists|
s[:5]
| Returns the first 5
characters of s
"fri" + "end"
| Returns "friend"
"end" in s
| Returns True
if the substring "end"
is found in s
Range
Range objects are useful for creating sequences of integers for looping.
range(5)
| Returns a sequence from 0
to 4
range(2000,2018)
| Returns a sequence from 2000
to 2017
range(0,11,2)
| Returns a sequence from 0
to 10
, with each item incrementing by 2
range(0,-10,-1)
| Returns a sequence from 0
to -9
list(range(5))
| Returns a list from 0
to 4
Dictionaries
max(d, key=d.get)
| Return the key that corresponds to the largest value in d
min(d, key=d.get)
| Return the key that corresponds to the smallest value in d
Sets
my_set = set(l)
| Returns a set object containing the unique values from l
len(my_set)
| Returns the number of objects in my_set
(or, the number of unique values from l
)
a in my_set
| Returns True
if the value a
exists in my_set
Regular expressions
import re
| Import the Regular Expressions module
re.search("abc",s)
| Returns a match
object if the regex "abc"
is found in s
, otherwise None
re.sub("abc","xyz",s)
| Returns a string where all instances matching regex "abc"
are replaced by "xyz"
List comprehension
A one-line expression of a for loop
[i ** 2 for i in range(10)]
| Returns a list of the squares of values from 0
to 9
[s.lower() for s in l_strings]
| Returns the list l_strings
, with each item having had the .lower()
method applied
[i for i in l_floats if i < 0.5]
| Returns the items from l_floats
that are less than 0.5
Functions for looping
for i, value in enumerate(l):
print("The value of item {} is {}".format(i,value))
Iterates over the list l, printing the index location of each item and its value
for one, two in zip(l_one,l_two):
print("one: {}, two: {}".format(one,two))
Iterates over two lists, l_one
and l_two
and print each value
while x < 10:
x += 1
Runs the code in the body of the loop until the value of x
is no longer less than 10
Datetime
import datetime as dt
| Imports the datetime
module
now = dt.datetime.now()
| Assigns datetime
object representing the current time to now
wks4 = dt.datetime.timedelta(weeks=4)
| Assigns a timedelta
object representing a timespan of 4 weeks to wks4
now - wks4
| Returns a datetime
object representing the time 4 weeks prior to now
newyear_2020 = dt.datetime(year=2020, month=12, day=31)
| Assigns a datetime
object representing December 25, 2020 to newyear_2020
newyear_2020.strftime("%A, %b %d, %Y")
| Returns "Thursday, Dec 31, 2020"
dt.datetime.strptime('Dec 31, 2020',"%b %d, %Y")
| Returns a datetime
object representing December 31, 2020
Random
import random
| Imports the random
module
random.random()
| Returns a random float between 0.0
and 1.0
random.randint(0,10)
| Returns a random integer between 0
and 10
random.choice(l)
| Returns a random item from the list l
Counter
from collections import Counter
| Imports the Counter
class
c = Counter(l)
| Assigns a Counter
(dict-like) object with the counts of each unique item from l
, to c
c.most_common(3)
| Returns the 3 most common items from l
Try/Except
Catch and deal with errors
l_ints = [1, 2, 3, "", 5]
Assigns a list of integers with one missing value to l_ints
l_floats = []
for i in l_ints:
try:
l_floats.append(float(i))
except:
l_floats.append(i)
Converts each value of l_ints
to a float, catching and handling ValueError: could not convert string to float
: where values are missing.
Download a printable version of this cheat sheet
If you'd like to download a printable version of this cheat sheet you can do so below.