# Python Cheat Sheet for Data Science: Intermediate


The tough thing about learning data science is remembering all the syntax. While at Dataquest we advocate getting used to consulting the Python documentation, sometimes it's nice to have a handy reference, so we've put together this cheat sheet to help you out!

This cheat sheet is the companion to our Python Basics Data Science Cheat Sheet.

If you'd like to learn Python, we have a Python Programming: Beginner course which can start you on your data science journey.


## Key Basics, Printing and Getting Help

This cheat sheet assumes you are familiar with the content of our Python Basics Cheat Sheet.

`s`

| A Python string variable

`i`

| A Python integer variable

`f`

| A Python float variable

`l`

| A Python list variable

`d`

| A Python dictionary variable

## Lists

`l.pop(3)`

| Returns the fourth item from `l` and deletes it from the list

`l.remove(x)`

| Removes the first item in `l` that is equal to `x`

`l.reverse()`

| Reverses the order of the items in `l`

`l[1::2]`

| Returns every second item from `l`, commencing from the 2nd item (index `1`)

`l[-5:]`

| Returns the last `5` items from `l`
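
As a quick check of the list operations above, here is a minimal sketch run on a hypothetical list of ten integers (the sample values are an assumption made purely for illustration):

```python
# Hypothetical sample list, chosen only for illustration
l = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

popped = l.pop(3)        # removes and returns the item at index 3 (the fourth item)
l.remove(15)             # removes the first item equal to 15
l.reverse()              # reverses l in place; the method itself returns None

every_second = l[1::2]   # every second item, starting at index 1
last_five = l[-5:]       # the last five items

print(popped)            # 13
print(l)                 # [19, 18, 17, 16, 14, 12, 11, 10]
print(every_second)      # [18, 16, 12, 10]
print(last_five)         # [16, 14, 12, 11, 10]
```

Note that `pop`, `remove` and `reverse` all modify the list in place, while slicing returns a new list.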

## Strings

`s.lower()`

| Returns a lowercase version of `s`

`s.title()`

| Returns `s` with the first letter of every word capitalized

`"23".zfill(4)`

| Returns `"0023"` by left-filling the string with `0`s to make its length `4`

`s.splitlines()`

| Returns a list by splitting the string on any newline characters.

*Python strings share some common methods with lists*

`s[:5]`

| Returns the first `5` characters of `s`

`"fri" + "end"`

| Returns `"friend"`

`"end" in s`

| Returns `True` if the substring `"end"` is found in `s`
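
The string operations above can be tried together in one short sketch. The sample string `s` below is hypothetical and was picked only so that each result is easy to see:

```python
# Hypothetical two-line sample string, chosen only for illustration
s = "my best friend\nlives next door"

lowered = "Data Science".lower()   # 'data science'
titled = "data science".title()    # 'Data Science'
padded = "23".zfill(4)             # '0023'
lines = s.splitlines()             # ['my best friend', 'lives next door']
first_five = s[:5]                 # 'my be'
joined = "fri" + "end"             # 'friend'
has_end = "end" in s               # True, because 'friend' contains 'end'

print(lines)
print(first_five, joined, has_end)
```

None of these calls change `s`; strings are immutable, so each method returns a new string or list.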

## Range

*Range objects are useful for creating sequences of integers for looping.*

`range(5)`

| Returns a sequence from `0` to `4`

`range(2000,2018)`

| Returns a sequence from `2000` to `2017`

`range(0,11,2)`

| Returns a sequence from `0` to `10`, with each item incrementing by `2`

`range(0,-10,-1)`

| Returns a sequence from `0` to `-9`, with each item decrementing by `1`

`list(range(5))`

| Returns a list from `0` to `4`
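
Because a range object only produces its values when iterated, wrapping it in `list()` is a handy way to see what it contains. A minimal sketch of the ranges above (the variable names are just for illustration):

```python
# The stop value is always excluded from the sequence
years = list(range(2000, 2018))        # 2000, 2001, ..., 2017
evens = list(range(0, 11, 2))          # step of 2
countdown = list(range(0, -10, -1))    # negative step counts downwards

print(years[0], years[-1], len(years)) # 2000 2017 18
print(evens)                           # [0, 2, 4, 6, 8, 10]
print(countdown[-1])                   # -9
```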

## Dictionaries

`max(d, key=d.get)`

| Returns the key that corresponds to the largest value in `d`

`min(d, key=d.get)`

| Returns the key that corresponds to the smallest value in `d`
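
These two calls work because iterating over a dictionary yields its keys, and `key=d.get` tells `max` and `min` to compare the keys by their values. A small sketch with a hypothetical dictionary of fruit counts:

```python
# Hypothetical dictionary, chosen only for illustration
d = {"apples": 3, "pears": 7, "plums": 1}

largest_key = max(d, key=d.get)    # key whose value is largest
smallest_key = min(d, key=d.get)   # key whose value is smallest

print(largest_key)    # pears
print(smallest_key)   # plums
```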

## Sets

`my_set = set(l)`

| Assigns a set object containing the *unique* values from `l` to `my_set`

`len(my_set)`

| Returns the number of objects in `my_set` (or, the number of *unique* values from `l`)

`a in my_set`

| Returns `True` if the value `a` exists in `my_set`
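
A minimal sketch of the set operations above, using a hypothetical list with repeated values:

```python
# Hypothetical list with duplicates, chosen only for illustration
l = ["a", "b", "a", "c", "b", "a"]

my_set = set(l)           # duplicates are dropped
n_unique = len(my_set)    # number of unique values in l
has_a = "a" in my_set
has_z = "z" in my_set

print(n_unique)       # 3
print(has_a, has_z)   # True False
```

Sets are unordered, so the order in which the items print is not guaranteed to match their order in `l`.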

## Regular expressions

`import re`

| Imports the regular expressions module

`re.search("abc",s)`

| Returns a `match` object if the regex `"abc"` is found in `s`, otherwise `None`

`re.sub("abc","xyz",s)`

| Returns a string where all instances matching regex `"abc"` are replaced by `"xyz"`
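
To see both functions in action, here is a short sketch on a hypothetical string. Since `re.search` returns `None` when nothing matches, it is common to test the result before using it:

```python
import re

# Hypothetical sample string, chosen only for illustration
s = "abc and then abc again"

match = re.search("abc", s)      # match object for the first occurrence
no_match = re.search("xyz", s)   # None, because "xyz" is not in s
replaced = re.sub("abc", "xyz", s)

if match is not None:
    print(match.start())   # 0
print(no_match)            # None
print(replaced)            # xyz and then xyz again
```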

## List comprehension

*A one-line expression of a for loop*

`[i ** 2 for i in range(10)]`

| Returns a list of the squares of values from `0` to `9`

`[s.lower() for s in l_strings]`

| Returns a new list with the `.lower()` method applied to each item of `l_strings`

`[i for i in l_floats if i < 0.5]`

| Returns the items from `l_floats` that are less than `0.5`
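
To make the "one-line for loop" idea concrete, the sketch below writes the filtering comprehension next to an equivalent loop. The sample lists are hypothetical:

```python
# Hypothetical sample data, chosen only for illustration
l_floats = [0.1, 0.75, 0.4, 0.9]
l_strings = ["Alpha", "BETA"]

# Comprehension form
small = [i for i in l_floats if i < 0.5]

# Equivalent for loop
small_loop = []
for i in l_floats:
    if i < 0.5:
        small_loop.append(i)

squares = [i ** 2 for i in range(10)]
lowered = [s.lower() for s in l_strings]

print(small == small_loop)   # True
print(squares)               # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
print(lowered)               # ['alpha', 'beta']
```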

## Functions for looping

```
for i, value in enumerate(l):
    print("The value of item {} is {}".format(i, value))
```

Iterates over the list `l`, printing the index location of each item and its value

```
for one, two in zip(l_one, l_two):
    print("one: {}, two: {}".format(one, two))
```

Iterates over two lists, `l_one` and `l_two`, in parallel and prints each pair of values

```
while x < 10:
    x += 1
```

Runs the code in the body of the loop until the value of `x` is no longer less than `10`
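
The sketch below collects the results of `enumerate` and `zip` into lists so they are easy to inspect, and counts how many times a `while` loop runs. The sample lists and the starting value of `x` are hypothetical:

```python
# Hypothetical sample data, chosen only for illustration
l_one = ["a", "b", "c"]
l_two = [1, 2, 3, 4]

indexed = list(enumerate(l_one))   # (index, value) pairs
pairs = list(zip(l_one, l_two))    # zip stops at the end of the shorter list

x = 7
steps = 0
while x < 10:
    x += 1
    steps += 1   # count how many times the body runs

print(indexed)    # [(0, 'a'), (1, 'b'), (2, 'c')]
print(pairs)      # [('a', 1), ('b', 2), ('c', 3)]
print(x, steps)   # 10 3
```

Note that the extra item `4` in `l_two` is silently dropped, because `zip` ends as soon as its shortest input is exhausted.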

## Datetime

`import datetime as dt`

| Imports the `datetime` module

`now = dt.datetime.now()`

| Assigns a `datetime` object representing the current time to `now`

`wks4 = dt.timedelta(weeks=4)`

| Assigns a `timedelta` object representing a timespan of 4 weeks to `wks4`

`now - wks4`

| Returns a `datetime` object representing the time 4 weeks prior to `now`

`newyear_2020 = dt.datetime(year=2020, month=12, day=31)`

| Assigns a `datetime` object representing December 31, 2020 to `newyear_2020`

`newyear_2020.strftime("%A, %b %d, %Y")`

| Returns `"Thursday, Dec 31, 2020"`

`dt.datetime.strptime('Dec 31, 2020',"%b %d, %Y")`

| Returns a `datetime` object representing December 31, 2020
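
The sketch below strings these calls together. It subtracts the `timedelta` from a fixed date rather than from the current time, purely so the output is reproducible; that substitution is an assumption made for illustration. The weekday and month names assume Python's default (English) locale.

```python
import datetime as dt

newyear_2020 = dt.datetime(year=2020, month=12, day=31)
wks4 = dt.timedelta(weeks=4)      # timedelta lives on the module, not on dt.datetime

earlier = newyear_2020 - wks4     # 28 days before December 31, 2020
formatted = newyear_2020.strftime("%A, %b %d, %Y")
parsed = dt.datetime.strptime("Dec 31, 2020", "%b %d, %Y")

print(earlier.date())             # 2020-12-03
print(formatted)                  # Thursday, Dec 31, 2020
print(parsed == newyear_2020)     # True
```

`strftime` turns a `datetime` into a string and `strptime` does the reverse, with both using the same format codes.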

## Random

`import random`

| Imports the `random` module

`random.random()`

| Returns a random float from `0.0` (inclusive) up to `1.0` (exclusive)

`random.randint(0,10)`

| Returns a random integer between `0` and `10` (both inclusive)

`random.choice(l)`

| Returns a random item from the list `l`
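
A short sketch of the three calls above. The call to `random.seed` is an addition not covered in the table; it makes the sequence of random values repeatable from run to run, which can help when debugging. The sample list is hypothetical.

```python
import random

random.seed(0)   # optional: fixes the sequence so results are repeatable

# Hypothetical sample list, chosen only for illustration
l = ["red", "green", "blue"]

f = random.random()         # float with 0.0 <= f < 1.0
i = random.randint(0, 10)   # integer with 0 <= i <= 10 (both ends included)
pick = random.choice(l)     # one item taken from l

print(f, i, pick)
```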

## Counter

`from collections import Counter`

| Imports the `Counter` class

`c = Counter(l)`

| Assigns a `Counter` (dict-like) object with the counts of each unique item from `l` to `c`

`c.most_common(3)`

| Returns the 3 most common items from `l` as a list of `(item, count)` tuples
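
A minimal sketch of `Counter` on a hypothetical list of letters. It asks for the top two items rather than three, so that the result does not depend on how ties are ordered:

```python
from collections import Counter

# Hypothetical sample list, chosen only for illustration
l = ["a", "b", "a", "c", "b", "a", "d"]

c = Counter(l)
top_two = c.most_common(2)   # list of (item, count) tuples, most frequent first

print(c["a"])     # 3
print(c["z"])     # 0, missing items count as zero instead of raising KeyError
print(top_two)    # [('a', 3), ('b', 2)]
```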

## Try/Except

*Catch and deal with errors*

`l_ints = [1, 2, 3, "", 5]`

Assigns a list of integers with one missing value to `l_ints`

```
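l_floats = []
for i in l_ints:
    try:
        # float("") raises ValueError for the missing value
        l_floats.append(float(i))
    except ValueError:
        # keep the original value when it cannot be converted
        l_floats.append(i)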
```

Converts each value of `l_ints` to a float, catching and handling `ValueError: could not convert string to float` where values are missing. Values that cannot be converted are appended unchanged.
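
One possible variation, shown here only as a sketch, is to store `float("nan")` in place of each missing value so that every item in the result is a float. Using NaN as the placeholder is an assumption made for this example, not something the code above does:

```python
import math

l_ints = [1, 2, 3, "", 5]

l_floats = []
for i in l_ints:
    try:
        l_floats.append(float(i))
    except ValueError:
        # Assumed placeholder: use NaN where the value could not be converted
        l_floats.append(float("nan"))

print(l_floats)                   # [1.0, 2.0, 3.0, nan, 5.0]
print(math.isnan(l_floats[3]))    # True
```

Catching `ValueError` specifically, rather than every exception, means unrelated errors will still surface instead of being silently swallowed.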

## Download a printable version of this cheat sheet

If you'd like to download a printable version of this cheat sheet you can do so below.