February 9, 2022

Tutorial: An Introduction to Python Requests Library

The Requests library simplifies making HTTP requests to web servers and working with their responses.
In this tutorial, we will learn how to install and use the library and highlight its main features.

What is Python Requests Library?

The Requests library provides a simple API for interacting with HTTP operations such as GET, POST, etc.
Its methods execute HTTP operations against a web server identified by its URL.
It also supports sending extra information to a web server through parameters and headers, decoding server responses, detecting errors, and handling redirects.
In addition to simplifying how we work with HTTP operations, the Requests library provides some advanced features, such as handling HTTP exceptions and authentication, which we will discuss in this tutorial.

What is HTTP?

The Hypertext Transfer Protocol (HTTP) is a request/response protocol based on the client-server architecture that relies on TCP/IP connections for exchanging request and response messages.

HTTP clients such as web browsers or mobile applications send requests to an HTTP server, and the server responds with messages containing a status line, headers, and a body.

Installing the Requests Library

The first step in getting started with the Requests library is installing it. You can use either pip or conda commands to install the library. To do so, let’s first create a new Python virtual environment, then install the library in it.

~ % mkdir req-prj
~ % python3 -m venv req-prj/venv
~ % source req-prj/venv/bin/activate
(venv) ~ % python3 -m pip install requests

Type the commands above in a terminal window to create the environment on macOS. The first three commands create the req-prj folder, create a virtual environment named venv inside it, and activate that environment. The last command installs the latest version of the requests package in the environment.
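
To confirm that the package landed in the active environment, a quick check such as the following can help. It is only a sketch, and the version printed depends on when you install.

```python
import requests

# The installed version of the package, for example '2.27.1' for the
# release that produced the outputs shown later in this tutorial
print(requests.__version__)
```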


NOTE

If you’re not familiar with Python virtual environments, there’s a great tutorial on the Dataquest blog: A Complete Guide to Python Virtual Environments (2022).


Now you can import the library and write your first snippet to try it out.

import requests
r = requests.get('https://www.dataquest.io/')
print(r)

Running the code above outputs <Response [200]>, meaning the request is successful and the URL is reachable.

Let’s check the datatype of the r variable in the code above:

print(type(r))

It returns <class 'requests.models.Response'>, which means it’s an instance of the Response class. A Response object contains the result of an HTTP request.


NOTE

We use the httpbin.org (http://httpbin.org/) website in this tutorial. The httpbin tool is a free and simple HTTP request-and-response service that provides a set of URL endpoints. We use these endpoints to test various ways of working with HTTP operations. We use httpbin because it helps us focus on learning the Python Requests library without setting up a real web server or using an online web service.


Using GET Request

We use the get method to request data from a specific web server. Let’s try it out:

url = 'http://httpbin.org/json'
r = requests.get(url)
print('Response Code:', r.status_code)
print('Response Headers:\n', r.headers)
print('Response Content:\n',r.text)

Running the code above outputs a status code of 200, which indicates that the URL is reachable. Then it prints the page’s header data followed by the page’s content.

The headers property returns a case-insensitive dictionary made specifically for HTTP headers, so you can access each item simply using its key:

print(r.headers['Content-Type'])

Go ahead and run the statement. It returns the page’s content type, application/json.
In the last line of the GET example above, we could use the content property, which returns the page content as a series of bytes, but we prefer the text property, which returns the page content as decoded Unicode text.
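
As a rough illustration of the two points above, the sketch below looks up a header using different capitalizations and collects the different views of a response body. The headers dictionary is built by hand so the lookup runs without a network connection, and body_views is a hypothetical helper name, not part of the Requests API. It also calls the json() method, which parses a JSON body into Python objects and is not used elsewhere in this tutorial.

```python
import requests
from requests.structures import CaseInsensitiveDict

# r.headers is a CaseInsensitiveDict; this one is built by hand for illustration
headers = CaseInsensitiveDict({'Content-Type': 'application/json'})
print(headers['content-type'])  # the lookup ignores capitalization

def body_views(url):
    """Return a response body as raw bytes, decoded text, and parsed JSON."""
    r = requests.get(url)
    return r.content, r.text, r.json()

# With network access (not run here):
# raw, text, data = body_views('http://httpbin.org/json')
```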

Using GET Parameters

We use GET parameters to pass information in a key-value pair format to a web server through a URL. The get method allows us to pass a dictionary of key-value pairs using the params argument. Let’s try it.

url = 'http://httpbin.org/get'
payload = {
    'website':'dataquest.io', 
    'courses':['Python','SQL']
    }
r = requests.get(url, params=payload)
print('Response Content:\n',r.text)

Run the code above. You’ll see the following output:

Response Content:
 {
  "args": {
    "courses": [
      "Python", 
      "SQL"
    ], 
    "website": "dataquest.io"
  }, 
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip, deflate", 
    "Host": "httpbin.org", 
    "User-Agent": "python-requests/2.27.1", 
    "X-Amzn-Trace-Id": "Root=1-61e7e066-5d0cacfb49c3c1c3465bbfb2"
  }, 
  "origin": "121.122.65.155", 
  "url": "http://httpbin.org/get?website=dataquest.io&courses=Python&courses=SQL"
}

The response content is in JSON format, and the key-value pairs that we passed through the params argument appear in the args section of the response. Also, the url section contains the encoded URL along with the parameters passed to the server.
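
To see how the params dictionary becomes the query string without sending anything over the network, one option is to build the request and only prepare it. This is a sketch that relies on the Request class and its prepare() method, which this tutorial does not otherwise cover.

```python
import requests

payload = {
    'website': 'dataquest.io',
    'courses': ['Python', 'SQL']
}

# Build the request object and prepare it; nothing is sent to the server
prepared = requests.Request('GET', 'http://httpbin.org/get', params=payload).prepare()

# Each list item becomes its own courses=... pair in the query string
print(prepared.url)
```

The printed URL should match the url field in the httpbin response shown above.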

Using POST Request

We use the POST request to submit data collected from a web form to a web server. To do this in the Requests library, you first create a dictionary of the data and pass it to the data argument of the post method. Let’s look at an example using the post method:

url = 'http://httpbin.org/post'
payload = {
    'website':'dataquest.io', 
    'courses':['Python','SQL']
    }
r = requests.post(url, data=payload)
print('Response Content:\n',r.text)

Running the code above returns the following response:

Response Content:
 {
  "args": {}, 
  "data": "", 
  "files": {}, 
  "form": {
    "courses": [
      "Python", 
      "SQL"
    ], 
    "website": "dataquest.io"
  }, 
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip, deflate", 
    "Content-Length": "47", 
    "Content-Type": "application/x-www-form-urlencoded", 
    "Host": "httpbin.org", 
    "User-Agent": "python-requests/2.27.1", 
    "X-Amzn-Trace-Id": "Root=1-61e7ec9f-6333082d7f0b73d317acc1f6"
  }, 
  "json": null, 
  "origin": "121.122.65.155", 
  "url": "http://httpbin.org/post"
}

This time, the data submitted through the post method appears in the form section of the response rather than in args, because we sent it as form-encoded data in the request body instead of as part of the URL parameters.
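
The form encoding can also be inspected locally. The sketch below prepares the same POST request without sending it and prints the Content-Type header and body that Requests would transmit. The second half uses the json argument, which this tutorial does not cover, only as a contrast; httpbin reports such a body under its json key instead of form.

```python
import json
import requests

url = 'http://httpbin.org/post'
payload = {'website': 'dataquest.io', 'courses': ['Python', 'SQL']}

# data=... form-encodes the dictionary into the request body
form_request = requests.Request('POST', url, data=payload).prepare()
print(form_request.headers['Content-Type'])
print(form_request.body)

# json=... serializes the same dictionary as a JSON document instead
json_request = requests.Request('POST', url, json=payload).prepare()
print(json_request.headers['Content-Type'])
print(json.loads(json_request.body))
```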

Handling Exceptions

Some exceptions may occur while communicating with a remote server. For instance, the server may be unreachable, the requested URL may not exist on the server, or the server may not respond in a reasonable timeframe. In this section, we’ll learn how to detect and handle HTTP errors using the HTTPError class of the Requests library.

import requests
from requests import HTTPError

url = "http://httpbin.org/status/404"
try:
    r = requests.get(url)
    r.raise_for_status()
    print('Response Code:', r.status_code)
except HTTPError as ex:
    print(ex)

In the code above, we import the HTTPError class from the Requests library to catch HTTP exceptions. Then we request an endpoint on httpbin.org that returns a 404 status code. The Requests library doesn’t raise an exception automatically when a response carries an error status code, so we call the raise_for_status() method, which raises an HTTPError whenever the response contains an error status code (a 4XX client error or 5XX server error). Finally, the exception handler block catches the error and prints it out as follows:

404 Client Error: NOT FOUND for url: http://httpbin.org/status/404


NOTE

If you try to access the http://httpbin.org/status/200 endpoint, the code above outputs Response Code: 200 because the status code is not in the range of error status codes. The raise_for_status() method returns None, which doesn’t trigger the exception handler.


In addition to the exception that we just discussed, let’s see how we can handle server timeouts. This is crucial because we need to ensure our application doesn’t hang indefinitely. In the Requests library, if a request times out, a Timeout exception is raised. To set a request’s timeout to a number of seconds, we can use the timeout argument:

import requests
from requests import Timeout
url = "http://httpbin.org/delay/10"

try:
    r = requests.get(url, timeout=3)
    print('Response Code:', r.status_code)
except Timeout as ex:
    print(ex)

Let’s first discuss the code. We import the Timeout class to handle timeout exceptions and provide a URL that refers to an endpoint with a 10-second delay to test our code. Then, we set the timeout argument to 3 seconds, which means the get request will raise an exception if the server doesn’t respond within 3 seconds. Finally, the exception handler catches the timeout error, if any, and prints it out. Running the code above outputs the following error message because the server won’t respond in the given time.

HTTPConnectionPool(host='httpbin.org', port=80): Read timed out. (read timeout=3)
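
The two handlers above can be folded into one helper that also covers the unreachable-server case mentioned at the start of this section. This is a minimal sketch rather than a recommended production pattern: safe_get and its return convention are hypothetical names chosen for illustration.

```python
import requests

def safe_get(url, timeout=3):
    """Return (response, None) on success or (None, message) on failure."""
    try:
        r = requests.get(url, timeout=timeout)
        r.raise_for_status()
        return r, None
    except requests.HTTPError as ex:
        # ex.response is the Response that carried the 4XX/5XX status code
        return None, f'HTTP error {ex.response.status_code}'
    except requests.Timeout:
        # Checked before ConnectionError because a connect timeout is both
        return None, f'no response within {timeout} seconds'
    except requests.ConnectionError:
        return None, 'server unreachable'

# With network access (not run here):
# print(safe_get('http://httpbin.org/status/404'))
# print(safe_get('http://httpbin.org/delay/10'))
```

Provided httpbin.org is reachable, the first commented call should report the 404 error and the second the 3-second timeout.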

Authentication

The Requests library supports various web authentication schemes, such as Basic and Digest, as well as the two versions of OAuth through the requests-oauthlib extension. We can use these authentication methods when we want to work with data sources that require us to be logged in.

We will implement basic authentication using the httpbin service in the following example. The endpoint for Basic Auth is /basic-auth/{user}/{password}. For example, if you use the following endpoint . . .

http://httpbin.org/basic-auth/data/quest

. . . you can authenticate using the username 'data' and the password 'quest' by assigning them as a tuple to the auth argument. Once you authenticate successfully, the server responds with the JSON data { "authenticated": true, "user": "data"}.

import requests
r = requests.get('http://httpbin.org/basic-auth/data/quest', auth=('data', 'quest'))
print('Response Code:', r.status_code)
print('Response Content:\n', r.text)

Run the code above. It outputs as follows:

Response Code: 200
Response Content:
 {
  "authenticated": true, 
  "user": "data"
}

If either the username or the password is incorrect, it outputs as follows:

Response Code: 401
Response Content:
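
For the curious, the header that basic authentication produces can be inspected locally by preparing the request without sending it. The sketch below uses only the username and password from the example; the Request class, its prepare() method, and HTTPBasicAuth are parts of Requests that this tutorial does not otherwise use.

```python
import base64
import requests
from requests.auth import HTTPBasicAuth

url = 'http://httpbin.org/basic-auth/data/quest'

# A (username, password) tuple is shorthand for an HTTPBasicAuth object
with_tuple = requests.Request('GET', url, auth=('data', 'quest')).prepare()
with_class = requests.Request('GET', url, auth=HTTPBasicAuth('data', 'quest')).prepare()

# Basic auth sends "username:password", Base64-encoded, in the Authorization header
expected = 'Basic ' + base64.b64encode(b'data:quest').decode('ascii')
print(with_tuple.headers['Authorization'])
print(with_tuple.headers['Authorization'] == with_class.headers['Authorization'] == expected)
```

Because Base64 is an encoding rather than encryption, basic authentication is normally paired with HTTPS.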

Conclusion

We learned about one of the most powerful and most downloaded Python libraries. We discussed the basics of HTTP and different aspects of the Python Requests library. We also worked with GET and POST requests and learned how to handle request exceptions. Finally, we authenticated against a web service using the basic authentication method.

Mehdi Lotfinejad

About the author

Mehdi is a Senior Data Engineer and Team Lead at ADA. He is a professional trainer who loves writing data analytics tutorials.
