March 19, 2024

Python API Tutorial: Getting Started with APIs

In this Python API tutorial, we'll explore how to retrieve data for AI and data science projects using APIs (Application Programming Interfaces). APIs play an increasingly crucial role in the age of artificial intelligence and data science by providing access to vast amounts of data. This data can be used to train machine learning models, power AI applications, and enable comprehensive data analysis. Furthermore, millions of APIs are available online, with websites like Reddit, X (formerly Twitter), and Facebook offering access to certain data through their APIs.

How APIs Benefit AI and Data Science Projects

To use an API, you make a request to a remote web server and retrieve the data you need. APIs offer several key benefits for AI and data science projects:

  • Real-time data access: APIs allow you to retrieve up-to-date data on demand, which is crucial for AI models and data science projects that require real-time data to make accurate predictions or decisions.
  • Large datasets: Training AI models often requires large amounts of data. APIs provide a way to access and integrate data from multiple sources without the need for local storage and management.
  • Pre-processed data: Some APIs offer pre-processed or enriched data, such as sentiment analysis or entity recognition, saving significant time and resources in AI projects.

When to Use APIs Instead of Static Datasets

APIs are particularly useful in the following scenarios:

  • Rapidly changing data: For data that changes quickly, like stock prices, using an API is more efficient than repeatedly downloading a static dataset.
  • Specific subsets of data: If you only need a small piece of a larger dataset, such as your own comments on Reddit, an API allows you to retrieve just the relevant data.
  • Complex computations: APIs like Spotify's can provide information like music genres, leveraging their extensive data and computational resources.

Mastering API Integration for AI and Data Science

In this tutorial, we'll query a simple API to retrieve data about the International Space Station (ISS). The principles and techniques covered provide a foundation for working with APIs in any context, including more complex APIs for machine learning, natural language processing, and computer vision.

As you progress in your AI or data science journey, mastering API integration will allow you to:

  • Access large datasets to train and improve AI models
  • Incorporate AI-powered services into your projects
  • Retrieve real-time data streams for AI applications

About this Python API Tutorial

This tutorial is based on part of our interactive APIs and web scraping course in Python, which you can start for free. The course assumes you have some knowledge of working with data in Python. If you don't, and you're interested in data science applications, consider starting with our free introduction to Python programming course. If you want to work with AI, check out our Generative AI Fundamentals in Python skill path.

Throughout this tutorial, we'll demonstrate how to work with APIs using Python and the requests library. You'll learn how to make API requests, handle response data, and integrate API data into your AI and data science workflows. For more advanced API concepts like authentication, pagination, and rate limiting, check out our intermediate Python API tutorial.

What is an API?

An API, or Application Programming Interface, is an interface that lets your code retrieve data from, and send data to, another piece of software. The web APIs in this tutorial are hosted on remote servers that you communicate with over HTTP. APIs are essential tools in artificial intelligence (AI) and data science because they provide access to vast amounts of data and powerful computing capabilities.

More specifically, APIs play a crucial role in AI and data science projects by enabling:

  • Access to AI models: APIs allow developers to integrate pre-trained AI models into their applications, such as natural language processing (NLP) models from OpenAI's GPT-4 API or computer vision models from Google Cloud Vision API.
  • Data acquisition: APIs provide access to large datasets required for training machine learning models. This includes public datasets like those offered by Kaggle or data from social media platforms like X (formerly Twitter) and Facebook.
  • AI-powered services: Many companies offer APIs that allow developers to integrate AI capabilities into their applications without building the models themselves. Examples include sentiment analysis, entity recognition, and language translation.

Some popular AI APIs include:

  • OpenAI API: Offers access to powerful language models like GPT-4 for natural language tasks such as text generation, completion, and summarization.
  • Google Cloud AI APIs: A suite of APIs for integrating vision, language, and conversation capabilities into applications.
  • IBM Watson API: Provides a range of AI services, including NLP, sentiment analysis, and computer vision.

When we want to receive data from an API, we need to make a request. Developers use requests all over the web. For instance, when you visited this blog post, your web browser made a request to the Dataquest web server, which responded with the content of this web page.

API requests work in exactly the same way: you make a request to an API server for data, and it responds to your request.

Making API Requests in Python

To work with APIs in Python, we need tools that will make those requests. The most common library for making requests and working with APIs in Python is the requests library. Since the requests library isn't part of the Python standard library, you'll need to install it to get started.

As mentioned earlier, APIs are crucial in the world of artificial intelligence (AI) and data science. They provide access to vast amounts of data, pre-trained AI models, and AI-powered services that can significantly enhance AI and data science projects. Some key benefits of using APIs in AI and data science include:

  • Access to large datasets for training machine learning models
  • Integration of powerful AI models, such as natural language processing or computer vision, into applications
  • Real-time data streams for AI applications that require up-to-date information

For example, you could use the OpenAI API to access state-of-the-art language models like GPT-4 for natural language processing tasks, or integrate the Google Cloud Vision API to add computer vision capabilities to your project.

If you use pip to manage your Python packages, you can install the requests library using the following command:

pip install requests

If you use conda instead, the command you'll need is:

conda install requests

Once you've installed the library, you'll need to import it. Let's start with that important step:

import requests

Now that we've installed and imported the requests library, let's use it in our example.

Making Our First API Request

In this post, we'll learn the basics of making API requests in Python. We'll cover how to make a simple GET request, and how to interpret the status codes that are returned.

APIs use many different request types. GET, the most common type, retrieves data. We'll focus on GET requests since we're just working on retrieving data for now.

When we make an API request, the response includes a status code that tells us whether the request was successful. Status codes are important for immediately identifying if something went wrong. To make a GET request, we use the requests.get() function, passing in the URL we want to request. Let's start by requesting an API endpoint that doesn't exist, so we can see what an error status code looks like:

response = requests.get("http://api.open-notify.org/this-api-doesnt-exist") 
print(response.status_code)
404

The 404 status code is probably familiar - it's what a server returns when it can't find the requested file. Here, we asked for this-api-doesnt-exist which (unsurprisingly) didn't exist!

Understanding Common API Status Codes

Every request to a web server returns a status code indicating what happened with the request. Here are some common codes relevant to GET requests:

  • 200: Everything went okay, and the result has been returned (if any).
  • 301: The server is redirecting you to a different endpoint. This can happen when a company switches domain names, or an endpoint name is changed.
  • 400: The server thinks you made a bad request. This happens when you send incorrect data or make other client-side errors.
  • 401: The server thinks you're not authenticated. Many APIs require login credentials, so this happens when you don't send the right credentials to access an API.
  • 403: The resource you're trying to access is forbidden: you don't have the right permissions to see it.
  • 404: The resource you tried to access wasn't found on the server.
  • 503: The server is not ready to handle the request.

Notice that all the codes starting with 4 indicate some sort of client-side error, while 5xx codes point to server-side issues. The first digit of the status code indicates its category. Knowing this makes it easy to quickly identify if a request was successful (2xx) or if there was an error (4xx or 5xx). Read more about status codes here if you're interested in learning more.
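
The first-digit rule described above can be captured in a few lines. This is a minimal sketch: the function name status_category and its labels are illustrative choices rather than part of the requests library, and the 1xx and 3xx labels follow the standard HTTP categories, which the list above only partly covers.

```python
def status_category(code):
    """Return a coarse category for an HTTP status code, based on its first digit."""
    categories = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    # Integer division by 100 isolates the first digit of a three-digit code.
    return categories.get(code // 100, "unknown")


for code in (200, 301, 404, 503):
    print(code, status_category(code))
```

With a live request, you could pass response.status_code to a helper like this to decide whether it is worth reading the response body at all.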

Now that you understand the basics of API requests and status codes, you're ready to start making your own requests and handling the responses. In the next section, we'll look at some more practical examples of working with real-world APIs in Python.

API Documentation

Consulting Documentation for Successful API Requests

When working with APIs, especially in the context of AI and data science, consulting the documentation is crucial for making successful requests. API documentation from providers like OpenAI, Google Cloud AI, or IBM Watson outlines how to effectively use their services.

Key Elements of API Documentation

API documentation typically includes information on available endpoints, required parameters, authentication methods, and expected response formats. For example, the OpenAI API documentation provides detailed guidance on using their language models, such as GPT-4, for various natural language processing tasks.

Exploring the Open Notify API

In this Python API tutorial, we'll work with the Open Notify API, which provides access to data about the International Space Station. This API is great for learning because of its simple design and lack of authentication requirements.

Understanding API Endpoints

APIs often have multiple endpoints available on a server. We'll start with the http://api.open-notify.org/astros.json endpoint, which returns data about astronauts currently in space. The documentation states that this API takes no inputs, making it simple to get started with.

Making a GET Request to the API

Let's make a GET request to the endpoint using the requests library:

response = requests.get("http://api.open-notify.org/astros.json")
print(response.status_code)
200

The '200' code indicates that our request was successful. According to the documentation, the API response is in JSON format. Before diving into JSON, let's use the response.json() method to view the data we received:

print(response.json())
{'people': [{'craft': 'ISS', 'name': 'Oleg Kononenko'}, 
{'craft': 'ISS', 'name': 'Nikolai Chub'}, 
{'craft': 'ISS', 'name': 'Tracy Caldwell Dyson'}, 
{'craft': 'ISS', 'name': 'Matthew Dominick'}, 
{'craft': 'ISS', 'name': 'Michael Barratt'}, 
{'craft': 'ISS', 'name': 'Jeanette Epps'}, 
{'craft': 'ISS', 'name': 'Alexander Grebenkin'}, 
{'craft': 'ISS', 'name': 'Butch Wilmore'}, 
{'craft': 'ISS', 'name': 'Sunita Williams'}, 
{'craft': 'Tiangong', 'name': 'Li Guangsu'}, 
{'craft': 'Tiangong', 'name': 'Li Cong'}, 
{'craft': 'Tiangong', 'name': 'Ye Guangfu'}], 
'number': 12, 'message': 'success'}
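
Because response.json() returns ordinary Python objects, checking the status code and pulling out the data can be combined in one small helper. The sketch below is illustrative only: names_in_space is a hypothetical helper name, and StubResponse is a hypothetical stand-in that mimics the status_code attribute and json() method so the example runs without a network connection. With a live request, you would pass in the object returned by requests.get() instead.

```python
class StubResponse:
    """Hypothetical stand-in exposing the two response members used below."""

    def __init__(self, status_code, payload):
        self.status_code = status_code
        self._payload = payload

    def json(self):
        return self._payload


def names_in_space(response):
    """Return astronaut names, or an empty list if the request did not succeed."""
    if response.status_code != 200:
        return []
    data = response.json()
    # "people" is the key shown in the API output above.
    return [person["name"] for person in data.get("people", [])]


sample = StubResponse(200, {
    "people": [
        {"craft": "ISS", "name": "Sunita Williams"},
        {"craft": "Tiangong", "name": "Ye Guangfu"},
    ],
    "number": 2,
    "message": "success",
})
print(names_in_space(sample))
print(names_in_space(StubResponse(404, {})))
```

Returning an empty list on failure is one possible design choice; raising an exception may suit code that must not silently continue.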

The Importance of API Documentation

In summary, understanding API documentation is essential for effectively leveraging the powerful capabilities offered by AI and data science APIs in your projects. By carefully reading the documentation and following the provided guidelines, you can ensure that you are making the most of these APIs.

Working with JSON Data in Python

What is JSON?

JSON (JavaScript Object Notation) is the language of APIs. It encodes data structures for easy machine readability. APIs primarily pass data back and forth in JSON format.

JSON in AI and Data Science

Moreover, in the world of artificial intelligence (AI) and data science, JSON plays a crucial role in enabling different systems to exchange data. Many AI and data science APIs use JSON as the standard format for requests and responses. These include APIs from OpenAI, Google Cloud AI, and IBM Watson. This allows developers to easily integrate AI capabilities from various APIs into their applications.

To illustrate, you might have noticed that the JSON output we received from the API looked like it contained Python dictionaries, lists, strings and integers. You can think of JSON as being a combination of these objects represented as strings.

Working with JSON in Python

Python's json package provides great JSON support. The json package is part of the standard library, so we don't have to install anything to use it. We can convert lists and dictionaries to JSON strings, and convert JSON strings back to lists and dictionaries. In the case of our astronaut data, the response is a dictionary encoded as a string in JSON format.

The json library has two main functions:

  • json.dumps() — Takes in a Python object, and converts (dumps) it to a string.
  • json.loads() — Takes a JSON string, and converts (loads) it to a Python object.
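
A quick round trip shows the two functions side by side. This is a small sketch using one astronaut entry from the response above; the variable names are illustrative.

```python
import json

astronaut = {"craft": "ISS", "name": "Sunita Williams"}

# dumps: Python object -> JSON string
encoded = json.dumps(astronaut)
print(type(encoded).__name__)

# loads: JSON string -> Python object
decoded = json.loads(encoded)
print(decoded == astronaut)
```

The first print shows that the encoded value is a plain str, and the second confirms that decoding it gives back an equal dictionary.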

The dumps() function is particularly useful as we can use it to print a formatted string which makes it easier to understand the JSON output:

import json

# print a Python object as an indented, key-sorted JSON string
def jprint(obj):
    text = json.dumps(obj, sort_keys=True, indent=4)
    print(text)

jprint(response.json())

Examining the API Response

{
    "message": "success",
    "number": 12,
    "people": [
        {
            "craft": "ISS",
            "name": "Oleg Kononenko"
        },
        {
            "craft": "ISS",
            "name": "Nikolai Chub"
        },
        {
            "craft": "ISS",
            "name": "Tracy Caldwell Dyson"
        },
        {
            "craft": "ISS",
            "name": "Matthew Dominick"
        },
        {
            "craft": "ISS",
            "name": "Michael Barratt"
        },
        {
            "craft": "ISS",
            "name": "Jeanette Epps"
        },
        {
            "craft": "ISS",
            "name": "Alexander Grebenkin"
        },
        {
            "craft": "ISS",
            "name": "Butch Wilmore"
        },
        {
            "craft": "ISS",
            "name": "Sunita Williams"
        },
        {
            "craft": "Tiangong",
            "name": "Li Guangsu"
        },
        {
            "craft": "Tiangong",
            "name": "Li Cong"
        },
        {
            "craft": "Tiangong",
            "name": "Ye Guangfu"
        }
    ]
}

Immediately we can understand the structure of the data more easily. The formatted output shows that 12 people are currently in space, with each person represented as a dictionary (holding a name and a craft) inside a list.

If we compare this to the documentation for the endpoint we'll see that this matches the specified output for the endpoint.
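
Once the structure is clear, ordinary Python tools are enough to summarize it. The sketch below works on a copy of the response shown above, so it runs offline; it is a snapshot, and the live API will return different people over time. The variable names are illustrative.

```python
from collections import Counter

# Snapshot copied from the API response shown earlier in this tutorial.
data = {
    "message": "success",
    "number": 12,
    "people": [
        {"craft": "ISS", "name": "Oleg Kononenko"},
        {"craft": "ISS", "name": "Nikolai Chub"},
        {"craft": "ISS", "name": "Tracy Caldwell Dyson"},
        {"craft": "ISS", "name": "Matthew Dominick"},
        {"craft": "ISS", "name": "Michael Barratt"},
        {"craft": "ISS", "name": "Jeanette Epps"},
        {"craft": "ISS", "name": "Alexander Grebenkin"},
        {"craft": "ISS", "name": "Butch Wilmore"},
        {"craft": "ISS", "name": "Sunita Williams"},
        {"craft": "Tiangong", "name": "Li Guangsu"},
        {"craft": "Tiangong", "name": "Li Cong"},
        {"craft": "Tiangong", "name": "Ye Guangfu"},
    ],
}

people = data["people"]
names = [person["name"] for person in people]

# Count how many people are aboard each craft.
per_craft = Counter(person["craft"] for person in people)

print(len(names) == data["number"])  # True
print(dict(per_craft))               # {'ISS': 9, 'Tiangong': 3}
```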

JSON's simplicity and universality make it an essential tool for AI and data science professionals working with APIs. By providing a standard format for data exchange, JSON enables interoperability between diverse AI systems and allows developers to leverage the power of multiple AI APIs in their projects.

Using an API with Query Parameters

The http://api.open-notify.org/astros.json endpoint we used earlier does not take any parameters. We just send a GET request and the API sends back data about the number of people currently in space.

However, API endpoints commonly require us to specify parameters. This is especially true for APIs used in artificial intelligence (AI) and data science applications. Query parameters allow us to customize the behavior of AI models, access specific subsets of data, and more.

How Query Parameters Relate to LLMs

LLMs like ChatGPT are trained on extensive datasets to learn the patterns of human language and generate relevant responses. When you prompt ChatGPT with a question, such as 'Explain the theory of relativity,' the prompt steers the model toward the part of what it has learned that is relevant to your request. Loosely speaking, this is similar to using query parameters in an API to extract a subset of data from a larger dataset.

Just as you might use a filter_by parameter (as we will see in a moment) in an API call to get data about a specific country or topic, a prompt narrows ChatGPT's output to a context-specific response. The analogy is not exact: an API looks up and returns stored records, while an LLM generates text from patterns encoded in its trained parameters (weights), which are not the same thing as query parameters. In both cases, though, the request determines which information, out of a very large pool, comes back.

Understanding how query parameters work in API interactions offers a helpful, if simplified, way to think about how LLMs respond to prompts. As we explore these concepts, you'll gain a deeper appreciation for the complexity and power of both APIs and LLMs in handling and interpreting vast amounts of information.

The World Bank's Development Indicators API

For this example, we'll focus on the World Bank's Development Indicators, a comprehensive database containing detailed global development data for over 200 countries, some dating back more than 50 years. To enhance our interaction with this extensive resource, we have built a dedicated side server featuring our own APIs (https://api-server.dataquest.io/economic_data). This setup will provide streamlined access to the database, allowing us to efficiently utilize these valuable indicators in our coursework.

Query Parameters: Like Customizing a Burger Order

Imagine you're at a restaurant, and you order a burger. However, you don't just want any burger, you want it with extra cheese, no pickles, and a side of sweet potato fries instead of regular fries. In this scenario, the extra cheese, no pickles, and sweet potato fries are all specific instructions or parameters to modify your 'request' for a burger.

In the world of APIs, we often want to do something similar. Rather than retrieving all the data an API offers, we might want only a subset of that data. This is where optional query parameters come in. They allow us to specify or filter the data we want from an API, much like adding specific instructions to our burger order.

Filtering Data with Query Parameters

Optional query parameters enable selecting a subset of data from an API, instead of retrieving all available data. For instance, to filter data to only include countries in Sub-Saharan Africa, we use a query parameter in the URL, like https://api-server.dataquest.io/economic_data/countries?filter_by=region=Sub-Saharan%20Africa.

The World Development Indicators API enables refined searches by supporting these parameters. It has several endpoints, including /countries, /indicators, /country_series, /series_time, /footnotes, and /historical-data.

Using Query Parameters in Python

For instance, sending a GET request to the API without query parameters looks like this:

import requests 

response = requests.get("https://api-server.dataquest.io/economic_data/countries") 
data = response.json() 

The data variable now contains a list of all countries in the database. However, if we are specifically interested in countries within a certain region, such as Sub-Saharan Africa, we need to utilize query parameters to refine our request. Keep in mind that APIs accept different query parameters, which is why consulting the API's documentation is essential. In this case, our API supports the filter_by parameter, allowing for more targeted searches.

To illustrate, we can modify our request to include this parameter:

response = requests.get("https://api-server.dataquest.io/economic_data/countries?filter_by=region=Sub-Saharan%20Africa") 
data = response.json() 

In this URL, the filter_by=region=Sub-Saharan%20Africa segment is a query parameter. Here's what each part means:

  • ? is a delimiter that marks the beginning of the query string. It separates the path of the URL from the parameters that are being passed.
  • filter_by indicates the type of filtering we are applying.
  • region is a specific field in the API's database that we want to filter by. In this context, region refers to the geographical area of the countries.
  • The first = sign following filter_by is used to assign the filtering criteria (region in this case), and the second = sign assigns the specific value (Sub-Saharan Africa) to the region field.
  • %20 is URL encoding for a space character, necessary because URLs cannot contain actual space characters. However, when composing a GET request in an editor or a tool, you don't need to manually type %20 for spaces; it is typically handled automatically by the software.
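
To see what that automatic encoding looks like, the query string can be built and taken apart with Python's standard urllib.parse module. This is a minimal sketch: build_url and BASE_URL are illustrative names, and the safe="=" setting is an assumption made here so that the output matches the URL used in this tutorial.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

BASE_URL = "https://api-server.dataquest.io/economic_data/countries"


def build_url(base_url, params):
    """Append percent-encoded query parameters to base_url."""
    # quote_via=quote encodes spaces as %20 (the default, quote_plus, uses "+").
    # safe="=" leaves the "=" inside "region=..." unencoded, as in the tutorial URL.
    query = urlencode(params, safe="=", quote_via=quote)
    return f"{base_url}?{query}"


url = build_url(BASE_URL, {"filter_by": "region=Sub-Saharan Africa"})
print(url)

# Going the other way: split the query string back into names and values.
print(parse_qs(urlsplit(url).query))
```

The requests library can also do this encoding for you if you pass a dictionary through the params argument of requests.get(). The encoded form it produces may differ slightly (for example, + instead of %20 for a space), which servers normally decode to the same value, though that has not been verified against this particular API.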

Now, the data variable holds a list of countries specifically in the Sub-Saharan Africa region. This shows how to effectively use query parameters supported by an API to refine data requests according to our requirements.

Concluding this Python API Tutorial

This tutorial demonstrates the power of APIs to provide up-to-date data that can be integrated into data science and AI applications. By making API requests and parsing the returned JSON data, we accessed current information about the people in space and filtered World Bank development data by region.

This kind of real-time data access is crucial for many AI use cases. For example, we could use ISS pass time data retrieved from an API to train a machine learning model to predict future pass times for any given location. This would require building a data pipeline to regularly fetch the latest data from the API, preprocess it, and feed it into the model.

As you can see, understanding how to work with APIs and parse JSON responses is a foundational skill for data science and AI applications. The ability to integrate data from multiple sources via APIs expands the universe of what's possible with AI and machine learning.

Ready to get started? If you already have Python skills, check out our interactive APIs and web scraping course in Python. If not, and you're interested in data science applications, enroll in our free Python Fundamentals course. If you want to work with AI, check out our Generative AI Fundamentals in Python skill path.

Charlie Custer

About the author

Charlie is a student of data science, and also a content marketer at Dataquest. In his free time, he's learning to mountain bike and making videos about it.