MISSION 54

Web Scraping

In this lesson, we’ll look at one of the best ways to get unique data sets: web scraping. The previous lessons focused on working with APIs.

In the Working with APIs lesson, you discovered the advantages of using APIs to get data. A lot of data, however, isn't available through data sets or APIs; it exists only on a web page or a collection of pages. Rather than waiting for the provider to create an API or copying the data by hand, you can collect it with a technique called web scraping.

In this lesson, we'll discover how to use web scraping to extract the data we want from a web page using Python and the beautifulsoup library. We'll also look at how a web page is structured and use some basic HTML and CSS skills to help us scrape it.

We'll use the requests library heavily as we learn about web scraping. This library enables us to download web pages. The beautifulsoup library will also be very important: it makes it easier to extract the relevant parts of each page we download, so we get just the data we want without the superfluous code and other elements that might be present on the page.
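The division of labor between the two libraries can be sketched as follows. This is a minimal illustration, not the lesson's own code: the helper names `fetch_html` and `extract_title_and_paragraph` and the inline `SAMPLE_HTML` page are assumptions made up for this example, and the built-in `html.parser` is used so no extra parser needs to be installed.

```python
import requests
from bs4 import BeautifulSoup

# Hypothetical inline page, used so the sketch runs without network access.
SAMPLE_HTML = """
<html>
  <head><title>A simple example page</title></head>
  <body>
    <p>Here is some simple content for this page.</p>
  </body>
</html>
"""


def fetch_html(url):
    """Download a page with requests and return its HTML as text."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # raise an exception for 4xx/5xx responses
    return response.text


def extract_title_and_paragraph(html):
    """Parse HTML with Beautiful Soup and return the title and first paragraph."""
    soup = BeautifulSoup(html, "html.parser")
    title = soup.title.get_text(strip=True)
    first_paragraph = soup.find("p").get_text(strip=True)
    return title, first_paragraph


# Parse the inline sample; with a real page you would pass fetch_html(url) instead.
title, paragraph = extract_title_and_paragraph(SAMPLE_HTML)
print(title)
print(paragraph)
```

For a live page, the same parsing function would take the output of `fetch_html` with whatever URL you want to scrape; requests handles the download, and Beautiful Soup handles picking the data out of the markup.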

You’ll be able to scrape web pages and use both libraries right from our browser-based platform, which means there’s no download or setup time required. Dive in, and you’ll be learning web scraping in Python in less than a minute!

Objectives

  • Learn how web pages are structured using HTML.
  • Learn the basics of web scraping.

Mission Outline

1. Introduction
2. Web Page Structure
3. Retrieving Elements from a Page
4. Using Find All
5. Element IDs
6. Element Classes
7. CSS Selectors
8. Using CSS Selectors
9. Nesting CSS Selectors
10. Using Nested CSS Selectors
11. Beyond the Basics
12. Takeaways
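
As a rough preview of the techniques named in the outline, the sketch below finds elements by tag, by ID, by class, and with plain and nested CSS selectors. The HTML snippet, IDs, and class names are hypothetical and chosen only for illustration; the lesson itself may use a different page.

```python
from bs4 import BeautifulSoup

# Hypothetical page with IDs and classes to select on.
HTML = """
<html>
  <body>
    <div id="intro">
      <p class="inner-text">First inner paragraph.</p>
      <p class="outer-text">First outer paragraph.</p>
    </div>
    <div id="details">
      <p class="inner-text">Second inner paragraph.</p>
    </div>
  </body>
</html>
"""

soup = BeautifulSoup(HTML, "html.parser")

# find_all returns every matching tag, here all <p> elements.
all_paragraphs = [p.get_text() for p in soup.find_all("p")]

# Look up a single element by its ID.
intro_div = soup.find(id="intro")

# Filter by class (class_ avoids clashing with the Python keyword).
inner = [p.get_text() for p in soup.find_all("p", class_="inner-text")]

# The same kind of lookup written as a CSS selector.
outer_via_css = [p.get_text() for p in soup.select("p.outer-text")]

# A nested selector: inner-text paragraphs inside the div with id "details".
nested = [p.get_text() for p in soup.select("div#details p.inner-text")]

print(all_paragraphs)
print(inner)
print(outer_via_css)
print(nested)
```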


Course Info:

Intermediate

The median completion time for this course is 6.2 hours.

This course requires a basic subscription and includes four missions. It is the thirteenth course in the Data Analyst in Python path and the Data Scientist path.
