Although many datasets are available in convenient formats, a lot of data is harder to access, like a table on a web page. To get this data, we’ll need to use web scraping. In R, we can do that with the rvest package.
In this course, you’ll learn about web page structure, including the basics of HTML and CSS. You’ll also learn how to get a page’s HTML code into your R workflow for further parsing and cleaning. Then, you’ll dig deeper into scraping, learning to use CSS selectors to get precisely the data you want.
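To give a feel for that workflow, here is a minimal sketch rather than actual course material. It assumes rvest 1.0.0 or later is installed, and the HTML snippet, the `h1.title` and `#ratings` selectors, and the movie names are all made up for illustration. The page is inlined as a string so the example runs without a network connection; for a live page, you would pass a URL to `read_html()` instead.

```r
library(rvest)

# Hypothetical page content, inlined so the sketch runs offline.
page_source <- '
<html>
  <body>
    <h1 class="title">Movie ratings</h1>
    <table id="ratings">
      <tr><th>Movie</th><th>Rating</th></tr>
      <tr><td>Film A</td><td>8.1</td></tr>
      <tr><td>Film B</td><td>6.4</td></tr>
    </table>
  </body>
</html>'

# Step 1: parse the HTML into a document that R can query.
page <- read_html(page_source)

# Step 2: use a CSS selector to pick one element and extract its text.
# "h1.title" matches an <h1> element with class "title".
heading <- page %>%
  html_element("h1.title") %>%
  html_text2()

# Step 3: select the table by its id and convert it to a data frame.
# "#ratings" matches the element whose id is "ratings".
ratings <- page %>%
  html_element("#ratings") %>%
  html_table()

print(heading)
print(ratings)
```

The selectors do the targeting: changing `"#ratings"` to another id or class would pull a different part of the page, while the rest of the code stays the same.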
Best of all, you’ll learn by doing — you’ll practice and get feedback directly in the browser. At the end of the course, you’ll complete a guided project that asks you to use web scraping to analyze movie ratings.
- Scraping data from a web page using R and rvest
- Parsing sites using CSS selectors
- Advanced web scraping techniques
Introduction to Web Scraping in R [4 lessons]
The Dataquest guarantee
Dataquest has helped thousands of people start new careers in data. If you put in the work and follow our path, you’ll master data skills and grow your career.
We believe so strongly in our paths that we offer a full satisfaction guarantee. If you complete a career path on Dataquest and aren’t satisfied with your outcome, we’ll give you a refund.