MISSION 354

Regular Expression Basics

In the Regular Expression Basics mission, you will learn what regular expressions are and how to use them for advanced data cleaning as you prepare datasets for analysis in your data science projects and other Python work. You will learn the fundamentals of regular expressions, how they enable more powerful string manipulation, and concepts such as character classes, quantifiers, capture groups, and positional anchors.
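To make the concepts named above more concrete, here is a minimal sketch using Python's built-in re module. The sample titles are made up for illustration and are not taken from the mission's dataset.

```python
import re

# Hypothetical sample titles, invented for this illustration.
titles = [
    "Python 3.8 released",
    "Show HN: A regex tester",
    "Why I love python",
]

# Character class: [Pp] matches either an uppercase or a lowercase "p".
python_mentions = [t for t in titles if re.search(r"[Pp]ython", t)]

# Quantifier and capture group: \d+ means "one or more digits", and the
# parentheses capture the matched version number so it can be retrieved.
version_match = re.search(r"(\d+\.\d+)", titles[0])
version = version_match.group(1) if version_match else None

# Positional anchor: ^ requires the pattern to match at the start of the string.
show_hn = [t for t in titles if re.search(r"^Show HN", t)]

print(python_mentions)
print(version)
print(show_hn)
```

Each pattern is written as a raw string (the r prefix) so that backslashes reach the regex engine unchanged.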

In addition to learning regular expression concepts, you will learn how to use regular expressions in your Python and pandas code with the re module. The re module lets you perform operations such as searching for text patterns and replacing them with other text.
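As a rough illustration of the searching and replacing described above, the sketch below uses re.search, re.findall, and re.sub. The sample text and the deliberately simple email pattern are assumptions made for this example, not material from the mission.

```python
import re

# Hypothetical sample text, invented for this illustration.
text = "Email me at jane@example.com or john@example.org"

# A deliberately simple pattern for illustration; real email
# addresses can be more complex than this pattern allows.
pattern = r"\w+@\w+\.\w+"

# re.search returns a match object for the first match, or None if there is none.
first = re.search(pattern, text)
first_email = first.group() if first else None

# re.findall returns every non-overlapping match as a list of strings.
all_emails = re.findall(pattern, text)

# re.sub replaces every match with the given replacement text.
redacted = re.sub(pattern, "[email]", text)

print(first_email)
print(all_emails)
print(redacted)
```

Checking the result of re.search for None before calling group() avoids an AttributeError when the pattern does not match.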

In this mission, you will work with data from Hacker News, which gives you a thorough overview of regular expressions and how powerful they can be in your data cleaning tasks. Because you'll be working with real-world data, you will get the opportunity to think like a data analyst or data scientist as you explore a dataset.

By the end of this mission, you will have a working knowledge of regular expressions and how to use them to do some powerful string manipulation.

Objectives

  • Learn what regular expressions are and how to use them.
  • Learn regex components like character classes and quantifiers.
  • Learn how to use regular expressions with the re module and pandas.

Mission Outline

1. Introduction
2. The Regular Expression Module
3. Counting Matches with pandas Methods
4. Using Regular Expressions to Select Data
5. Quantifiers
6. Character Classes
7. Accessing the Matching Text with Capture Groups
8. Negative Character Classes
9. Word Boundaries
10. Matching at the Start and End of Strings
11. Challenge: Using Flags to Modify Regex Patterns
12. Next Steps
13. Takeaways


Course Info:

Intermediate

The median completion time for this course is 7 hours.

This course requires a basic subscription and includes four missions. It is the sixth course in the Data Analyst in Python and Data Scientist in Python paths.
