Fundamentals of String Manipulation

In this course and the last, you've learned to write R code to manipulate data in a variety of ways: using arithmetic and comparison operators to perform calculations, using control structures and functions to execute operations based on conditions, and using loops and functionals to repeat operations on the elements of a dataframe or list.

You've mainly learned to use these techniques to work with numeric data. However, data analysts and scientists also frequently need to work with sequences of characters, otherwise known as string data. For example, you'll sometimes need to extract part of a string, such as a name, date, or time that's buried in a longer character string.

In this lesson, you'll learn to work with the string components of FiveThirtyEight's data on the 2014 FIFA World Cup. You'll practice string manipulations and review data analysis techniques, including dataframe manipulations and using built-in functions to solve split-apply-combine problems.

As you learn to manipulate strings, you'll work with a new tidyverse package: stringr. The stringr package contains tools for combining and splitting strings, adding and removing spaces, and performing other useful manipulations on string data.
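To give a feel for these tools, here is a minimal sketch of four stringr functions that correspond to the manipulations named above. The values of `match_date`, `home`, and `away` are made up for illustration and are not read from the FiveThirtyEight dataset; the lesson itself may use different variables and steps.

```r
library(stringr)

# Illustrative values only (assumed for this sketch, not loaded from the dataset)
match_date <- "2014-06-12"
home <- "Brazil"
away <- "Croatia"

# Subset by position: characters 6 through 7 hold the month
month <- str_sub(match_date, 6, 7)

# Split on the hyphen; simplify = TRUE returns a character matrix
# with one row per input string and one column per piece
parts <- str_split(match_date, "-", simplify = TRUE)

# Pad a single-digit day to width 2 with a leading zero
day <- str_pad("5", width = 2, side = "left", pad = "0")

# Combine several strings into one summary line
summary_line <- str_c(home, " vs. ", away, " on ", match_date)

print(month)         # "06"
print(parts[1, 1])   # "2014"
print(day)           # "05"
print(summary_line)  # "Brazil vs. Croatia on 2014-06-12"
```

All four functions are vectorized, so the same calls work on a whole dataframe column as well as on a single string.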


  • Learn to subset and pad strings.
  • Learn to manipulate strings to create new variables for data analysis.
  • Learn to combine strings to create data summaries.

Lesson Outline

1. Working With Strings
2. Subsetting Strings by Position
3. Splitting Strings
4. Calculating Average Goals Per Month
5. Combining Strings
6. String Manipulations for Reformatting Match Dates
7. Padding Strings
8. Creating New Variables
9. Combining Strings to Create Match Summaries
10. Next Steps
11. Takeaways
