In this first lesson of our course on strings, dates and times in R, you'll learn about some new functions as you start working with text data (strings).

We'll go through a detailed review of what strings are and what we can do with them. Strings are how R represents text, whether we are talking about tweets, electronic medical records or Amazon reviews. Compared to numerical data, text data can be harder to clean and manipulate, since the same idea can be written with many different words, spellings and formats. We also typically can't perform numerical calculations directly on text; we need to perform some cleaning steps beforehand.

Typical work with strings involves finding specific words or phrases in a larger paragraph, or counting the number of times they appear in a passage. In this lesson, we'll cover techniques to accomplish that, including indexing strings, string trimming and padding, splitting strings, string concatenation, regular expressions in R, string replacement, and more!
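To give a rough feel for these techniques before we cover them properly, here is a minimal sketch using only base R functions. The example string and the variable names are made up for illustration, and the lesson itself may use different functions or packages for the same tasks.

```r
# Hypothetical example string (not taken from the lesson's data)
review <- "  Great product, great price!  "

# Trimming: remove leading and trailing whitespace
trimmed <- trimws(review)

# Indexing: extract characters 1 through 5
first_word <- substr(trimmed, 1, 5)

# Casing: convert everything to lowercase
lowered <- tolower(trimmed)

# Splitting: break the string into pieces on each space
words <- strsplit(lowered, " ")[[1]]

# Concatenation: join the pieces back together with underscores
joined <- paste(words, collapse = "_")

# Detection: does the text contain the pattern "price"?
has_price <- grepl("price", lowered)

# Counting: how many of the pieces are exactly "great"?
n_great <- sum(words == "great")

# Replacement: swap every "great" for "good"
replaced <- gsub("great", "good", lowered)

# Padding: left-pad the number 7 with zeros to width 3
padded <- sprintf("%03d", 7L)

print(first_word)
print(n_great)
print(replaced)
```

Each of these calls corresponds to one of the topics in this lesson, so don't worry if the details are unclear for now.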

By the end of this lesson, we'll know how to use many of the functions that help us with string manipulation, which will expand what we can do as programmers.

Objectives

  • Learn about strings and how they're indexed in R.
  • Manipulate strings using a variety of R programming techniques.
  • Clean real-world text data using R.

Lesson Outline

  1. Introduction
  2. Indexing Strings
  3. Handling Word Casing
  4. String Trimming & Padding
  5. String Splitting
  6. String Concatenation
  7. Regular Expressions
  8. String Detection
  9. String Replacement
  10. Next Steps