Learn R the Right Way in 5 Steps
R is an increasingly popular programming language, particularly in the world of data analysis and data science. You may have even heard people say that it's easy to learn R! But easy is relative. Learning R can be a frustrating challenge if you’re not sure how to approach it.
If you’ve struggled to learn R or another programming language in the past, you’re definitely not alone. And it’s not a failure on your part, or some inherent problem with the language.
Usually, it’s the result of a mismatch between what’s motivating you to learn and how you’re actually learning.
This mismatch causes big problems when you’re learning any programming language, because it takes you straight to a place we like to call the cliff of boring.
What is the cliff of boring? It’s the mountain of boring coding syntax and dry practice problems you’re generally asked to work through before you can get to the good stuff — the stuff you actually want to do.
The cliff of boring is a metaphor, but sometimes it really can feel like you're staring up at a sheer rock face.
Nobody signs up to learn a programming language because they love syntax. Yet many learning resources, from textbooks to online courses, are written with the idea that students need to master all of the key areas of R syntax before they can do any real work with it.
This is the process that causes new learners to drop off in droves:
- You get excited about learning a programming language because you want to do something with it.
- You try to start learning and are immediately led to this huge wall of complicated, boring stuff.
- You struggle through some of the boring stuff with no idea how it relates to the thing you actually want to do.
Is it any wonder that many people quit when this is the default learning experience?
Don't misunderstand me — there’s no way around learning syntax, in R or any other programming language.
But there is a way to avoid the cliff of boring.
It’s a shame that so many students drop off at the cliff, because R is absolutely worth learning! In fact, R has some big advantages over other languages for anyone who’s interested in learning data science:
- The R tidyverse ecosystem makes all sorts of everyday data science tasks very straightforward.
- Data visualization in R can be both simple and very powerful.
- R was built to perform statistical computing.
- The online R community is one of the friendliest and most inclusive of all programming communities.
- The RStudio integrated development environment (IDE) is a powerful tool for programming with R because all of your code, results, and visualizations are together in one place. With RStudio Cloud, you can program in R with RStudio directly in your web browser.
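To make the statistical-computing point above concrete, here is a minimal sketch that uses only base R and the built-in `mtcars` dataset. The choice of dataset and variables is ours for illustration; nothing in the list above prescribes it. It fits a linear regression of fuel economy on vehicle weight in a single call.

```r
# Base R only: no packages to install.
# mtcars ships with R; mpg = miles per gallon, wt = weight (1000 lbs).
data(mtcars)

# Summary statistics for fuel economy.
print(summary(mtcars$mpg))

# Fit a simple linear regression: mpg explained by weight.
fit <- lm(mpg ~ wt, data = mtcars)

# Inspect the estimated intercept and slope.
print(coef(fit))

# Quick scatter plot with the fitted line overlaid.
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
abline(fit)
```

Modeling and plotting are part of the base installation here, which is one reason R is often a comfortable starting point for statistics-heavy work.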
And of course, learning R can be great for your career. Data science is a fast-growing field with high average salaries.
And tons of companies and organizations use R for data science work! Here's a very short sample of some of the companies using R (from Hired.com as of April 2021):
- SpaceX
- Starbucks
- Fitbit
- Kraft Heinz
- Hulu
- Amazon
- iRobot
- Ubisoft
- Allstate
- Twitch
- AT&T
- Salesforce
- Pfizer
- General Motors
- Northrop Grumman
- Ralph Lauren
- Goldman Sachs
This list is just the tip of the iceberg: thousands and thousands of companies all across the globe hire people with R skills, and R is also in high demand in academia and government. Even from this short list, it's clear that someone with R skills could work in almost any industry they wanted.
Big tech, finance, video games, big pharma, insurance, fashion — every industry needs people who can work with data, and that means that every industry has use for R programming skills.
So how can you get them?
Step 1. Find Your Motivation for Learning R
Before you crack a textbook, sign up for a learning platform, or click play on your first tutorial video, spend some time really thinking about why you want to learn R and what you’d like to do with it.
- What data are you interested in working with?
- What projects would you enjoy building?
- What questions do you want to answer?
Find something that motivates you in the process. This will help you define your end goal, and it will help you get to that end goal without boredom.
Try to go deeper than “becoming a data scientist.” There are all kinds of data scientists who work on a huge variety of problems and projects. Are you interested in analyzing language? Predicting the stock market? Digging deep into sports statistics? What’s the thing you want to do with your new skills that’s going to keep you motivated as you work to learn R?
Pick one or two things that interest you and that you’re willing to stick with. Gear your learning towards them and build projects with your interests in mind.
Figuring out what motivates you will help you settle on an end goal and a path that gets you there without boredom. You don’t need an exact project yet, just a general area that interests you as you prepare to learn R.
Pick an area you’re interested in, such as:
- Data Science / Data Analysis
- Data visualization
- Predictive modeling / machine learning
- Statistics
- Reproducible reports
- Dashboard reports
Step 2. Learn the Basic Syntax
Unfortunately, there’s no way to completely avoid this step. Syntax in a programming language is even more important than syntax in human language. If someone says “I’m the store going to,” their English-language syntax is wrong, but you can probably still understand what they mean. Unfortunately, computers are far less forgiving when they interpret your code.
However, learning syntax is boring, so your goal should be to spend as little time as possible on syntax alone. Instead, learn as much of the syntax as you can while working on real-world problems that interest you, so that there’s something to keep you motivated even though the syntax itself isn’t all that exciting.
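To see how unforgiving the interpreter is, here is a minimal sketch in base R. The variable names `ok` and `result` are hypothetical, chosen only for this illustration. It asks R to parse one complete line and one line that is missing a closing parenthesis.

```r
# A complete expression parses without complaint.
ok <- parse(text = "x <- c(1, 2, 3)")

# Drop the closing parenthesis and R refuses to parse the line at all.
# tryCatch() captures the error so the script keeps running.
result <- tryCatch(
  parse(text = "x <- c(1, 2, 3"),
  error = function(e) "syntax error"
)
print(result)
```

A human reader would guess what the second line means, but R stops at the missing parenthesis, which is why a small amount of syntax practice is unavoidable.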
Here are some resources for learning the basics of R:
- Codecademy — does a good job of teaching basic syntax.
- Dataquest: Introduction to R Programming — We built Dataquest to help data science students avoid the cliff of boring by integrating real-world data and real data science problems right off the bat. We think learning the syntax in the context of working on real problems makes it more interesting, and our interactive platform challenges you to really apply what you’re learning, checking your work as you go.
- R for Data Science — One of the most useful resources for learning R and tidyverse tools. Available in print from O’Reilly or for free online.
- RStudio Education - RStudio is the most popular integrated development environment (IDE) for programming with R. Their education page for beginners contains useful resources including tutorials, books, and webinars.
- RStudio Cloud Primers - Start coding in R without installing any software with cloud-based tutorials from RStudio.
The quicker you can get to working on projects, the faster you will learn R. You can always refer to a variety of resources for learning and double-checking syntax if you get stuck later. But your goal should be to spend a couple of weeks on this phase, at most.
The RStudio Cheatsheets are great reference guides for R syntax.
Step 3. Work on Structured Projects
Once you’ve got enough syntax under your belt, you’re ready to move on to structured projects more independently. Projects are a great way to learn, because they let you apply what you’ve already learned while generally also challenging you to learn new things and solve problems as you go. Plus, building projects will help you put together a portfolio you can show to future employers later down the line.
You probably don’t want to dive into totally unique projects just yet. You’ll get stuck a lot, and the process could be frustrating. Instead look for structured projects until you can build up a bit more experience and raise your comfort level.
If you choose to learn R with Dataquest, this is built right into our curriculum — nearly every one of our data science courses ends with a guided project that challenges you to synthesize and apply what you’re learning. These projects provide some structure, so you’re not totally on your own, but they’re more open-ended than regular course content to allow you to experiment, synthesize your skills in new ways, and make mistakes.
If you’re not studying with Dataquest, there are plenty of other structured projects out there for you to work on. Let’s look at some good resources for projects in each area:
Data science / Data analysis
- Dataquest — Teaches you R and data science interactively. You analyze a series of interesting datasets ranging from CIA documents to WNBA player stats.
- R for Data Science - by Hadley Wickham and Garrett Grolemund is an excellent R resource with motivating and challenging exercises.
- TidyTuesday - A semi-structured, weekly social data project in R where budding R practitioners clean, wrangle, tidy, and plot a new dataset every Tuesday. Results are shared on Twitter using the hashtag #tidytuesday.
Data visualization
- ggplot2 - One of the most popular tools for data visualization in R is the ggplot2 package. The Data visualisation chapter from R for Data Science is a great place to learn the basics of data visualization with ggplot2. The chapter on Graphics for communication is a great resource for making graphics look more professional.
- rayshader - build two-dimensional and three-dimensional maps in R with the rayshader package. You can also transform graphics developed with ggplot2 into 3D with rayshader.
Predictive modeling / machine learning
- Get Started with Tidymodels - a series of articles that teach tidymodels, a collection of packages for modeling and machine learning using tidyverse principles.
- Create Robust Models with Tidymodels - build and train predictive models with this series of projects.
- Tune, Compare and Work With Models - use methods such as grid search, nested resampling, and Bayesian methods using tidymodels tools.
- Develop Custom Modeling Tools - enhance and customize your models with these tutorials.
Statistics
- Perform Statistical Analysis with Tidymodels - a series of more advanced articles using tidymodels for statistical analysis.
Reproducible reports
- Getting Started with R Markdown — Guide - build your own R Markdown reference guide with this free tutorial from Dataquest. Improve your R Markdown skills by documenting any project described here with R Markdown.
- R Markdown Cookbook - a comprehensive, free online book that contains almost everything you need to know about R Markdown.
- R Markdown: The Definitive Guide - another great, free resource for learning R Markdown.
Dashboard reports
- Shiny Dashboard Tutorials - make dashboards in R with shiny dashboards using these tutorials from RStudio.
- Shiny Gallery - check out this gallery from RStudio for some Shiny Dashboard inspiration and examples.
Step 4. Build Projects on Your Own
Once you’ve finished some structured projects, you’re probably ready to move on to the next stage of learning R: doing your own unique data science projects. It’s hard to know how much you’ve really learned until you step out and try to do something by yourself. Working on unique projects that interest you will give you a great idea not only of how far you’ve come but also of what you might want to learn next.
And although you’ll be building your own project, you won’t be working alone. You’ll still be referring to resources for help and learning new techniques and approaches as you work. With R in particular, you may find that there’s a package dedicated to helping with the exact sort of project you’re working on, so taking on a new project sometimes also means you’re learning a new R package.
What do you do if you get stuck? Do what the pros do, and ask for help! Here are some great resources for finding help with your R projects:
- StackOverflow - Whatever your question is, it has probably been asked here before, and if it hasn’t, you can ask it yourself. You can also browse the questions tagged with R.
- Google - Believe it or not, this is probably the most commonly used tool of every experienced programmer. When you encounter an error that you don’t understand, a quick Google search of the error message will often point you towards the answer.
- Twitter - It may be surprising to learn, but Twitter is an excellent resource for getting help on R-related issues. Twitter is also a great resource for R-related news and updates from the world's leading R practitioners. The R community on Twitter is centralized around the #rstats hashtag.
- Dataquest’s Learning Community — With a free student account you can join our learning community and ask technical questions that your fellow students or Dataquest’s data scientists can answer.
What sorts of projects should you build? As with the structured projects, these projects should be guided by the answers you came up with in step 1. Work on projects and problems that interest you. If you’re interested in climate change, for example, find some climate data to work with and start digging around for insights.
It’s best to start small rather than trying to take on a gigantic project that will never get finished. If what interests you most is a huge project, try to break it down into smaller pieces and tackle them one at a time.
Here are some ideas for projects that you can consider:
- Expand on one of the structured projects you built before to add new features or deeper analysis.
- Go to meetups or connect with other R coders online and join a project that’s already underway.
- Find an open-source package to contribute to (R has tons of great open source packages!)
- Find an interesting project someone else made with R on GitHub and try to extend or expand on it. Or, find a project someone else made in another language and try to recreate it using R.
- Read the news and look for interesting stories that might have available data you could dig into for a project.
- Check out our list of free data sets for data science projects and see what available data inspires you to start building!
Here are some more project ideas in the topic areas that we've discussed:
Data science / Data analysis
- A script to automate data entry.
- A tool to scrape data from the web.
Data visualization
- A map that visualizes election polling by state, or region.
- A collection of plots that depict the real-estate sale or rental trends in your area.
Predictive modeling / machine learning
- An algorithm that predicts the weather where you live.
- A tool that predicts the stock market.
- An algorithm that automatically summarizes news articles.
Statistics
- A model that predicts the cost of Uber trips in your area.
Reproducible reports
- An R Markdown report of Covid-19 trends in your area that can be updated when new data becomes available.
- A summary report of performance data for your favorite sports team.
Dashboard reports
- A map of the live locations of buses in your area.
- A stock market summary.
- A Covid-19 tracker.
- A summary of your personal spending habits.
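To show how small a first independent project can be, here is a minimal sketch of the last idea above, a spending summary, in base R. The data frame, its column names, and the amounts are all made up for illustration; in a real project you would load your own exported transactions instead.

```r
# Hypothetical example data: replace with your own exported transactions.
spending <- data.frame(
  category = c("Groceries", "Rent", "Groceries", "Transit", "Transit"),
  amount   = c(54.20, 900.00, 31.80, 12.50, 12.50)
)

# Total spent per category.
totals <- aggregate(amount ~ category, data = spending, FUN = sum)

# Sort so the largest category comes first.
totals <- totals[order(-totals$amount), ]
print(totals)
```

From here, the same table could feed a plot, an R Markdown report, or a Shiny dashboard, depending on which of the areas from Step 1 interests you most.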