Course overview
In this course, you’ll learn several techniques for sampling data, such as random sampling and cluster sampling. You’ll also learn about discrete variables and random variables in the context of frequency distributions, and the different types of charts and graphs you might use to visualize frequency distributions.
As you learn about these concepts and how to use them for more robust data analysis, you’ll be working with a dataset about basketball players in the WNBA (Women’s National Basketball Association) that contains general information about players, along with their metrics for the 2016-2017 season.
Best of all, you’ll learn by doing — you’ll practice and get feedback directly in the browser. At the end of the course, you’ll complete a portfolio project that asks you to investigate Fandango Movie Ratings to determine if Fandango is inflating movie ratings on its site. This is an opportunity to learn to identify and overcome common setbacks in practical data analysis.
Key skills
- Sampling data using simple random sampling, stratified sampling, and cluster sampling
- Employing and measuring variables in statistics
- Building, visualizing, and comparing frequency distribution tables
Course outline
Introduction to Statistics in R [7 lessons]
Stratified Sampling and Cluster Sampling 2h
Lesson Objectives
- Employ sampling methods
- Perform stratified sampling and cluster sampling
Variables in Statistics 1h
Lesson Objectives
- Define variables in statistics
- Identify different kinds of variables
- Measure variables
Frequency Distributions 1h
Lesson Objectives
- Define frequency distributions
- Generate frequency distribution tables
- Generate grouped frequency distribution tables
- Define proportions, percentages, and percentiles
Visualizing Frequency Distributions 1h
Lesson Objectives
- Identify the importance of visualizing distributions
- Generate bar charts and histograms
- Employ bar plots and histograms
Comparing Frequency Distributions 1h
Lesson Objectives
- Compare frequency distributions
- Define grouped bar plots
- Define overlaid histograms
- Define kernel density estimate plots
- Define scatter plots and box plots
Guided Project: Investigating Fandango Movie Ratings 1h
Lesson Objectives
- Expand your portfolio with a guided statistics project
- Overcome common setbacks in data analysis
Projects in this course
Investigating Fandango Movie Ratings
For this project, you’ll be a data journalist analyzing Fandango’s movie ratings to determine whether they changed after a 2015 analysis found evidence of bias. You’ll use R and statistics skills to compare movie ratings data from 2015 and 2016.
The Dataquest guarantee
Dataquest has helped thousands of people start new careers in data. If you put in the work and follow our path, you’ll master data skills and grow your career.
We believe so strongly in our paths that we offer a full satisfaction guarantee. If you complete a career path on Dataquest and aren’t satisfied with your outcome, we’ll give you a refund.
Master skills faster with Dataquest
Go from zero to job-ready
Learn exactly what you need to achieve your goal. Don’t waste time on unrelated lessons.
Build your project portfolio
Build confidence with our in-depth projects, and show off your data skills.
Challenge yourself with exercises
Work with real data from day one with interactive lessons and hands-on exercises.
Showcase your path certification
Share the evidence of your hard work with your network and potential employers.