In the last lesson, we learned that the chi-squared test is a statistical hypothesis testing technique. We looked at the gender frequencies of people included in a dataset on US income and calculated a chi-squared value indicating how the observed frequencies in a single categorical column differed from those of the US population as a whole.
In this lesson, we'll look at how to apply the same technique to cross tables, which show how two categorical columns interact. In other words, we'll look at how to apply the chi-squared test across more than one category at a time.
You'll learn concepts such as expected value and statistical significance. To help you implement the chi-squared test across multiple columns, you will use R's built-in chi-squared test function. Because R is a programming language designed for statistics, it comes with the `chisq.test()` function, so you can easily compute the chi-squared statistic when you're doing multi-category chi-squared tests.
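As a preview, here is a minimal sketch of what calling `chisq.test()` on a cross table can look like. The counts below are invented purely for illustration and do not come from the US income dataset; the row and column labels are likewise hypothetical. The `correct = FALSE` argument turns off the continuity correction that R applies to 2x2 tables by default, so the result matches the plain chi-squared formula.

```r
# Hypothetical 2x2 cross table of counts (invented for illustration only).
observed <- matrix(
  c(20, 30,
    30, 20),
  nrow = 2,
  byrow = TRUE,
  dimnames = list(
    gender = c("Male", "Female"),
    income = c("High", "Low")
  )
)

# Run the chi-squared test of independence on the cross table.
# correct = FALSE disables Yates' continuity correction for 2x2 tables.
result <- chisq.test(observed, correct = FALSE)

result$expected   # expected counts under independence
result$statistic  # chi-squared statistic
result$parameter  # degrees of freedom
result$p.value    # p-value
```

Each of these pieces (expected values, the statistic, degrees of freedom, and the p-value) corresponds to a step in this lesson, so the function's output can be checked against values calculated by hand.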
As you work through each concept, you’ll get to apply what you’ve learned from within your browser, so there's no need to use your own machine to do the exercises. The R environment inside this course includes answer checking, so you can make sure you've fully mastered each concept before moving on to the next one.
1. Multiple categories
2. Calculating expected values
3. Calculating the chi-squared statistic
4. Calculating degrees of freedom
5. Calculating p-value
6. R's built-in chi-squared function
8. Next steps