In Statistics Fundamentals, we completed the workflow illustrated below. We learned to use frequency distribution tables to organize data into a comprehensible form and find patterns. Frequency tables, however, are not the only way to make data comprehensible.
In this lesson, we’ll learn how to summarize the distribution of a variable with a single value: the mean. Although we already learned about the mean in previous courses of the data scientist path, we discuss the concept again here to explain it in much more depth.
You’ll learn how to define the mean algebraically, how the population mean differs from the sample mean, and why the mean is the balance point of a distribution.
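As a preview of the balance-point idea, here is a minimal sketch in Python. The four values are made up for illustration and are not taken from the course dataset. The sketch computes the mean and checks that the total distance of the values below the mean equals the total distance of the values above it.

```python
from statistics import mean

# Hypothetical values, chosen only to illustrate the idea
values = [2, 4, 9, 13]

m = mean(values)

# Signed distance of each value from the mean
deviations = [x - m for x in values]

# Total distance below the mean and total distance above it
below = sum(-d for d in deviations if d < 0)
above = sum(d for d in deviations if d > 0)

print("mean:", m)
print("total distance below the mean:", below)
print("total distance above the mean:", above)
print("sum of deviations:", sum(deviations))
```

The two totals match, so the signed deviations sum to zero. That is the sense in which the mean "balances" the distribution.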
While exploring how the mean can be used to summarize a distribution, we’ll be working with a dataset that describes characteristics of houses sold between 2006 and 2010 in Ames, Iowa.
As you work through each concept, you’ll apply what you’ve learned directly in your browser; there’s no need to use your own machine to do the exercises. The Python environment in this course includes answer-checking to ensure you’ve fully mastered each concept before moving on to the next.
- Learn why the mean is the balance point of a distribution.
- Learn how to distinguish between the sample and the population mean.
- Learn why the sample mean is an unbiased estimator.
- The Mean
- The Mean as a Balance Point
- Defining the Mean Algebraically
- An Alternative Definition
- Introducing the Data
- Mean House Prices
- Estimating the Population Mean
- Estimates from Low-Sized Samples
- Variability Around the Population Mean
- The Sample Mean as an Unbiased Estimator
- Next Steps