Guiding students through using mean, median, mode, and standard deviation
An instructor's guide to teaching introductory descriptive statistics
Robyn Gotz (Montana State University, Bozeman)
Sonia Nagorski (University of Alaska Southeast, Juneau)
What should students get out of this module?
After completing this module, a student should be able to:
- explain when it would be appropriate to use mean, median, mode, and standard deviation to describe a data set
- describe the steps needed to calculate the mean, median, mode, and standard deviation
- explain the assumptions and limitations of mean, median, mode, and standard deviation
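For instructors who want to show the calculation steps explicitly, the following is a minimal Python sketch that works through each statistic by hand and then cross-checks the result against Python's standard `statistics` module. The data values and variable names are hypothetical, chosen only for illustration, and the population form of the standard deviation is assumed.

```python
import math
import statistics
from collections import Counter

# Hypothetical sample values, made up for illustration only.
values = [2, 3, 3, 5, 7, 10]
n = len(values)

# Mean: sum of the values divided by how many there are.
mean = sum(values) / n

# Median: middle value of the sorted data (average the two middle
# values when the count is even).
ordered = sorted(values)
mid = n // 2
median = ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2

# Mode: the most frequently occurring value.
mode = Counter(values).most_common(1)[0][0]

# Population standard deviation: square root of the average squared
# deviation from the mean.
pop_sd = math.sqrt(sum((x - mean) ** 2 for x in values) / n)

# Cross-check the hand calculations against the standard library.
assert median == statistics.median(values)
assert mode == statistics.mode(values)
assert math.isclose(pop_sd, statistics.pstdev(values))

print(mean, median, mode, round(pop_sd, 3))  # 5.0 4.0 3 2.769
```

Writing out the steps before calling a library function may help students who are used to plugging numbers into a calculator see where each result comes from.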
Why are these math skills challenging to incorporate into courses?
Students often struggle with even the most fundamental statistical descriptions of a data set. Some students have not taken a statistics class before enrolling in a geoscience course, while others have worked through multiple statistics courses, so it can be challenging to assign statistical problems to an unevenly prepared classroom. Although most students have heard the words "average" or "mean," many have never had to calculate a mean themselves or to consider the skewness of the data and whether the mean, median, or mode is the appropriate measure. Another problem is that many students plug values into websites without understanding how the calculations are made.
Many geoscience students, particularly non-majors, have anxiety about math and may be resistant to attempting quantitative problems. However, incorporating these concepts and showing real-world applications to geoscience situations can improve students' skills and understanding in both geoscience and mathematics, and it may deepen their trust in the scientific process. These skills are also widely expected by employers and graduate programs.
What don't we include on this page?
- In-depth discussion of different types of distributions
- Detailed explanation of how to select the appropriate statistic for the data set at hand. Selecting the appropriate statistical representation of a data set is a nuanced and complicated choice, and well beyond the scope of this module.
- Other measures of variability such as variance, standard error, and root mean squared error
- The difference between the standard deviation of a population and that of a sample. We have used the population standard deviation throughout this page. An explanation of the difference is here: https://www.khanacademy.org/math/statistics-probability/summarizing-quantitative-data/variance-standard-deviation-sample/a/population-and-sample-standard-deviation-review
- Statistical tests, including tests that compare mean or median values
- Arithmetic vs geometric mean
- Explanation of non-parametric measures of variability such as using the interquartile range around a median
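Because this page uses the population standard deviation, students who check their work in software may get a slightly different number if the tool defaults to the sample form. A minimal sketch of the difference, assuming Python's standard `statistics` module and hypothetical data made up for illustration:

```python
import statistics

# Hypothetical measurements, made up for illustration only.
data = [4.0, 6.0, 8.0, 10.0, 12.0]

# Population standard deviation: divides the summed squared
# deviations by n (the form used throughout this page).
population_sd = statistics.pstdev(data)

# Sample standard deviation: divides by n - 1, so it is slightly larger.
sample_sd = statistics.stdev(data)

print(round(population_sd, 3), round(sample_sd, 3))  # 2.828 3.162
```

The gap between the two values shrinks as the number of data points grows, which is one reason the distinction is easy to overlook with large data sets.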
Instructor resources
Support for teaching this quantitative skill
- Spreadsheet Warm Up for SSAC Geology of National Parks Modules is a Spreadsheets Across the Curriculum module that introduces students to electronic spreadsheets as a tool for elementary calculations. The module covers some basics, including the components of a spreadsheet, the necessity of an equals symbol for cell formulas, how the mathematical concept of function applies to spreadsheets, and a few mechanical things, such as copying and pasting.
- Several relevant sections of Introductory Statistics, a text shared under a CC BY 4.0 license, may be helpful:
  - Measures of the Center of the Data describes basic descriptive statistics such as mean, median, and mode and includes example problems.
  - Measures of the Spread of the Data describes standard deviation and includes example problems that use standard deviation to help illustrate the variability in different data sets.
  - Skewness and the Mean, Median, and Mode describes a normal, symmetrical distribution and illustrates why mean, median, and mode can be equal in such cases, followed by examples of skewed distributions and how the mean, median, and mode shift accordingly.
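The shift in mean, median, and mode described in the skewness section can be previewed with a small sketch like the one below. It assumes Python's standard `statistics` module; both data sets are hypothetical and were constructed only to make the pattern easy to see.

```python
import statistics

# Hypothetical data sets, made up for illustration only.
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
right_skewed = [1, 1, 1, 2, 2, 3, 4, 6, 16]

summary = {}
for name, data in [("symmetric", symmetric), ("right-skewed", right_skewed)]:
    # Store (mode, median, mean) for each data set.
    summary[name] = (
        statistics.mode(data),
        statistics.median(data),
        statistics.mean(data),
    )
    print(name, summary[name])

# symmetric: mode, median, and mean are all 3
# right-skewed: mode 1 < median 2 < mean 4
```

In the right-skewed set, the single large value pulls the mean well above the median, while the mode stays at the most common low value.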
Examples of activities that use this quantitative skill
- What Does the Mean Mean? Describing Eruptions at Riverside Geyser, Yellowstone National Park is a module by Tom Juster set at the Old Faithful geyser in Yellowstone, in which students work with data on eruption intervals. The module uses Excel. Students build a histogram and find a bimodal distribution of the data. Students work with the mean, median, and mode of the eruption timings, and they discover that the mean (or median) value seldom occurs.
- Nitrate Levels in the Rock Creek Park Watershed, Washington DC, 1: Measures of Central Tendency is a Spreadsheets Across the Curriculum activity, which is the first of a two-part series using water-quality data from Rock Creek Park in Washington DC to illustrate elementary statistical measures and exploratory data analysis. The data set illustrates how the mode, median, and mean differ in a positively skewed distribution.
- Nitrate Levels in the Rock Creek Park Watershed, Washington DC, 2: Variability is the second of a two-part series illustrating elementary statistical measures of exploratory data analysis in the context of water-quality data for a stream in an urban park. This module focuses on measures of variability (variance, standard deviation), which are used to examine an outlier value – a nitrate measurement twice as large as the next highest values.
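Two ideas from the activities above, a bimodal data set whose mean seldom occurs and an outlier that inflates the standard deviation, can be sketched in a few lines before students open the spreadsheets. The numbers below are hypothetical and are not taken from the geyser or nitrate data sets; the sketch assumes Python 3.8 or later for `statistics.multimode`.

```python
import statistics

# Hypothetical bimodal intervals (arbitrary units), made up for illustration.
intervals = [60, 62, 61, 60, 90, 91, 90, 92]
interval_mean = statistics.mean(intervals)
interval_modes = statistics.multimode(intervals)  # every most-common value

# The mean falls between the two clusters and matches no observation.
print(interval_mean, interval_modes, interval_mean in intervals)

# Hypothetical concentrations (arbitrary units); the last value is
# twice the next-highest value, mimicking a single outlier.
with_outlier = [1.0, 1.2, 1.5, 1.8, 2.0, 4.0]
without_outlier = with_outlier[:-1]

for label, data in [("with outlier", with_outlier),
                    ("without outlier", without_outlier)]:
    # Population standard deviation, matching the convention of this page.
    print(label,
          round(statistics.mean(data), 3),
          round(statistics.median(data), 3),
          round(statistics.pstdev(data), 3))
```

Comparing the two printed rows shows that removing one extreme value changes the mean and standard deviation more than it changes the median.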