What is Teaching with Data Simulations?
Teaching with data simulations means giving students opportunities to simulate data in order to answer a particular research question or solve a statistical problem. There are several ways to use simulations. These include:
- Physical simulations of a process (e.g., taking repeated samples of Reese's Pieces candies to simulate a sampling distribution for the proportion of orange candies, or tossing coins to model births in order to estimate the effect of a "One-Son" policy on average family size).
- Simulating a game or situation to estimate the chances of certain outcomes (e.g., playing Let's Make a Deal to estimate the chances of winning using two different strategies, to determine which is the better strategy).
- Using a probability model to simulate data to estimate the chance of a particular outcome (e.g., the chance of getting "three of a kind" when dealing five cards, or the chance of getting five heads on six tosses of a fair coin).
- Simulating data while varying parameters to illustrate a concept or deepen students' understanding of a process (e.g., simulating confidence intervals from different populations while varying sample size, level of confidence, or standard deviation).
- Using simulation to generate data under a certain theory to test whether a particular outcome is surprising (e.g., a student correctly identifies 8 out of 10 samples of cola in a blind taste test; simulation can determine whether this result could be due to chance alone by generating data based on what would be expected if the person were guessing, and comparing the student's result to the simulated sampling distribution).
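As a rough illustration of the last two uses in the list, the following Python sketch estimates the chance of five heads in six tosses of a fair coin and the chance that a pure guesser identifies 8 or more of 10 cola samples. The function names are hypothetical, "five heads" is read here as exactly five, and the taste test is assumed to offer two colas so that a guess is right half the time; the original example does not specify these details.

```python
import random

def simulate_successes(n_trials, p_success, rng):
    """Count successes in n_trials independent trials, each with chance p_success."""
    return sum(rng.random() < p_success for _ in range(n_trials))

def estimate_probability(event, n_trials, p_success, n_reps=50_000, seed=1):
    """Estimate the chance that event(count) is True by repeating the simulation."""
    rng = random.Random(seed)  # fixed seed so a class can reproduce the run
    hits = sum(
        event(simulate_successes(n_trials, p_success, rng))
        for _ in range(n_reps)
    )
    return hits / n_reps

# Chance of exactly five heads in six tosses of a fair coin.
five_heads = estimate_probability(lambda k: k == 5, n_trials=6, p_success=0.5)

# Taste test: how often would a pure guesser get 8 or more of 10 correct?
# Assumption: each guess is right with probability 0.5 (a two-cola test).
guess_8_plus = estimate_probability(lambda k: k >= 8, n_trials=10, p_success=0.5)

print(f"Estimated P(5 heads in 6 tosses): {five_heads:.3f}")
print(f"Estimated P(8 or more of 10 by guessing): {guess_8_plus:.3f}")
```

Students can compare these estimates with the exact binomial values, 6/64 (about 0.094) and 56/1024 (about 0.055). Under the guessing assumption, a result of 8 out of 10 occurs only about 5 or 6 times in 100, which gives students a concrete basis for deciding whether it is surprising.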
Illustrating Difficult and Abstract Concepts
Simulating data serves many purposes in the practice of statistics, but it is especially helpful in the classroom for concepts that students have traditionally found very difficult to learn. Probability is one such area. Simulations let students visualize probability distributions, which in turn makes the processes associated with probability more concrete. Hypothesis testing and inference are likewise important and difficult. As noted above, simulations can build a deeper and more concrete understanding of these topics by giving students a visual distribution of simulated data against which to compare a sample result, or by showing how many simulated confidence intervals do or do not capture the true population mean. Similarly, students often struggle with sampling distributions and with the process of sampling from a population, so simulations can also improve their understanding of sampling and the Central Limit Theorem.
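The idea of counting how many simulated confidence intervals capture the true population mean can be sketched in a few lines of Python. This is only one possible setup, not a prescribed classroom activity: the function name `ci_coverage`, the normal population with mean 50 and standard deviation 10, and the use of a z-interval with a known standard deviation are all assumptions made for illustration.

```python
import random
from statistics import NormalDist

def ci_coverage(mu, sigma, n, confidence, n_reps=5_000, seed=1):
    """Return (share of intervals capturing mu, interval width).

    Draws n_reps samples of size n from a normal population and builds a
    z-interval around each sample mean, treating sigma as known.
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # e.g. about 1.96 for 95%
    margin = z * sigma / n ** 0.5
    captured = 0
    for _ in range(n_reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        if sample_mean - margin <= mu <= sample_mean + margin:
            captured += 1
    return captured / n_reps, 2 * margin

# Vary sample size and confidence level (hypothetical population: mean 50, sd 10).
for n in (10, 40):
    for confidence in (0.90, 0.95):
        coverage, width = ci_coverage(mu=50, sigma=10, n=n, confidence=confidence)
        print(f"n={n:2d}  level={confidence:.2f}  "
              f"captured={coverage:.3f}  width={width:.2f}")
```

Running the loop lets students see that the share of intervals capturing the mean stays close to the stated confidence level, while the intervals narrow as the sample size grows and widen as the confidence level rises.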