How to Teach using Data Simulations

Research that examines the effect of simulations on student outcomes suggests that even 'well-designed' simulations are unlikely to be effective if the interaction with the student isn't carefully structured (Lane & Peres, 2006). Consequently, how simulations are used is of great importance.


Simulations can involve physical materials (drawing items from a bag, tossing coins, sampling candies) or they can involve generating data on the computer (drawing samples from a population or generating data based on a probability model). Even when using computer simulations, Rossman and Chance (2006) suggest always beginning with a concrete simulation (e.g., having students take random samples of words from the Gettysburg Address before taking simulated samples using their Sampling Words applet, or having students take physical samples of Reese's Pieces candies before using a web applet to simulate samples of candies).


Effective Ways to Use Simulations

Regardless of whether the simulation is based on concrete materials, a computer program, or a web applet, there are several suggested ways to use simulations to enhance students' learning. These include:
  • Give students a problem to discuss, ask them to make a prediction about the answer, then simulate data to test their predictions (e.g., predicting the average family size if a country adopts a One Son policy).
  • Ask students to predict what will happen under certain conditions, then test it out (e.g., what will happen to the shape of a sampling distribution if the sample size is increased).
  • Ask students to come up with rules for certain phenomena (e.g., what factors affect the width of a confidence interval and why).
  • Ask students to create a model and use it to simulate data to test whether a particular outcome is due to chance or due to some other factor (e.g., simulate data for outcomes of fair coin tosses and use it to test whether a coin balanced on its edge is just as likely to land heads up as heads down).
  • Ask students to run a simulation to discover an important idea (e.g., take random samples of words and create a distribution of mean word lengths, then compare it to a distribution of mean word lengths generated by judgmental samples taken by students, to learn that random samples, unlike judgmental samples, tend to be representative of the population).
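
The predict-then-test approach in the second bullet (what happens to a sampling distribution as the sample size increases) can be tried on a computer with a few lines of code. The following is a minimal sketch, not a replacement for the applets named in this guide. The skewed population, the function name sample_means, and the sample sizes are all illustrative assumptions.

```python
import random
import statistics

def sample_means(population, sample_size, num_samples, rng):
    """Draw num_samples random samples (with replacement) and return each sample's mean."""
    return [
        statistics.mean(rng.choices(population, k=sample_size))
        for _ in range(num_samples)
    ]

rng = random.Random(1)  # fixed seed so the whole class sees the same run

# ASSUMPTION: a hypothetical right-skewed population (e.g., waiting times in minutes).
population = [rng.expovariate(1 / 10) for _ in range(10_000)]

# Students predict how the center and spread will change, then check.
for n in (2, 10, 50):
    means = sample_means(population, n, 1000, rng)
    print(
        f"n={n:>2}  mean of sample means={statistics.mean(means):.2f}  "
        f"SD of sample means={statistics.stdev(means):.2f}"
    )
```

Before running the code, students can be asked to predict whether the spread of the sample means will grow, shrink, or stay the same as n increases, and then compare the printed standard deviations with their predictions.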


Cautions about Using Simulations with Students

Here are some practical considerations to keep in mind when designing or using activities involving simulations.
  • The best-designed simulation will be ineffective if students are not engaged or get lost in the details and directions. Assigning students to groups with designated roles when using an activity involving simulation can help them divide up the work; for example, one student reads the directions while another enters or analyzes the data.
  • It is important to structure good discussions about the use and results of simulations to allow students to draw appropriate conclusions. Designing questions that promote reflection or controversy can lead to good discussions. Also, having students make predictions which are tested can lead students to discuss their reasoning as they argue for different predicted results.
  • Select technology that facilitates student interaction and is accessible for students. It is crucial that the focus remains on the statistical concept and not on the technology. Consequently, technology should be chosen in light of the students' backgrounds, course goals, and teacher knowledge.
  • Select technology tools that allow for quick, immediate, and visual feedback. Examples of technology found to be especially useful here are Fathom software, Java applets, and Sampling SIM software.
  • Integrate the simulations throughout the course. This allows students to see simulation as a regular tool for analysis and not just something for an in-class activity.


Using a Visual Model to Illustrate the Simulation Process

Keeping track of populations (or random variables), samples, and sample statistics can be confusing to students when running certain simulations. It can be useful to use a graphical diagram to illustrate what is happening when simulating data, helping students to distinguish between population, samples, and distributions of sample statistics. The Simulation Process Model (Lane-Getaz, 2006b) can be used for this purpose.


The Simulation Process Model provides a framework for students to develop a deeper understanding of the simulation process through visualization. The first tier of the model represents the population and its associated parameter. The second tier of the model represents a given number of samples drawn from the population and their associated statistics. The third tier of the model represents the distribution of sample statistics (i.e., the sampling distribution).


The Sampling Reese's Pieces activity provides a good example of how the Simulation Process Model might be used. In this activity students use an applet to simulate samples of candies, while a graph of the distribution of the proportion of orange candies is dynamically generated. The population of candies (shown in a candy machine) would be the first tier of the model. Multiple random samples of 25 candies and the proportion of orange candies in each sample make up the second tier of the model. Finally, the distribution of the sample proportions of orange candies makes up the third (and bottom) tier of the model. Sharing this model with students after they complete the simulation activity can help them better understand the simulation and distinguish between the different levels of data.
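
The three tiers can also be made explicit in code. The sketch below is one possible illustration and is not the applet used in the activity. The population proportion of 0.45, the number of samples, and the function names are assumptions chosen for the example; only the sample size of 25 comes from the activity.

```python
import random
from collections import Counter

rng = random.Random(2006)  # fixed seed for a reproducible demonstration

# Tier 1: the population and its parameter.
# ASSUMPTION: 0.45 is an illustrative value, not a figure from the activity.
TRUE_PROPORTION_ORANGE = 0.45

def draw_sample(n, rng):
    """Tier 2: one random sample of n candies (True means the candy is orange)."""
    return [rng.random() < TRUE_PROPORTION_ORANGE for _ in range(n)]

def sample_proportion(sample):
    """Tier 2: the statistic computed from a single sample."""
    return sum(sample) / len(sample)

# Tier 3: the distribution of the statistic across many samples of 25 candies.
proportions = [sample_proportion(draw_sample(25, rng)) for _ in range(500)]

# Crude text histogram of the sampling distribution (each * is 5 samples).
counts = Counter(proportions)
for p in sorted(counts):
    print(f"{p:.2f} {'*' * (counts[p] // 5)}")
```

Students can be asked to label which variable in the code corresponds to each tier of the model before the histogram is shown.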


Examples of Simulation Activities

Generating Data by Specifying a Probability Model

In the One Son Policy simulation, students are first presented with a research question about the consequences of a One Son policy, under which families continue to have children until they have one boy and then stop. Students are asked to make conjectures about the average family size and the ratio of boys to girls under this policy. They then simulate the policy using coins and a computer applet, and compare their conjectures to their observed results. Through this simulation students gain a deeper understanding of the processes associated with probability models.
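
For instructors who want a scripted version alongside the coins and applet, the policy can be sketched as follows. This is an illustration under stated assumptions, not the applet's implementation: each birth is treated as independent with a probability of 0.5 of being a boy, and the function name simulate_family and the number of families are hypothetical choices.

```python
import random

def simulate_family(rng, p_boy=0.5):
    """Simulate one family that has children until the first boy.

    Returns a (boys, girls) tuple; boys is always 1 under this policy.
    ASSUMPTION: births are independent and each is a boy with probability p_boy.
    """
    girls = 0
    while rng.random() >= p_boy:  # this birth is a girl, so the family continues
        girls += 1
    return 1, girls

rng = random.Random(42)  # fixed seed for a reproducible demonstration
families = [simulate_family(rng) for _ in range(10_000)]

total_boys = sum(boys for boys, _ in families)
total_girls = sum(girls for _, girls in families)
average_family_size = (total_boys + total_girls) / len(families)

print(f"Average number of children per family: {average_family_size:.2f}")
print(f"Girls per boy: {total_girls / total_boys:.2f}")
```

Under these assumptions, probability theory gives an expected family size of 1/0.5 = 2 children and equal expected numbers of boys and girls, which students can compare with both their conjectures and the simulated values.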


Hypothesis Testing and Inference

In the Coke vs. Pepsi Taste Test Challenge, students first design and conduct an experiment in which their peers take part in a blind taste test. Students collect and analyze data on whether their peers can detect the difference between the colas, then use simulation to generate data against which to compare their results.
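
One way to generate comparison data is to simulate tasters who are purely guessing and see how often guessing alone does as well as the class did. The sketch below assumes a design in which a guessing taster is correct with probability 0.5; the appropriate probability depends on the design students choose. The counts of 20 tasters and 15 correct identifications are hypothetical, not results from the activity.

```python
import random

def simulate_guessing(num_tasters, p_guess, rng):
    """Return how many of num_tasters are correct if every taster is just guessing."""
    return sum(rng.random() < p_guess for _ in range(num_tasters))

rng = random.Random(7)  # fixed seed for a reproducible demonstration

# ASSUMPTIONS: hypothetical class results and a 50% chance of guessing correctly.
NUM_TASTERS = 20
OBSERVED_CORRECT = 15
P_GUESS = 0.5
NUM_TRIALS = 10_000

simulated = [simulate_guessing(NUM_TASTERS, P_GUESS, rng) for _ in range(NUM_TRIALS)]

# Proportion of simulated classes that did at least as well as the observed class.
p_value = sum(count >= OBSERVED_CORRECT for count in simulated) / NUM_TRIALS
print(f"Proportion of guessing-only trials with >= {OBSERVED_CORRECT} correct: {p_value:.3f}")
```

A small proportion suggests the observed result would be unusual if tasters were only guessing, which gives students a concrete basis for discussing whether their peers can tell the colas apart.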


Sampling from a Population

In the Reese's Pieces Activity, students first predict the proportion of orange candies in the population of Reese's Pieces, then randomly sample 25 candies and record the proportion of orange candies, and finally simulate data using an applet.


Assessment of Learning after a Simulation Activity

There are different ways student learning can be assessed after using a simulation activity. These include:
  • Assessing students' understanding of what the simulation is illustrating. For example, do students now understand the meaning of a 95% confidence interval?
  • Assessing whether students can apply their learning to a different problem or context, such as critiquing a research finding that includes a confidence interval, where students judge whether the term margin of error is used correctly.
  • Assessing if students understand the simulation process. For example, students can be given the Simulation Process Model and be asked to map the different levels of data from a simulation activity to the three tiers of the model.
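
For the first assessment example above, a short simulation can give students something concrete to interpret when asked what "95%" refers to. The sketch below is an illustration only. It assumes a normally distributed population with a known standard deviation, so that a z-interval applies, and the population values, sample size, and function name are hypothetical.

```python
import math
import random
import statistics

def interval_covers_mean(rng, true_mean, sigma, n, z=1.96):
    """Draw one sample and report whether its 95% z-interval contains true_mean.

    ASSUMPTION: the population is normal with known standard deviation sigma.
    """
    sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
    sample_mean = statistics.mean(sample)
    margin_of_error = z * sigma / math.sqrt(n)
    return sample_mean - margin_of_error <= true_mean <= sample_mean + margin_of_error

rng = random.Random(95)  # fixed seed for a reproducible demonstration
NUM_INTERVALS = 2000

hits = sum(
    interval_covers_mean(rng, true_mean=100, sigma=15, n=30)
    for _ in range(NUM_INTERVALS)
)
coverage = hits / NUM_INTERVALS
print(f"Intervals that captured the true mean: {coverage:.1%}")
```

An assessment question might ask students to explain why the printed percentage is close to 95% and what that says about the method rather than about any single interval.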