Nitrate Levels in the Rock Creek Park Watershed, Washington DC, 2: Variability

Module by: Mark C. Rains and Len Vacher, University of South Florida

Marian Norris, National Park Service, Center for Urban Ecology

Cover Page by: Len Vacher and Denise Davis, University of South Florida


This material was originally developed by Spreadsheets Across the Curriculum as part of its collaboration with the SERC Pedagogic Service.

Summary

This Spreadsheets Across the Curriculum activity is the second of a two-part series illustrating elementary statistical measures of exploratory data analysis in the context of water-quality data for a stream in an urban park. The first module focused on measures of central tendency for nitrate and total phosphorus data. This module focuses on measures of variability (variance, standard deviation), which are used to examine an outlier value: a nitrate measurement twice as large as the next-highest values. Students work with the entire nitrate dataset (2006-2007), calculate variance and standard deviation from their definitions, calculate them again with the built-in functions, determine the z-score of each measurement, plot z-score against measured value to see the distribution, and determine that the high value is indeed an outlier. Without taking a position on the cause of the high value (unusual event or technical error), students explore its effect on the summary statistics they have calculated in the two modules.
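The calculation sequence described above can be sketched outside a spreadsheet. The Python example below is only an illustration, not the module's own workbook: the nitrate values are hypothetical (they are not the Rock Creek measurements), and the sample form of the variance (dividing by n - 1) is an assumption, since the summary does not say which form the module uses.

```python
import math
import statistics

# Hypothetical nitrate concentrations (mg/L) for illustration only;
# these are NOT the Rock Creek Park measurements.
nitrate = [1.2, 1.5, 1.1, 1.8, 1.4, 1.6, 1.3, 5.0]

n = len(nitrate)
mean = sum(nitrate) / n

# "Brute force" from the definition: sum of squared deviations from the
# mean, divided by (n - 1). The sample form is an assumption here.
variance_def = sum((x - mean) ** 2 for x in nitrate) / (n - 1)
std_def = math.sqrt(variance_def)

# The same quantities from built-in functions (also the sample form).
variance_builtin = statistics.variance(nitrate)
std_builtin = statistics.stdev(nitrate)

# z-score: distance of each value from the mean, in standard deviations.
z_scores = [(x - mean) / std_def for x in nitrate]

for x, z in zip(nitrate, z_scores):
    print(f"nitrate = {x:4.1f}   z = {z:6.2f}")
```

In this made-up sample the largest value also has the largest z-score, which is the pattern the module asks students to look for in the real data.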

This material is based upon work supported by the National Science Foundation under Grant Number NSF DUE-0836566. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

Learning Goals

Students will:

  • Examine an outlier in the nitrate data they plotted in the first module of this two-part series.
  • Calculate the variance and standard deviation of the nitrate data two ways: by brute force from the definitions, and the easy way using Excel's built-in functions.
  • Calculate z-scores for the nitrate values.
  • Do a side calculation using the normal frequency distribution and the normal distribution function to determine the probability of measurements falling within one, two, and three standard deviations of the mean (68, 95, and 99.7%, respectively) for normally distributed data.
  • See visually that these data are not normal (the data are positively skewed), but nevertheless calculate that the z-score of the outlier value is impressive (more than 6).
  • Consider the effect on the summary statistics (mean, median, mode, range, variance, standard deviation) if the outlier were disregarded.
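The 68, 95, and 99.7% figures in the list above can be checked with a short calculation. The module does this with Excel's normal distribution function; the Python sketch below substitutes the error function from the standard library, using the identity P(|Z| ≤ k) = erf(k/√2) for a normally distributed variable. The function name `prob_within` is a hypothetical helper introduced for this example.

```python
import math

def prob_within(k: float) -> float:
    """Probability that a normally distributed value lies within
    k standard deviations of its mean (hypothetical helper)."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {100 * prob_within(k):.1f}%")
# prints 68.3%, 95.4%, and 99.7% for k = 1, 2, 3
```

These probabilities apply only to normally distributed data, which is why the module also asks students to check the shape of the nitrate distribution.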

In the process the students will:

  • Get familiar with the variance and standard deviation and see some uses for them (expressions of variability; a way of expressing distance from the mean; a way to evaluate an outlier).

  • See that range and mean are sensitive to outliers.

  • Learn where the magical 68, 95, and 99.7% values come from, and that not all distributions are normal.
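The sensitivity of the mean and range to an outlier, noted in the list above, can be seen by computing the summary statistics with and without the extreme value. The sketch below is an illustration in Python rather than the module's spreadsheet; the values are hypothetical, `summarize` is a helper name invented for this example, and the sample (n - 1) variance is assumed. The mode is left out because these made-up values do not repeat.

```python
import statistics

def summarize(values):
    """Return summary statistics for a list of numbers (hypothetical helper)."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "range": max(values) - min(values),
        "variance": statistics.variance(values),  # sample form (n - 1), assumed
        "stdev": statistics.stdev(values),
    }

# Hypothetical nitrate concentrations (mg/L); NOT the Rock Creek data.
nitrate = [1.2, 1.5, 1.1, 1.8, 1.4, 1.6, 1.3, 5.0]

# Drop the single largest value to mimic disregarding the outlier.
largest = max(nitrate)
without_outlier = [x for x in nitrate if x != largest]

with_stats = summarize(nitrate)
without_stats = summarize(without_outlier)

for name in with_stats:
    print(f"{name:9s} with: {with_stats[name]:6.3f}   without: {without_stats[name]:6.3f}")
```

With these made-up numbers the mean, range, variance, and standard deviation all drop sharply when the largest value is removed, while the median barely moves.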

Context for Use

This module is designed for potential use in the Geology of National Parks service course at USF. The course is offered as an online course every semester. It includes readings from Parks and Plates, weekly quizzes based on that textbook, and weekly student activities designed to align the course with the University's general education requirements. This module is intended to be one of those activities, with the specific goal of meeting the gen-ed quantitative literacy dimension.


Description and Teaching Materials

The module is a PowerPoint presentation with embedded spreadsheets. Click on the link below to download a copy of the module.

Optimal results are achieved with Microsoft Office 2007 or later; the module will function in earlier versions with slight cosmetic compromises. If the embedded spreadsheets are not visible, save the PowerPoint file to disk and open it from there.

The above PowerPoint presentation file is the student version of the module. It includes a template for students to use to complete the spreadsheet(s) and answer the end-of-module questions, and then turn in for grading.

An instructor version is available by request. The instructor version includes the completed spreadsheet. Send your request to Len Vacher (vacher@usf.edu) by filling out and submitting the Instructor Module Request Form.

Teaching Notes and Tips

The module is constructed to be a stand-alone resource. It can be used as a homework assignment, lab activity, or as the basis of an interactive classroom activity. The Rock Creek 1 and 2 set was used as an out-of-class activity in Computational Geology (a QL course for geology majors) in Fall 2010 and Fall 2011 after the students had worked through several other modules, notably the two about Yellowstone geysers involving histograms. In general, the students considered these modules to be among the more challenging of the collection, but within their range of expectations for level of difficulty. They have not been implemented in the introductory-level Geology of National Parks course.

Assessment

A slide at the end of the presentation contains end-of-module questions, which can be used to examine student understanding and learning gains from the module. A pre/post test, its answer key, and the answer key for the end-of-module questions are at the end of the instructor version of the module.

References and Resources

Rock Creek Park

US National Park Service (NPS)
