# Frequency of Large Earthquakes -- Introducing Some Elementary Statistical Descriptors

This material is replicated on a number of sites as part of the SERC Pedagogic Service Project

## Summary

In this Spreadsheets Across the Curriculum activity, students examine the number of large (magnitude 7 or larger) earthquakes per year in the 30-year period, 1970-1999. They build spreadsheets to find the mean, median, modes, max, min, range, variance, standard deviation, interquartile range, and a variety of percentiles. They compare three ways of determining quartiles. They compare the frequency distribution of earthquakes per year with the normal distribution using percentiles calculated from the data. In the end-of-module assignments, they use their spreadsheets to explore the data from the preceding 30 years (1940-1969) and the whole 60-year period (1940-1999). The data are from the QELP (Quantitative Environmental Learning Project) Website. The module includes links to QELP and some U.S. Geological Survey sites about earthquake frequency and magnitude.
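The comparison with the normal distribution can also be sketched outside a spreadsheet. The Python example below is an illustration only: the yearly counts are hypothetical placeholders (not the QELP data), and treating cumulative frequency as "the fraction of years at or below a given count" is an assumption rather than the module's stated method.

```python
from statistics import NormalDist, fmean, pstdev

# Hypothetical placeholder counts of large earthquakes per year (NOT the QELP data).
counts = [11, 13, 14, 15, 16, 18, 18, 20, 22, 26]

def cumulative_frequency(values):
    """Return (value, fraction of observations <= value) for each distinct value."""
    data = sorted(values)
    n = len(data)
    return [(x, sum(1 for v in data if v <= x) / n) for x in sorted(set(data))]

# Normal distribution with the same mean and (population) standard deviation.
normal = NormalDist(mu=fmean(counts), sigma=pstdev(counts))

# Print the observed cumulative fraction next to the normal prediction.
for x, observed in cumulative_frequency(counts):
    print(f"{x:3d}  observed={observed:.2f}  normal={normal.cdf(x):.2f}")
```

If the data are roughly normal, the two columns should track each other closely across the range of counts.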

## Learning Goals

- Gain experience in calculating measures of center (mean, median), spread (range, standard deviation, interquartile range), and character of a frequency distribution (quartiles, percentiles).
- Use Excel functions to do the same calculations easily.
- Calculate the median, quartiles, and other percentiles from data by linear interpolation.
- Plot the data as a cumulative frequency diagram and compare it to a plot of the cumulative normal distribution with the same mean and standard deviation.
- Compare the frequency of earthquakes in two 30-year periods and compare the statistics in a 30-year period with those of a 60-year period.
- Find that there are rather consistently 15-20 major (magnitude 7-7.9) earthquakes per year and one great (magnitude 8 or more) earthquake per year. They will also note that comparatively few of the large earthquakes occur in the U.S. They will be reminded that each step in the magnitude scale corresponds to a 32-fold increase in strength (energy release).

- Learn how to calculate basic descriptive statistics with Excel.
- Appreciate more the usefulness of proportional thinking (interpolation).
- Get a visual impression of a normal distribution both from nearly normally distributed data and from Excel's built-in normal distribution function.
- Get a sense from data that a large earthquake occurs somewhere in the world rather frequently (recurrence interval about 3 weeks).
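The two numerical claims in the goals above (a roughly 32-fold energy increase per magnitude step, and a recurrence interval of about 3 weeks) can be checked with a little arithmetic. The sketch below assumes the standard magnitude-energy relation log10(E) = 1.5 M + constant, and a rate of 17 large earthquakes per year, which is a hypothetical value chosen from within the range quoted above rather than a figure from the data. The function names are illustrative.

```python
def energy_ratio(magnitude_step):
    """Energy-release ratio for a magnitude difference,
    assuming log10(E) = 1.5 * M + constant."""
    return 10 ** (1.5 * magnitude_step)

def recurrence_interval_weeks(events_per_year):
    """Average time between events in weeks, assuming the
    events are spread evenly through the year."""
    weeks_per_year = 365.25 / 7
    return weeks_per_year / events_per_year

# One magnitude step: the ratio is 10 ** 1.5, usually rounded to 32.
print(round(energy_ratio(1), 1))

# Hypothetical rate of 17 large earthquakes per year.
print(round(recurrence_interval_weeks(17), 1))
```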

## Context for Use

I use this module in my Computational Geology course, GLY 4866 (Acrobat (PDF) 39kB Sep25 06). The class consists of students who anticipate graduating in three or fewer semesters. The class serves as the capstone for the mathematics courses required for the geology major (through Calculus 2).

As is typical of capstone courses in geology (e.g., field camp), GLY 4866 draws freely and without warning from material anywhere in the students' prior experience. Weekly reading assignments pace through a quantitative literacy textbook aimed at a general audience (Understanding Our Quantitative World by Janet Andersen and Todd Swanson). The book provides a way for the students to look back at their beginnings after ascending to their somewhat lofty position of weathering calculus. Thoughtful students come to appreciate "Everything I ever needed I learned in Kindergarten."

This module on earthquake frequency comes early in the semester, on the day on which reading Chapter 3 (Data) is due. I start the in-class session by handing out a page with the 30 years of data. The students divide up into groups to calculate the descriptive statistics with pencil, paper, and calculators (using brute-force methods, not magic keys with STD, for example).
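For readers who want to see what the brute-force calculation involves, here is a minimal Python sketch that follows the pencil-and-paper steps rather than calling a built-in statistics routine. The counts are hypothetical placeholders (not the handout data), and the `describe` helper is an illustrative name, not part of the module.

```python
from collections import Counter

# Hypothetical placeholder counts of large earthquakes per year (NOT the handout data).
counts = [11, 13, 14, 15, 16, 18, 18, 20, 22, 26]

def describe(values):
    """Compute basic descriptive statistics step by step."""
    data = sorted(values)
    n = len(data)
    mean = sum(data) / n
    mid = n // 2
    # Median: middle value, or the average of the two middle values.
    median = data[mid] if n % 2 else (data[mid - 1] + data[mid]) / 2
    # Modes: every value that occurs the maximum number of times.
    tally = Counter(data)
    top = max(tally.values())
    modes = [x for x, c in tally.items() if c == top]
    # Population variance: mean of the squared deviations (n in the denominator).
    variance = sum((x - mean) ** 2 for x in data) / n
    return {
        "mean": mean, "median": median, "modes": modes,
        "min": data[0], "max": data[-1], "range": data[-1] - data[0],
        "variance": variance, "std_dev": variance ** 0.5,
    }

for name, value in describe(counts).items():
    print(name, value)
```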

The students did not enjoy the calculation of standard deviation. They reported that they had been told different formulas in different courses (e.g., physics, chemistry, geology, biology). They were referring to n vs. n-1 in the denominator. So, we discuss the difference between sample and population, why I chose the population statistics (n in the denominator) in this module, and why I used 30 years of data. I had the impression that the students were somewhat taken aback by the notion that both the "n professors" and the "n-1 professors" could actually have been correct.
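The n vs. n-1 distinction is easy to see side by side. The sketch below uses hypothetical placeholder counts (not the module's data) and an illustrative `std_dev` helper. In Excel, the two versions correspond to the population and sample standard deviation functions (STDEV.P and STDEV.S in recent versions).

```python
import math

# Hypothetical placeholder counts of large earthquakes per year (NOT the module's data).
counts = [11, 13, 14, 15, 16, 18, 18, 20, 22, 26]

def std_dev(values, sample=False):
    """Standard deviation with n (population) or n-1 (sample) in the denominator."""
    n = len(values)
    mean = sum(values) / n
    sum_of_squares = sum((x - mean) ** 2 for x in values)
    denominator = n - 1 if sample else n
    return math.sqrt(sum_of_squares / denominator)

population_sd = std_dev(counts)            # n in the denominator
sample_sd = std_dev(counts, sample=True)   # n-1 in the denominator

print(f"population: {population_sd:.3f}")
print(f"sample:     {sample_sd:.3f}")
```

The two results differ by a factor of sqrt((n-1)/n), so the gap shrinks as the number of observations grows.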

The students were annoyed that there were three seemingly reasonable ways to calculate the first and third quartiles and that they produced different results in this case. The more vocal students knew about the 25th and 75th percentiles for Q1 and Q3, respectively, but were less confident about calculating them from the data. The calculation provided a good entry to interpolation and proportions and set the stage nicely for Chapter 6 on linear functions. Next time around, I will allow more time for a side exercise to emphasize the concept of linear interpolation more explicitly.
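The text here does not name the three quartile methods used in the module, so the sketch below shows three common conventions as an assumption: linear interpolation at position p(n-1) (the convention of Excel's QUARTILE.INC), interpolation at position p(n+1) (QUARTILE.EXC, via Python's `statistics.quantiles`), and the medians of the lower and upper halves. The counts are hypothetical placeholders, and the helper names are illustrative.

```python
import statistics

# Hypothetical placeholder counts of large earthquakes per year (NOT the module's data).
counts = [11, 13, 14, 15, 16, 18, 18, 20, 22, 26]

def interpolated_percentile(values, p):
    """Percentile by linear interpolation at 0-based position p * (n - 1), 0 <= p <= 1."""
    data = sorted(values)
    pos = p * (len(data) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(data) - 1)
    frac = pos - lo
    # Move the fraction `frac` of the way from data[lo] to data[hi].
    return data[lo] + frac * (data[hi] - data[lo])

def quartiles_from_halves(values):
    """Q1 and Q3 as medians of the lower and upper halves (middle value excluded if n is odd)."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half:] if n % 2 == 0 else data[half + 1:]
    return statistics.median(lower), statistics.median(upper)

inclusive = (interpolated_percentile(counts, 0.25), interpolated_percentile(counts, 0.75))
q = statistics.quantiles(counts, n=4, method="exclusive")
exclusive = (q[0], q[2])
halves = quartiles_from_halves(counts)

print("interpolation at p(n-1):", inclusive)
print("interpolation at p(n+1):", exclusive)
print("medians of halves:      ", halves)
```

All three are defensible definitions; they simply place the quartile at slightly different positions between neighboring data values, which is why the results can disagree for a small data set.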

The module goes live on Blackboard during the in-class session. The students work through the module as homework. The end-of-module assignments are due in a week.

## Description and Teaching Materials

SSAC2006.QE531.LV1.6-student (PowerPoint 412kB Jul26 10)

The module is a PowerPoint presentation with embedded spreadsheets. If the embedded spreadsheets are not visible, save the PowerPoint file to disk and open it from there.

This PowerPoint file is the student version of the module. An instructor version is available by request. The instructor version includes the completed spreadsheet. Send your request to Len Vacher (vacher@usf.edu) by filling out and submitting the Instructor Module Request Form.

## Teaching Notes and Tips

## Assessment

The end-of-module questions can be used for assessment.

The instructor version contains a pre-test.