Nature of the chi-square distribution
This activity has undergone anonymous peer review.
This activity was anonymously reviewed by educators with appropriate statistics background according to the CAUSE review criteria for its pedagogic collection.
This page first made public: May 17, 2007
This material is replicated on a number of sites as part of the SERC Pedagogic Service Project
Summary
In this activity, students learn the true nature of the chi-square and F distributions through lecture notes (PowerPoint file) and an Excel simulation. This leads to a discussion of the properties of the two distributions. Once the sum-of-squares aspect is understood, it is only a short logical step to explain why a sample variance has a chi-square distribution and a ratio of two variances has an F-distribution.
In a subsequent activity, instances in which the chi-square and F-distributions are related to the normal or t-distributions (e.g. chi-square = z^{2}, F = t^{2}) will be illustrated. Finally, the activity will conclude with a brief overview of important applications of the chi-square and F distributions, such as goodness-of-fit tests and analysis of variance.
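The "short logical step" can be written out as a worked sketch of the standard results. It assumes independent observations X_1, ..., X_n from a normal population with variance sigma^2 and, for the F ratio, two independent samples. The symbols Z_i, S^2, n_1 and n_2 are notation introduced here for illustration and are not taken from the activity files.

```latex
\begin{align*}
Z_1,\dots,Z_k \overset{\text{iid}}{\sim} N(0,1)
  &\;\Longrightarrow\; \sum_{i=1}^{k} Z_i^{2} \sim \chi^{2}_{k},\\[4pt]
S^{2} = \frac{1}{n-1}\sum_{i=1}^{n}\bigl(X_i-\bar{X}\bigr)^{2}
  &\;\Longrightarrow\; \frac{(n-1)\,S^{2}}{\sigma^{2}} \sim \chi^{2}_{n-1},\\[4pt]
\frac{S_1^{2}/\sigma_1^{2}}{S_2^{2}/\sigma_2^{2}}
  &\sim F_{\,n_1-1,\;n_2-1}.
\end{align*}
```

The sample variance is a scaled sum of squared normal deviations, with one degree of freedom lost because the mean is estimated from the same data. The F statistic is then a ratio of two such independent chi-square quantities, each divided by its degrees of freedom.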
Learning Goals
In the second activity, students will be encouraged to explore these relationships and to discover equivalent statistical tests that can be used in specific situations. The relationship between the chi-square and z distributions will be underscored by demonstrating that, when the equality of two population proportions is tested by two different methods, the computed chi-square value is in fact the square of the normal-distribution z-value for the corresponding test.
Likewise, in simple linear regression a test of hypothesis for the slope of a regression line can be performed using either a t-test or an F-test, where the computed F-value is the square of the corresponding t-value.
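The equivalence of the two tests for equal proportions can be checked numerically. The following is a minimal sketch, not the contents of the activity's Excel worksheet. It assumes a pooled standard error for the z-test and no continuity correction for the chi-square test, and the counts (45 of 100 versus 30 of 100) and function names are made up for illustration.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: p1 == p2, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

def chi_square_2x2(x1, n1, x2, n2):
    """Pearson chi-square statistic for the 2x2 table (no continuity correction)."""
    observed = [[x1, n1 - x1], [x2, n2 - x2]]
    row_totals = [n1, n2]
    col_totals = [x1 + x2, (n1 - x1) + (n2 - x2)]
    total = n1 + n2
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2

# Hypothetical data: 45 successes out of 100 versus 30 out of 100.
z = two_proportion_z(45, 100, 30, 100)
chi2 = chi_square_2x2(45, 100, 30, 100)
print(f"z = {z:.4f}, z^2 = {z ** 2:.4f}, chi-square = {chi2:.4f}")
```

For these counts both z^2 and the chi-square statistic come out at about 4.8. The identity relies on the pooled standard error, so it would not hold exactly if the z-test used separate (unpooled) variances.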
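The same kind of check works for the regression case. This is a minimal sketch under the usual simple linear regression assumptions (one predictor, an intercept, least-squares fit). The data values and the function name `slope_t_and_f` are hypothetical and chosen only for illustration.

```python
import math

def slope_t_and_f(x, y):
    """Return (t, F) for testing H0: slope == 0 in simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Error sum of squares (n - 2 degrees of freedom).
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # Regression sum of squares (1 degree of freedom).
    ssr = slope ** 2 * sxx
    mse = sse / (n - 2)
    t = slope / math.sqrt(mse / sxx)  # slope divided by its standard error
    f = ssr / mse                     # MSR / MSE, with MSR = SSR / 1
    return t, f

# Hypothetical data, roughly y = 2x with some noise.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.7]
t, f = slope_t_and_f(x, y)
print(f"t = {t:.4f}, t^2 = {t ** 2:.4f}, F = {f:.4f}")
```

Algebraically, t^2 = slope^2 * Sxx / MSE = SSR / MSE = F, so the two printed values agree up to rounding error.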
Context for Use
The activity can be undertaken at different levels and with different degrees of rigor. At the simplest level it is used to introduce the chi-square distribution as the sampling distribution of a sample variance. This is necessary for inferences concerning a population variance (confidence interval, test of hypothesis). This activity consists of using software (Excel, Minitab, Fathom, ...) to generate random samples of a normal variate and then to show that the resulting sum of X and sum of X^{2} approximate a Student's t-distribution and a chi-square distribution, respectively.
The simplest activity, as described above, will work well in a large class as a demonstration, or in a computer lab as a hands-on activity, and can be accomplished in one class period.
The second activity introduces more sophisticated connections involving the chi-square and F-distributions, and shows how these can be demonstrated through goodness-of-fit tests, ANOVA and regression analyses. These more advanced procedures are appropriate in the latter part of the introductory course or early in the second course when discussing tests of hypothesis for population variances, goodness-of-fit tests, and regression analysis.
Description and Teaching Materials
The activity can be introduced in lecture format with the PowerPoint file, which describes the chi-square distribution as the sum of squares of values selected from a normal distribution and shows the relationship with the Student t-distribution represented by the sum of the same 10 values.

A simulation can then be demonstrated using the Excel file containing 2,000 samples (rows) of data in which each sample contains 10 randomly generated values from a standard normal distribution. Each row also shows the sum of the 10 values and the sum of the squares of the 10 values. Histograms are drawn for both the sum and the sum of squares.
Pressing the F9 (recalculate) key in Excel causes the entire spreadsheet to be recalculated and the histograms to be redrawn. It is evident from the simulation that the sum of the 10 values generates an almost symmetrical distribution, approximated by Student's t-distribution, while the sum of squares of the 10 values generates a positively skewed histogram, consistent with the chi-square distribution.
A second Excel file includes three worksheets demonstrating (1) the relationship between the chi-square and z distributions in a test for equal proportions and (2) the relationship between the F and t distributions in ANOVA and regression examples.
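For readers without Excel, the single-sample simulation described above (2,000 rows of 10 standard normal values, with a sum and a sum of squares per row) can be approximated as follows. This is a sketch rather than a reproduction of ChiSquareSimulation.xls. The seed, the function names and the skewness summary are assumptions added here, and the histograms are replaced by numerical summaries.

```python
import random
import statistics

def simulate(n_samples=2000, k=10, seed=0):
    """Generate n_samples rows of k standard normal values.

    Returns two lists: the row sums and the row sums of squares.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    sums, sums_sq = [], []
    for _ in range(n_samples):
        row = [rng.gauss(0.0, 1.0) for _ in range(k)]
        sums.append(sum(row))
        sums_sq.append(sum(v * v for v in row))
    return sums, sums_sq

def skewness(values):
    """Moment-based sample skewness (0 for a perfectly symmetric sample)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

sums, sums_sq = simulate()

# The sum of k independent N(0, 1) values has mean 0 and variance k;
# a chi-square variable with k degrees of freedom has mean k and variance 2k.
print("sum:            mean", round(statistics.fmean(sums), 3),
      "skewness", round(skewness(sums), 3))
print("sum of squares: mean", round(statistics.fmean(sums_sq), 3),
      "skewness", round(skewness(sums_sq), 3))
```

Changing the seed plays the role of pressing F9: the individual numbers change, but the sum stays roughly symmetric about zero while the sum of squares stays positive and right-skewed with a mean near 10.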
Teaching Notes and Tips
The activity can be introduced in a single class period of at least 50 minutes.
An effective way to present the activity is to show the PowerPoint presentation (file chisquare.ppt) followed by the Excel simulation (file ChiSquareSimulation.xls). The instructor will then be ready to move on to a discussion of the use of the chi-square tables in interval estimation and hypothesis testing examples.
Assessment
Display histograms and ask students to identify the underlying distribution.
Ask students to match distribution graphs to types of hypothesis test.
Ask students to match each sample statistic (mean, total, proportion, variance, ...) to its associated distribution.