Introduction to Analysis of Variance (ANOVA)
This activity introduces graduate-level students to one-way (single-factor) analysis of variance (ANOVA). The students are familiar with MATLAB but have little to no experience with ANOVA. The activity is motivated by a short video on a classic historical ANOVA case (the 1970 military draft lottery). It includes a pre-class assignment that requires reviewing two MATLAB live scripts and completing an assignment in MATLAB Grader (formerly Cody Coursework), and it concludes with an in-class discussion and a practical exercise using a MATLAB live script.
At the conclusion of this activity, the student will:
- Understand the concept of single-factor (or one-way) Analysis of Variance (ANOVA).
- Know how to "manually" compute SSTr, MSTr, SSE, MSE, and SST using MATLAB.
- Know how to use SSTr, MSTr, SSE, MSE, and SST to manually construct an ANOVA table.
- Know how to use MATLAB to generate the ANOVA table.
- Understand the F probability distribution and know how to use an F-test to conduct hypothesis tests for single-factor ANOVA situations.
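For reference, the quantities named in the objectives above can be written as follows. This is a sketch for the equal-sample-size case, assuming I treatment groups with J observations each; the symbols (x_ij for observation j in group i, a bar with a dot subscript for a mean over that index) are notation assumed here, not taken from the activity files.

```latex
\begin{aligned}
\mathrm{SSTr} &= J \sum_{i=1}^{I} \left(\bar{x}_{i\cdot} - \bar{x}_{\cdot\cdot}\right)^2, &
\mathrm{MSTr} &= \frac{\mathrm{SSTr}}{I-1}, \\
\mathrm{SSE} &= \sum_{i=1}^{I}\sum_{j=1}^{J} \left(x_{ij} - \bar{x}_{i\cdot}\right)^2, &
\mathrm{MSE} &= \frac{\mathrm{SSE}}{I(J-1)}, \\
\mathrm{SST} &= \sum_{i=1}^{I}\sum_{j=1}^{J} \left(x_{ij} - \bar{x}_{\cdot\cdot}\right)^2
              = \mathrm{SSTr} + \mathrm{SSE}, &
F &= \frac{\mathrm{MSTr}}{\mathrm{MSE}}.
\end{aligned}
```

Under the null hypothesis of equal group means (with the usual normality and equal-variance assumptions), F follows an F distribution with I-1 and I(J-1) degrees of freedom, so the hypothesis is rejected at level alpha when F exceeds the corresponding upper-tail critical value.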
The student will gain the following MATLAB skills:
- Compute summary statistics organized by group: grpstats(X, group)
- Use the F-distribution to conduct analysis of variance:
- Use the F inverse cumulative distribution function to compute a critical value: finv(1-alpha,nu1,nu2)
- Use the F cumulative distribution function to find a p-value: fcdf(x,nu1,nu2,'upper')
- Conduct one-way analysis of variance: anova1(y, group)
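The skills listed above can be sketched together in one short script. This is a minimal illustration, not the activity's live script: the data are made up, the variable names are hypothetical, and the functions grpstats, finv, fcdf, and anova1 require the Statistics and Machine Learning Toolbox.

```matlab
% Hypothetical data: 3 groups, 4 observations each (illustrative only).
y     = [10 12 11 13 14 15 13 16 9 8 10 9]';
group = [ 1  1  1  1  2  2  2  2 3 3  3 3]';

I = numel(unique(group));   % number of groups
N = numel(y);               % total number of observations

% Summary statistics organized by group.
groupMeans  = grpstats(y, group);            % default statistic is the mean
groupCounts = grpstats(y, group, 'numel');   % observations per group
grandMean   = mean(y);

% "Manual" sums of squares and mean squares.
SSTr = sum(groupCounts .* (groupMeans - grandMean).^2);
SSE  = sum((y - groupMeans(group)).^2);      % labels 1..I double as indices here
SST  = sum((y - grandMean).^2);
MSTr = SSTr / (I - 1);
MSE  = SSE  / (N - I);
F    = MSTr / MSE;

% F-test: critical value and p-value.
alpha  = 0.05;
Fcrit  = finv(1 - alpha, I - 1, N - I);
pValue = fcdf(F, I - 1, N - I, 'upper');
rejectH0 = F > Fcrit;

% Built-in one-way ANOVA for comparison ('off' suppresses the figures).
[pBuiltin, tbl] = anova1(y, group, 'off');
disp(tbl)
```

The manually computed F and pValue should agree with the Groups row of the table returned by anova1, which gives students a quick check on their own calculations.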
Context for Use
This activity was developed for a graduate-level course in Statistics and Data Analysis. It pairs an hour-long pre-class assignment, similar to what a flipped classroom would require, with one hour of in-class discussion and demonstration. In a traditional classroom setting, the activity could be reproduced by devoting the first hour of lecture to the items listed in the pre-class assignment.
The students should have already completed standard instruction on descriptive and inferential statistics, as well as hypothesis testing. The MATLAB Grader exercise(s) and MATLAB live script(s) can be adapted for use with any appropriate data set.
Description and Teaching Materials
- Pre-Class Assignment
- Demonstrate why Analysis of Variance is important through a historical lens:
- Video: https://youtu.be/VJO-NI07yLs
- Article: Rosenbaum, David E. "Statisticians Charge Draft Lottery Was Not Random." The New York Times, 4 Jan. 1970, www.nytimes.com/1970/01/04/archives/statisticians-charge-draft-lottery-was-not-random.html.
- Complete assigned reading from text:
- Read: PS4ES9e, Chapter 10, Section 10.1, pages 409-420. Probability and Statistics for Engineering and the Sciences, 9th Edition, (2016) by Jay Devore. Published by Cengage Learning, Boston.
- Review Single Factor ANOVA Live Script (MATLAB Live Script 11kB Aug14 18)
- Review F distribution and Analysis of Variance (ANOVA) (MATLAB Live Script 56kB Aug14 18)
- Complete single-factor (one-way) ANOVA problem in MATLAB Grader:
- Use "Comment" tool in LMS to submit solution and/or questions for instructor review prior to class.
- Review student performance on MATLAB Grader; highlight interesting code and/or methods of solutions discovered by the students.
- Respond to submissions on discussion board; field questions from students.
Teaching Notes and Tips
The Single Factor ANOVA Live Script explores the Total Sum of Squares, Error Sum of Squares, and Treatment Sum of Squares from a graphical perspective, which introduces students to the components of the ANOVA table before demonstrating MATLAB's built-in ANOVA commands. The data were selected to best demonstrate the concepts; however, the file could be adapted to other data sets.
At this point in the course, my students had not been introduced to the F distribution, so the F distribution and Analysis of Variance (ANOVA) live script was necessary to bridge the gap. If students are already familiar with the F distribution, this portion of the pre-class assignment could be omitted.
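For instructors who want a quick in-class reminder of the F distribution, a sketch along the following lines may help. It is not taken from the live script: the degrees of freedom and significance level are assumed values chosen for illustration, and fpdf, finv, and fcdf require the Statistics and Machine Learning Toolbox.

```matlab
% Assumed illustration values (not from the activity files).
nu1   = 2;      % numerator degrees of freedom
nu2   = 9;      % denominator degrees of freedom
alpha = 0.05;   % significance level

% Density of the F distribution on a grid.
x       = linspace(0, 8, 400);
density = fpdf(x, nu1, nu2);

% Critical value and the upper-tail area beyond it.
Fcrit     = finv(1 - alpha, nu1, nu2);
upperTail = fcdf(Fcrit, nu1, nu2, 'upper');   % equals alpha up to rounding

% Plot the density and mark the critical value.
plot(x, density, 'LineWidth', 1.5);
hold on
plot([Fcrit Fcrit], [0 fpdf(Fcrit, nu1, nu2)], '--');
hold off
xlabel('f');
ylabel('density');
title(sprintf('F(%d, %d) density with critical value at alpha = %.2f', nu1, nu2, alpha));
```

Changing nu1 and nu2 shows how the shape of the density and the location of the critical value depend on the number of groups and the sample size.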
MATLAB Grader is not integrated with our learning management system. Accordingly, completion of the assignment in MATLAB Grader is optional. However, as you can see from the screenshot above, I have found that most students use MATLAB Grader (especially when learning new topics and skills in MATLAB) because it provides instant feedback.
To motivate the students to complete the pre-class assignment, the students are awarded 10 points once they have posted on the discussion board. Over the course of the quarter, these pre-class assignments account for 25% of their overall course grade.
Typically, I do not assign a grade to the in-class practical exercises. The learning objectives and MATLAB skills are assessed on a bi-weekly laboratory assignment and corresponding report.
References and Resources
Textbook: Probability and Statistics for Engineering and the Sciences, 9th Edition, (2016) by Jay Devore. Published by Cengage Learning, Boston. Problems in MATLAB Grader are taken directly from the text.
Article: Rosenbaum, David E. "Statisticians Charge Draft Lottery Was Not Random." The New York Times, 4 Jan. 1970, www.nytimes.com/1970/01/04/archives/statisticians-charge-draft-lottery-was-not-random.html.
Video: Werner, Mark. "1970 Draft Not Random." YouTube, 10 Jan. 2013, https://youtu.be/VJO-NI07yLs.