Investigating the Modernity of the University Library

This page authored by Pam Arroway, Jennifer Gratton, Steve Stanislav, and Roger Woodard, North Carolina State University.
This material was originally developed through CAUSE
as part of its collaboration with the SERC Pedagogic Service.

Initial Publication Date: May 17, 2007

Summary

This activity makes use of a campus-based resource to develop a "capstone" project for a survey sampling course. Students work in small groups and use a complex sampling design to estimate the number of new books in the university library given a budget for data collection. They will conduct a pilot study using some of their budget, receive feedback from the instructor, then complete data collection and write a final report.



Learning Goals

The project is designed to reinforce the following skills. Students should be able to:
  • construct a reasonable sampling design for the specified population, study objective and budget.
  • describe the selected sampling design using appropriate sampling terminology, e.g., strata, clusters, ratio estimation.
  • justify the choice of sampling design, e.g., using "optimal allocation" concepts and budget as guidelines for sample size, decisions to use subgroups as strata rather than clusters.
  • estimate the parameter of interest and give appropriate standard error.
  • state required assumptions, plausibility of these assumptions and impact of probable violations.

Context for Use

This activity is designed to be appropriate for a college introductory course in survey sampling for any class size. Most of the work for this project is done outside of the classroom. Students should form groups of 3-4 to work most efficiently. The project may be assigned in the last third of the semester, with the final report due at the end of the semester.

Before beginning the project, students need to be familiar with various sampling strategies (design features and appropriate estimators):
  • Stratification
  • Clustering
  • Ratio estimation (optional)
  • Complex survey analysis

Description and Teaching Materials

A handout for students describing the project assignment (Microsoft Word 33kB Aug10 06). Includes introduction to the project, suggestions for getting started and requirements for written reports.

Teaching Notes and Tips

Getting students started. This project can be introduced at any point in the semester, but is probably best after students have learned about stratification and clustering. Encourage students to wander around the library to get familiar with the population. They have likely been to the library before, but haven't looked at it through the lens of this project. It is very easy for students to come up with complex designs, often using stratification and clustering, though they may have difficulty labeling strata and clusters. Correctly identifying strata and clusters will be key to choosing appropriate estimators and to learning to communicate with standard statistical terminology.

Use of online card catalogs. Online catalogs do not yield a good estimate of the number of "new" books as easily as students may think. The target population may not match the online frame exactly. (The population is often defined as "easily accessible" items or items in a certain section of the library, and such items are not necessarily easy to identify from a database.) Students can still be encouraged to use online information, either as the auxiliary variable in a ratio estimator or as a guide to sample size allocation.
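
A ratio estimator that uses catalog information might look like the sketch below. It assumes a simple random sample of shelf sections (a hypothetical sampling unit), where students count the new books y in each sampled section and look up the catalog count x for the same section. The function name, the data, and the choice of sampling unit are illustrative assumptions, not part of the assignment. The standard error uses the usual linearization approximation for a ratio estimator under simple random sampling.

```python
import math

def ratio_total(y, x, x_total, n_units):
    """Ratio estimate of a population total and its approximate standard error.

    y       : new-book counts in the sampled sections (hypothetical data)
    x       : catalog counts for the same sampled sections
    x_total : catalog count summed over all sections in the population
    n_units : number of sections in the population (N)
    """
    n = len(y)
    r = sum(y) / sum(x)                      # estimated ratio of new books to catalog items
    resid_ss = sum((yi - r * xi) ** 2 for yi, xi in zip(y, x))
    s2_e = resid_ss / (n - 1)                # variance of residuals about the ratio line
    fpc = 1 - n / n_units                    # finite population correction
    se = n_units * math.sqrt(fpc * s2_e / n)
    return r * x_total, se

# Hypothetical pilot data: 5 sections sampled out of 200.
y = [3, 5, 2, 8, 4]
x = [40, 55, 30, 90, 45]
total_hat, se_hat = ratio_total(y, x, x_total=10000, n_units=200)
print(round(total_hat), round(se_hat))
```

The estimator pays off when y is roughly proportional to x, which is something students can check with a scatterplot of their pilot data.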

Optimal allocation rules. Students may be tempted to apply the optimal allocation rules learned in class exactly. Emphasize that these rules should guide (not dictate) resource allocation. For example, larger strata should generally have larger sample sizes, but strict proportional allocation is not required for a good design. It is often impossible to use optimal allocation rules exactly in practice because population parameters are unknown; Neyman allocation in a stratified design, for example, requires the within-stratum standard deviations. Students will also need to consider the budget restrictions in determining how to allocate resources (stratification vs. clustering, sample size decisions).
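
The sketch below shows how allocation rules can serve as a guide. It computes proportional allocation and Neyman-type allocation for a stratified design, with pilot-study standard deviations standing in for the unknown population values. The stratum sizes, standard deviations, and per-unit costs are made-up numbers, and the function names are illustrative. When costs are supplied, the weights follow the textbook optimal allocation rule, with each stratum's sample size proportional to N_h * S_h / sqrt(c_h).

```python
import math

def proportional_allocation(sizes, n):
    """Split a total sample size n across strata in proportion to stratum size."""
    total = sum(sizes)
    return [n * size / total for size in sizes]

def neyman_allocation(sizes, sds, n, costs=None):
    """Split n in proportion to N_h * S_h (Neyman), or N_h * S_h / sqrt(c_h) if costs are given."""
    if costs is None:
        costs = [1.0] * len(sizes)           # equal costs reduce to plain Neyman allocation
    weights = [N_h * s_h / math.sqrt(c_h)
               for N_h, s_h, c_h in zip(sizes, sds, costs)]
    total = sum(weights)
    return [n * w / total for w in weights]

# Hypothetical library with two strata: a large, homogeneous floor
# and a small, highly variable new-acquisitions area.
sizes = [4000, 1000]       # shelves per stratum (assumed)
pilot_sds = [2.0, 8.0]     # pilot estimates of within-stratum SD (assumed)

print(proportional_allocation(sizes, 60))
print(neyman_allocation(sizes, pilot_sds, 60))
```

The computed sizes are rarely whole numbers and depend on noisy pilot estimates, so students should round them and treat them as a starting point.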

Keep it simple. The budget is key to forcing students to use clustering effectively, but it is easy for them to develop an overly complex design. Remind them to consider appropriate estimation strategies as they develop the sampling design. Simplified strategies that depend on reasonable assumptions should be considered. For example, students might use systematic sampling and assume that the simple random sampling variance formulas are conservative, or assume that the variance due to third and higher stages in a cluster design is negligible relative to the first- and second-stage variance components.
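
The first simplification can be sketched as follows: draw a systematic sample with a random start, then analyze it with the simple random sampling (SRS) formulas for a total. The population size, sampling interval, and counts below are hypothetical, and the function names are illustrative. Treating the SRS variance as conservative is itself an assumption that students should state and defend in their reports.

```python
import math
import random
import statistics

def systematic_sample_indices(n_units, k, rng):
    """Pick a random start in 0..k-1, then take every k-th unit."""
    start = rng.randrange(k)
    return list(range(start, n_units, k))

def srs_total(y, n_units):
    """Estimate a population total and its standard error with SRS formulas."""
    n = len(y)
    mean = statistics.mean(y)
    s2 = statistics.variance(y)              # sample variance with n - 1 denominator
    fpc = 1 - n / n_units                    # finite population correction
    se = n_units * math.sqrt(fpc * s2 / n)
    return n_units * mean, se

# Hypothetical example: 500 shelves, visit every 50th shelf.
rng = random.Random(2007)
shelves = systematic_sample_indices(500, 50, rng)
new_book_counts = [1, 0, 3, 2, 0, 4, 1, 2, 0, 1]   # made-up counts, one per sampled shelf
total_hat, se_hat = srs_total(new_book_counts, 500)
print(shelves)
print(total_hat, round(se_hat, 1))
```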

Sources of bias. Discussion of selection bias, measurement error and nonresponse in this study may be incorporated as a whole-class discussion. Students often have trouble separating these concepts. An obvious problem is the potential bias due to checked-out books. Are checked-out books more likely to be "new" books? This question may be used for class discussion when the project is introduced.

Caveats regarding the report

  • Make sure students are estimating total number of books (not proportion). Total number of books is harder to estimate.
  • Terminology in the reports can be ambiguous (What is a shelf, stack, row, aisle, etc.?). Ask students to provide a diagram labeling their terminology.
  • Instructors may want to enforce a strict penalty for going over budget or using less than, say, 85% of the budget.
  • The pilot study requires a projected standard error calculation. The idea is that students should get some sense of what their final project standard error will be. Students have difficulty calculating this. An example in class may help clarify: Present data from an SRS of size 10. What kind of standard error would you expect if a sample of size 100 were taken?
  • Students may appreciate a summary (after projects are turned in) of estimates, standard errors, and use of strata and clusters. You may also give some recognition/points for lowest standard error.
  • Giving students a grading rubric before they turn in the report can vastly improve organization and quality of the reports, but can also be overly prescriptive. More mature students, e.g., graduate students, may not need a point-by-point outline for the structure of the report.
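
The projected standard error exercise mentioned in the caveats (an SRS of size 10, projected to size 100) can be sketched as below. The pilot counts and the population size are made-up values, and the function name is illustrative. The sketch assumes that the pilot standard deviation is a reasonable stand-in for the standard deviation in the full sample, which is the main assumption students should recognize.

```python
import math
import statistics

def projected_se_total(pilot, n_units, n_planned):
    """Project the standard error of an estimated total for a planned SRS size.

    Uses the pilot sample standard deviation in place of the unknown
    population value and applies the finite population correction.
    """
    s = statistics.stdev(pilot)
    fpc = 1 - n_planned / n_units
    return n_units * s * math.sqrt(fpc / n_planned)

# Hypothetical pilot: new-book counts on 10 shelves out of 2000.
pilot = [0, 2, 1, 5, 0, 3, 1, 0, 4, 2]
se_10 = projected_se_total(pilot, 2000, 10)
se_100 = projected_se_total(pilot, 2000, 100)
print(round(se_10), round(se_100), round(se_100 / se_10, 3))
```

Ignoring the finite population correction, a tenfold increase in sample size shrinks the standard error by a factor of sqrt(10/100), about 0.32, not by a factor of 10. This is the point students most often miss.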

Assessment

  • The Pilot Study Report is designed to give students feedback on writing and on the correctness of estimation procedures. (Data collection is not complete at this stage, so some groups may modify their entire design.) A rubric (Microsoft Word 41kB Aug10 06) for instructors to use in grading the pilot study report can also be distributed to students with the project assignment.
  • The Final Report is used as the final assessment. A rubric (Microsoft Word 43kB Aug10 06) for instructors to use in grading the final report can also be distributed to students with the project assignment.
  • Suggested division of points:
    • 30 for Pilot Study Report
    • 70 for Final Report
  • Follow-up questions (Microsoft Word 12kB May17 07) on a test can be used to assess how well students can identify strata and clusters in a given design. The familiar context makes it easier for students to quickly understand the population.
  • Other ideas for assessment
    • Competition / reward for lowest standard error
    • Have students give oral reports describing the design and their estimates
    • Have students critique other groups' oral reports
    • Whole class discussion of why some groups achieved lower standard errors than others

References and Resources