Tips for Using Real Data
- Consult a reference librarian when selecting a data source
- Narrow the scope
- Consider using teams
- Plan for extra time
If you don't already know where to get the data for an assignment, consult a reference librarian. They may point you to a specific dataset, such as the General Social Survey, whose website includes online tools that facilitate basic analysis. Alternatively, they may point you to a collection of datasets, like the Inter-university Consortium for Political and Social Research (ICPSR), or to a data compendium like the Statistical Abstract of the United States, which serves as an introduction and gateway to a whole host of other sources. Many libraries have recently hired reference librarians whose job includes dataset curation; tap into their expertise!
Engaging with real data raises a series of potential problems, any of which can take considerable time and derail an assignment. Make choices in advance so that students spend their time precisely where you want them to grow through productive struggle.
- Consider extracting a subset of the full dataset that includes only the relevant variables.
- If you want students to experience the process of data extraction (without it being a major learning goal of the exercise), consider giving them step-by-step instructions. This way they have the experience without the full struggle.
- Use existing secondary sources that summarize the raw data (or create such summaries yourself) if performing the summarization isn't one of your goals for the assignment.
- Simplify the problem. While theory may call for 10 independent variables, is there something you can do with a stripped-down version that focuses on only one? (You can always ask students to identify omitted variables and predict the bias that results from those omissions.)
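As one illustration of the first and third suggestions above, an instructor comfortable with a little scripting could prepare the trimmed file and a quick summary ahead of time. The sketch below is only an example under stated assumptions: the sample data, the column names (id, age, educ, income, region, marital), and the decision to drop rows with blank values are all hypothetical, and a small in-memory table stands in for a real downloaded file.

```python
import csv
import io
from statistics import mean

# Hypothetical raw extract standing in for a large file with many variables.
RAW_CSV = """id,age,educ,income,region,marital
1,34,16,52000,south,married
2,51,12,38000,west,single
3,29,18,,east,single
4,45,14,61000,south,married
"""

# Hypothetical choice of the variables relevant to the assignment.
KEEP = ["id", "educ", "income"]


def extract_subset(source, keep):
    """Return rows limited to the columns in `keep`.

    Rows with a blank value in any kept column are dropped. That is an
    assumption made for this sketch; you may prefer students see the gaps.
    """
    rows = []
    for row in csv.DictReader(source):
        trimmed = {name: row[name] for name in keep}
        if all(value != "" for value in trimmed.values()):
            rows.append(trimmed)
    return rows


def write_subset(rows, keep, target):
    """Write the trimmed rows as CSV, header first."""
    writer = csv.DictWriter(target, fieldnames=keep)
    writer.writeheader()
    writer.writerows(rows)


subset = extract_subset(io.StringIO(RAW_CSV), KEEP)

# A ready-made summary, for assignments where summarizing is not a goal.
mean_income = mean(float(row["income"]) for row in subset)

out = io.StringIO()
write_subset(subset, KEEP, out)
print(out.getvalue())
print(f"Rows kept: {len(subset)}, mean income: {mean_income:.0f}")
```

With a real file, the two io.StringIO objects would be replaced by files opened with open(path, newline=""). Handing students the trimmed file keeps their attention on the analysis rather than on wrangling dozens of unused variables.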
Assignments using real data call for a wide range of skills, from internet searching to spreadsheet manipulation to statistical analysis to presentation. Depending on the experience level of your students, it may be too much to expect one student to have all of them. By using a brief survey to learn about students' strengths, you may be able to create teams with complementary skill sets. Whether the entire assignment is completed as a team or you ask that the final product be completed individually, teamwork may reduce pressure on you to provide skill remediation.
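For larger classes, turning survey results into teams can be partly automated. The following is a minimal sketch of one possible approach, not a recommended method: it labels each student by their strongest self-rated skill, lines students up skill by skill, and deals them out to teams like cards so that students with the same strength tend to land on different teams. The student names, the four skill labels, and the 1-5 rating scale are all hypothetical.

```python
from collections import defaultdict

# Hypothetical survey results: self-rated skill on a 1-5 scale.
SURVEY = {
    "Ana": {"searching": 5, "spreadsheets": 2, "statistics": 1, "presentation": 3},
    "Ben": {"searching": 2, "spreadsheets": 5, "statistics": 3, "presentation": 1},
    "Cam": {"searching": 1, "spreadsheets": 3, "statistics": 5, "presentation": 2},
    "Dee": {"searching": 3, "spreadsheets": 1, "statistics": 2, "presentation": 5},
    "Eli": {"searching": 4, "spreadsheets": 3, "statistics": 2, "presentation": 1},
    "Fay": {"searching": 1, "spreadsheets": 4, "statistics": 2, "presentation": 3},
    "Gus": {"searching": 2, "spreadsheets": 1, "statistics": 4, "presentation": 3},
    "Hal": {"searching": 3, "spreadsheets": 2, "statistics": 1, "presentation": 4},
}


def strongest_skill(ratings):
    """Return the skill with the highest rating (ties go to the first listed)."""
    return max(ratings, key=ratings.get)


def form_teams(survey, n_teams):
    """Deal students to teams so that shared strengths are spread out."""
    by_skill = defaultdict(list)
    for student, ratings in survey.items():
        by_skill[strongest_skill(ratings)].append(student)

    # Students sharing a strongest skill sit next to each other in this
    # ordering, so round-robin dealing sends them to different teams.
    ordered = [s for skill in sorted(by_skill) for s in sorted(by_skill[skill])]

    teams = [[] for _ in range(n_teams)]
    for position, student in enumerate(ordered):
        teams[position % n_teams].append(student)
    return teams


teams = form_teams(SURVEY, n_teams=2)
for number, members in enumerate(teams, start=1):
    print(f"Team {number}: {', '.join(members)}")
```

A sketch like this only gives a starting point. Self-ratings are rough, and you will likely want to adjust the result by hand for scheduling conflicts or group dynamics.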
As noted above, working with real data can raise a host of challenges for students. A student can easily lose several hours to even one of these sticking points, so build in extra time for students to work through them. You should also expect a greater need for individualized help in office hours or from teaching assistants.