Tips for Using Real Data

Assignments that use real data can be very engaging and offer students opportunities to practice quantitative reasoning (QR) along many dimensions. But real data often increase the complexity of assignments. Accessing data can be time-consuming, and mistakes in the acquisition process can doom subsequent work. Missing or poorly measured data can compromise interpretations. Codebooks vary in how much detail they give on variable construction. The list goes on. Of course, all of these challenges are present in the real world, which is precisely why teaching with real data provides a uniquely powerful learning opportunity. A few tips can help instructors maximize these learning advantages while avoiding the greatest frustrations.

  1. Consult a reference librarian when selecting a data source
  2. Narrow the scope
  3. Consider using teams
  4. Plan for extra time

1. Consult a reference librarian when selecting a data source

If you don't already know where to get the data for an assignment, consult a reference librarian. They may point you to a specific dataset such as the General Social Survey, whose website includes online tools that facilitate basic analysis. Alternatively, they may point you to a collection of datasets like the Inter-university Consortium for Political and Social Research (ICPSR). Or they may point you to a data compendium like the Statistical Abstract of the United States, which serves as an introduction and gateway to a whole host of other sources. Many libraries have recently hired reference librarians whose job includes dataset curation; tap into their expertise!

2. Narrow the scope

Engaging with real data raises a series of potential problems, any of which can take considerable time and derail an assignment. Make choices in advance so that students spend their time precisely where you want them to be growing through productive struggle.

  • Consider extracting a subset of the full dataset that includes only the relevant variables.
  • If you want them to experience the process of data extraction (without it being a major learning goal of the exercise), consider giving students step-by-step instructions. This way they have the experience without the full struggle.
  • Use existing secondary sources that summarize the raw data--or create such summaries yourself--if performing the summarization isn't one of your goals for the assignment.
  • Simplify the problem. While theory may call for 10 independent variables, is there something you can do with a stripped-down version that focuses on only one? (You can always ask students to identify omitted variables and predict the resulting bias from those omissions.)
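
For instructors who prepare the extract themselves, the first suggestion above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column names (id, age, income, region, educ) and the three data rows are hypothetical stand-ins for a real survey file, and the helper names extract_columns and write_csv are invented for this example. Note that the blank income value is carried through unchanged, so students still see the missing data.

```python
import csv
import io


def extract_columns(source, keep):
    """Read CSV text from `source` and keep only the columns named in `keep`."""
    reader = csv.DictReader(source)
    fieldnames = reader.fieldnames or []
    missing = [name for name in keep if name not in fieldnames]
    if missing:
        raise KeyError(f"columns not found in source: {missing}")
    return [{name: row[name] for name in keep} for row in reader]


def write_csv(rows, keep, target):
    """Write the reduced rows, with a header line, to `target`."""
    writer = csv.DictWriter(target, fieldnames=keep)
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical miniature dataset standing in for a full survey file.
raw = io.StringIO(
    "id,age,income,region,educ\n"
    "1,34,52000,West,16\n"
    "2,51,,South,12\n"
    "3,27,38000,West,14\n"
)

# Keep one outcome and one independent variable, as in a stripped-down assignment.
subset = extract_columns(raw, ["income", "educ"])

out = io.StringIO()
write_csv(subset, ["income", "educ"], out)
```

With a real file, the io.StringIO objects would be replaced by files opened with open(path, newline=""), and the list of columns to keep would come from the codebook.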

3. Consider using teams

Assignments using real data call for a wide range of skills, from internet searching to spreadsheet manipulation to statistical analysis to presentation. Depending on the experience level of your students, it may be too much to expect one student to embody all of these. By using a brief survey to learn about students' strengths, you may be able to create teams with complementary skill sets. Whether the entire assignment is completed as a team or you ask that the final product be completed individually, teamwork may reduce pressure on you to provide skill remediation.
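
One simple way to turn survey responses into teams is sketched below. It is a rough heuristic rather than a recommended method: students are grouped by their self-reported strongest skill and then dealt out to teams in turn, so that students who share a top skill tend to land on different teams. The student names, skill labels, 1-5 rating scale, and the function name form_teams are all hypothetical.

```python
def form_teams(ratings, n_teams):
    """Deal students to teams so that shared top skills are spread out.

    `ratings` maps each student name to a dict of skill -> self-rating.
    """
    def top_skill(name):
        skills = ratings[name]
        # Sorting the skill names first makes ties resolve alphabetically.
        return max(sorted(skills), key=lambda skill: skills[skill])

    # Students with the same strongest skill end up adjacent in this ordering...
    ordered = sorted(ratings, key=lambda name: (top_skill(name), name))

    # ...so dealing them out round-robin places them on different teams.
    teams = [[] for _ in range(n_teams)]
    for i, name in enumerate(ordered):
        teams[i % n_teams].append(name)
    return teams


# Hypothetical survey results on a 1-5 scale.
ratings = {
    "Ana": {"search": 5, "spreadsheet": 2, "statistics": 1},
    "Ben": {"search": 4, "spreadsheet": 1, "statistics": 2},
    "Chi": {"search": 1, "spreadsheet": 5, "statistics": 2},
    "Dev": {"search": 2, "spreadsheet": 4, "statistics": 1},
    "Eli": {"search": 1, "spreadsheet": 2, "statistics": 5},
    "Fay": {"search": 2, "spreadsheet": 1, "statistics": 4},
}

teams = form_teams(ratings, 2)
```

In this small example each team receives one student whose strongest skill is searching, one for spreadsheets, and one for statistics. With real classes the balance will be rougher, and the result is best treated as a starting point to adjust by hand.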

4. Plan for extra time

As noted above, working with real data can raise a host of challenges for students. A student can easily lose several hours to even one of these struggles, so build in extra time for students to get through these sticking points. You should also expect a greater need for individualized help in office hours or from teaching assistants.

