Research Paper in Introductory Econometrics

This page authored by Steve DeLoach, Elon University.
This material is replicated on a number of sites as part of the SERC Pedagogic Service Project
Initial Publication Date: July 30, 2010


Through this independent research project, students experience the process of doing real economics research using appropriate econometric methods.


Learning Goals

Students will:

  1. Develop an understanding of how economists conduct applied research. This means more than simply learning the statistical methods. To use the methods appropriately, students must know the underlying theory as well as the existing literature on the issue.
  2. Develop important (marketable) computer skills. Several specialized software packages have emerged to handle large data sets and complex econometric techniques. The program used in this class is SAS, one of the most widely used statistical programming languages in the world. While some work can be done with minimal knowledge of SAS coding, students need to learn the basics of SAS syntax and logic to be efficient econometricians.
  3. Develop the ability to critically evaluate others' research.
  4. Develop written and oral communication skills.

Context for Use

Applied research papers in econometrics classes are common across the discipline. Some instructors supply datasets for students to use in replicating "famous" results, while others require students to collect their own data. I use replication as a first project in the course (midterm project) and require an independent research project as the capstone learning experience for the course.

Description and Teaching Materials

The research paper serves as a capstone to the course. Students find their own topic, research the literature, collect data and use appropriate econometric techniques to analyze the topic. To facilitate the process, students are required to submit a proposal for their paper. This happens immediately after fall break, at the mid-point of the semester. By that time, students have a good grasp of multiple regression, including basic modeling issues like log transformations, scaling, etc.

Teaching Notes and Tips

The key is scaffolding the 6-week process so that students end with an econometrically rigorous and (relatively) complete paper. Of course, there is no way these kinds of papers can meet the level of thoroughness that you would expect out of a semester- or year-long independent research project. The instructor has to make deliberate decisions on where students should and should not devote their scarce time.

One area where I compromise is the literature review. This is not a thesis and does not require a full-blown literature review. Having said that, students do need to have read at least a handful of papers on the topic (6-8 is a reasonable expectation). I should note that many of these papers are related to other term papers students have written or are writing in their upper-level electives. For example, a student writing a paper about the literature on the gender-wage gap for a labor class will already have an extensive knowledge of the literature. The implication is that there are spillovers from other classes that I make use of to make this project successful. What they will not have done in that class, however, is a full-blown, rigorous econometric study. Many of the papers I see in econometrics are like this.

The most challenging part is to get them to develop enough of a theory so that they can make the appropriate econometric decisions. For example, if they are looking at the price of beach-front housing on the coast, they need to understand and explain the appropriateness of the hedonic model and its assumptions in the context of this market in order to address the question of simultaneity. Otherwise, students are doing little more than an "applied regression" paper (a statistics project vs. an econometrics project).

Furthermore, since the papers are individualized, each topic and dataset will present its own unique set of econometric challenges. These include (1) multicollinearity, (2) incorrect functional form, (3) heteroscedasticity, (4) autocorrelation, (5) omitted variables, (6) measurement error and (7) simultaneity. Students are expected to address the relevant problems in a satisfactory way. The challenge is to get them to think about their data and theoretical problems early on so that it is not merely an exercise in data-mining.
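As one example of what such a diagnostic involves, a common first check for autocorrelation (item 4) is the Durbin-Watson statistic computed from the regression residuals. The sketch below is illustrative only: it is written in Python rather than SAS, the function name is hypothetical, and the residuals are made-up numbers rather than results from any student paper.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    of the residuals divided by the sum of squared residuals."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2
        for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Made-up residuals that alternate in sign (negative autocorrelation).
alternating = [1.0, -1.0, 1.0, -1.0]
# Made-up residuals that never change (strong positive autocorrelation).
persistent = [1.0, 1.0, 1.0, 1.0]

dw_alternating = durbin_watson(alternating)
dw_persistent = durbin_watson(persistent)
```

The statistic lies between 0 and 4. Values near 2 are consistent with no first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation.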

To help students think about their data and theory early, I have developed a series of short homework assignments that (1) keep them on task and (2) lead them to address the requisite issues covered in the course:

  • After their proposals are approved, I require a 2-page written summary where they discuss each independent variable theoretically. They explain from theory the effects it should have on the dependent variable and why. In addition, I ask them to pay close attention to two things: (1) whether the variable is endogenous and why (or, if it is exogenous, they must justify that); (2) whether the theoretical relationship is linear or non-linear. I also ask them to sketch the XY scatter plot from a theoretical point of view (remember: they have not collected the data yet).
  • Following that, their data are due. They are required to come to class with the data imported into SAS. I check their data one by one and we discuss issues of dummy variables, transformations, etc. The data are generally due 10-14 days after the proposal. This leaves 4 weeks in the semester for them to complete the econometric work and write the paper.
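The kind of data preparation discussed at the data check can be sketched as follows. This is a minimal illustration in Python rather than SAS (the tool actually used in the course). The records, the field names (`wage`, `gender`, `educ`) and the helper `prepare` are all hypothetical.

```python
import math

# Hypothetical raw records; names and values are made up for illustration.
raw = [
    {"wage": 12.50, "gender": "F", "educ": 12},
    {"wage": 20.00, "gender": "M", "educ": 16},
    {"wage": 15.75, "gender": "F", "educ": 14},
]


def prepare(records):
    """Return new records with a 0/1 dummy variable and a log-transformed wage."""
    prepared = []
    for rec in records:
        if rec["wage"] <= 0:
            # The natural log is undefined for zero or negative values.
            raise ValueError("log transformation needs a strictly positive wage")
        prepared.append({
            "log_wage": math.log(rec["wage"]),
            # One dummy for a two-category variable; "M" is the omitted base group.
            "female": 1 if rec["gender"] == "F" else 0,
            "educ": rec["educ"],
        })
    return prepared


data = prepare(raw)
```

Coding a single dummy for a two-category variable keeps one group as the base category, which avoids perfect collinearity with the intercept.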

The final challenge has to do with the timing of content. As they are doing their papers over the last 4 weeks, we are covering topics such as limited dependent variables and panel data. I end the delivery of new content before Thanksgiving, leaving 2 weeks of class time for them to work on their projects in class and have an in-class final exam (I use the final exam period for presentations). This timing means that students doing topics using, say, logistic regression, do not have that knowledge until only 2 weeks remain in the class. Thus, I have all students begin with a benchmark OLS regression model. A lot of diagnostics can be done at this stage, even if OLS is inefficient due to the non-linearities. For example, multicollinearity can be dealt with in OLS.
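To illustrate the kind of multicollinearity check that can already be run at the benchmark stage, the sketch below computes a variance inflation factor (VIF). It is written in Python rather than SAS and covers only the special case of a model with exactly two regressors, where both share the same VIF, 1 / (1 - r^2), with r the correlation between them. The function names and the data are hypothetical.

```python
import math


def correlation(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_regressors(x1, x2):
    """VIF shared by both regressors when the model has exactly two regressors."""
    r = correlation(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Made-up regressors: x2_unrelated is uncorrelated with x1,
# while x2_collinear moves almost one-for-one with it.
x1 = [1.0, 2.0, 3.0, 4.0]
x2_unrelated = [1.0, -1.0, -1.0, 1.0]
x2_collinear = [2.0, 4.0, 6.0, 9.0]

vif_low = vif_two_regressors(x1, x2_unrelated)
vif_high = vif_two_regressors(x1, x2_collinear)
```

A VIF of 1 means the regressors are uncorrelated. A common rule of thumb treats values above about 10 as a sign of serious multicollinearity, though that threshold is a convention rather than a formal test. With more than two regressors, each VIF instead comes from an auxiliary regression of one regressor on all the others.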


Assessment

Since this is an econometrics class, assessment of the papers is weighted toward the course objectives. As with all papers, of course, I do expect them to be well-written and complete. But those criteria are treated more as a "minus" if they are not up to par rather than something that will make the difference between, say, an A and a B. Thus, grades are determined by:

  1. whether the economic theory and the specification of the benchmark econometric model are consistent;
  2. the extent to which the student has correctly diagnosed the relevant econometric problems;
  3. the extent to which the student has dealt with the econometric problems in an appropriate and convincing way;
  4. the extent to which the paper is well-written and complete (e.g., is there a reasonable introduction with a clearly defined thesis? has the student done a reasonable amount of literature review for a semester project? has the student written a reflective discussion of the results?)

References and Resources

Damodar Gujarati. Essentials of Econometrics. 4th Edition, New York: McGraw Hill.

Steven A. Greenlaw. Doing Economics. Houghton Mifflin.