
NNN Blog

Standard Errors or Typical Errors?

Posted: Oct 31 2014 by Nathan Grawe

Nate Silver has made election prediction sexy. This election cycle I've seen many estimates of the probability for a Republican takeover of the Senate. And when I say "many estimates" I don't mean different sources; I mean vastly different probabilities. Today I see the New York Times gives Republicans a 70% chance while the Washington Post puts the figure at 95%. (Silver sets the probability at 68.9%. We can take up the topic of over-articulated precision in another blog!) In this context, those numbers mean very different things.

So, what's the source of variation in these estimates? Earlier in this election cycle one explanation was that people were estimating different probabilities. You could find estimates for "the probability that Republicans would take control if the election were held today" and for "the probability that Republicans will take control on election night." In the middle of summer, these are two vastly different concepts because the latter allows for the wide range of events that might shift elections over the course of three or four months.

But surely that can't be the explanation for the divergence in today's Times and Post estimates. Even if they are asking different questions regarding timing, we are fewer than four days from election day and many ballots have already been cast. It seems clear that the differences here are due to model specification. When students learn statistics, we teach them how to construct standard errors to account for random sampling error. There's nothing wrong with that, but as these election forecasts make clear, the far more typical specification error often swamps sampling error.
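A small simulation can make the contrast between the two kinds of error concrete. The sketch below is not any forecaster's actual model; the data-generating process, the coefficients, and the helper name `ols_slope_and_se` are all assumptions chosen for illustration. It regresses an outcome on one variable while leaving out a second, correlated variable, then compares the resulting bias with the reported standard error.

```python
import random


def ols_slope_and_se(x, y):
    """Regress y on x (with an intercept); return (slope, standard error of slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    se = (rss / (n - 2) / sxx) ** 0.5
    return slope, se


# Hypothetical data-generating process (all numbers are illustrative assumptions).
TRUE_EFFECT = 1.0
rng = random.Random(1)
x, y = [], []
for _ in range(2000):
    z = rng.gauss(0, 1)                 # variable the analyst leaves out
    xi = 0.8 * z + rng.gauss(0, 1)      # included variable, correlated with z
    yi = TRUE_EFFECT * xi + 2.0 * z + rng.gauss(0, 1)
    x.append(xi)
    y.append(yi)

slope, se = ols_slope_and_se(x, y)
bias = slope - TRUE_EFFECT
print(f"estimated slope: {slope:.3f}")
print(f"reported standard error (sampling error only): {se:.3f}")
print(f"bias from the omitted variable: {bias:.3f}")
```

Under these assumptions the standard error shrinks as the sample grows, but the bias does not, which is the sense in which specification error can swamp sampling error.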

Fortunately, the idea of omitted variables bias or other specification error can be intuitively understood by undergraduates regardless of their mathematical prowess. Happy Election Weekend!

Compared to What: Infectious Disease Edition

Posted: Oct 10 2014 by Nathan Grawe

The Washington Post has a great online infographic comparing attributes of the spread of Ebola to those of more common diseases such as Chicken Pox or Influenza. The site really drives home how important it is to provide context when presenting data to people who are not intimately familiar with the topic. (While I have some understanding of Influenza transmission from personal experience, that experience doesn't translate well into the statistics on transmission, for example.) I could imagine giving students the Ebola data only and asking them to draw some conclusions. Then I could give them the comparison data and ask how that added information alters their understanding of what is happening and how we might want to respond.

Spurious Correlation

Posted: Oct 6 2014 by Nathan Grawe

Tyler Vigen has some great examples of correlations that are surely not evidence of causation. One downside of the list, however, is that the examples are all time series. The underlying problem is that the two time series considered are not stationary; they are both trending, which explains the high correlation. (The one exception is the example of the numbers of Nicolas Cage movies and people drowning after falling into swimming pools. I may be wrong, but to my eye those two series look stationary.)
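The role of trends can be illustrated with a short simulation. This is a minimal sketch on made-up data, not Vigen's series: the slopes, noise level, and helper names (`trending_series`, `first_differences`, `pearson`) are assumptions for illustration. Two series are generated independently, each with its own upward trend, and their correlation is computed twice: once in levels and once after first-differencing, which is one common way to remove a trend.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def trending_series(n, slope, noise_sd, rng):
    """Linear trend plus independent Gaussian noise (hypothetical data)."""
    return [slope * t + rng.gauss(0, noise_sd) for t in range(n)]


def first_differences(x):
    """Period-to-period changes, which strip out a linear trend."""
    return [later - earlier for earlier, later in zip(x, x[1:])]


rng = random.Random(0)
a = trending_series(200, slope=1.0, noise_sd=2.0, rng=rng)
b = trending_series(200, slope=0.5, noise_sd=2.0, rng=rng)  # unrelated to a

r_levels = pearson(a, b)
r_diffs = pearson(first_differences(a), first_differences(b))
print(f"correlation in levels:      {r_levels:.3f}")
print(f"correlation in differences: {r_diffs:.3f}")
```

The two series share nothing but the passage of time, yet the correlation in levels is close to 1; once the trend is removed, the correlation falls to roughly zero.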

One other interesting fact I learned from the site: The number of people who die by becoming tangled in bedsheets has more than doubled in the last decade to almost 800. What can explain this steadily growing national epidemic?!

All Depends on How You Count

Posted: Oct 3 2014 by Nathan Grawe

The most recent jobs report is great fodder for student discussions of basic QR-in-measurement issues. This story from Yahoo provides a succinct summary. The table in the middle of the article motivates discussions about:

  • How do we define unemployment? Do we want to adjust that measure for underemployed?
  • Is employment always employment? Does it matter what kind of job people hold?
  • How does the current labor market compare with that at other times (in particular, the time before the financial crisis)?

Ultimately, the table could motivate a good, all-around discussion of the importance of seeking multiple measures if you want a complete understanding of a complex issue like economic recovery.

Models of Ebola

Posted: Oct 2 2014 by Nathan Grawe

If you are looking for an interesting application of differential equations (and who isn't!), the Ebola outbreak and its recent arrival in the US offer an opportunity. This short paper by Astocio et al. provides a great starting point with data calibrated to earlier Ebola outbreaks. Interesting policy questions: If we want to combat the spread of Ebola, how critical are quarantines of the potentially infected? Does it make sense to try to stop all travel from infected states/regions to your locale? (Hint: How many failures of such a policy are required before the entire effort is wasted?)

While many students will not have had differential equations, the intuition of the math should be accessible to most undergraduates.
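For readers who want to experiment, here is a minimal sketch of a compartmental (SEIR) epidemic model stepped forward with Euler's method. It is a generic textbook-style model, not necessarily the one in the paper, and every number in it (the rates `beta`, `sigma`, `gamma`, the population of 1,000, and the halving of the contact rate as a stand-in for quarantine) is a hypothetical assumption, not a calibrated Ebola parameter.

```python
def simulate_seir(beta, sigma, gamma, s0, e0, i0, r0, days, dt=0.1):
    """Euler integration of the SEIR equations:
        dS/dt = -beta*S*I/N
        dE/dt =  beta*S*I/N - sigma*E
        dI/dt =  sigma*E - gamma*I
        dR/dt =  gamma*I
    Returns the final (S, E, I, R)."""
    s, e, i, r = float(s0), float(e0), float(i0), float(r0)
    n = s + e + i + r
    for _ in range(round(days / dt)):
        new_exposed = beta * s * i / n * dt    # susceptible -> exposed
        new_infectious = sigma * e * dt        # exposed -> infectious
        new_removed = gamma * i * dt           # infectious -> removed
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_removed
        r += new_removed
    return s, e, i, r


# Hypothetical parameters for illustration only (not calibrated to Ebola).
common = dict(sigma=1 / 10, gamma=1 / 7, s0=999, e0=0, i0=1, r0=0, days=365)
baseline = simulate_seir(beta=0.30, **common)
reduced_contact = simulate_seir(beta=0.15, **common)  # crude stand-in for quarantine

print(f"ever infected after a year, baseline:        {baseline[3]:.0f}")
print(f"ever infected after a year, reduced contact: {reduced_contact[3]:.0f}")
```

Students without differential equations can still follow the loop: each step moves a small number of people from one compartment to the next, and changing `beta` shows how sensitive the outbreak is to the contact rate.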

