A couple weeks ago I argued for “Toad’s Corollary” as a general guideline for empirical analyses of social interventions:
If long-term impacts of a social intervention are larger than short-term impacts, in either the negative or positive direction, you’re probably/might be/could be missing something.
It occurred to me that there’s a familiar-among-economists version of this principle that might be worthwhile thinking about. This is the so-called “Ashenfelter’s Dip,” named for Orley Ashenfelter and discussed more systematically by James Heckman in a 2000 article.
The idea is that people who end up in government programs often end up there because something bad happened to them. If they are in a job training or employment program in particular, it could be because they recently lost their job, or because a downturn in their region or sector of the economy caused a training program to be opened up. Consequently, Ashenfelter observed that the earnings of people in programs showed a “dip” for about a year before they entered a program. Something like this:
Why is this important? Well, because a lot of times, economists and other social scientists try to determine the effects of a program using a “difference-in-differences” or matching design.
For example, let’s say we’re trying to estimate whether a new program for unemployed people in California works to get them back on their feet and earning money again. Everyone in California who is unemployed gets the program, so we can’t randomly assign it. Instead, we’ll compare the California unemployed people to unemployed people in Nevada. Since Californians and Nevadans don’t have the same income, we look only at those Californians who can be matched to Nevadans with very similar earnings over the 12 months before the evaluation starts.
The problem here is that the Californians are different from the Nevadans: they’ll bounce back to a higher level regardless of what the program does. Moreover, the two groups will keep diverging the further away in time from the program we get. We could call that a long-term result of the program, but it is much more likely that the Californians are just recovering from the initial shock that caused the “Ashenfelter’s Dip” in the first place.
To make this a little more quantitative, we can imagine that both groups have some measure of human capital, for example SAT scores. Let’s say the Californians are in group 1 (and will be getting the program) and the Nevadans are in group 0 (they won’t be getting the program):
Before whatever bad thing happened to the Californians (group 1), they are making about $10,000 more on average than the Nevadans (group 0), with their earnings somewhat random but somewhat based on their human capital.
The two groups are matched on their earnings in Year 2, and the program occurs in Year 3. The program has zero impact, but everyone’s earnings regress toward an individual mean determined by their human capital. Within each group, earnings wander around somewhat randomly over time, but the Californians’ earnings gradually rise back toward their level before the bad thing happened. The negative shock to the Californians’ earnings (blue below) wears off over time, while the Nevadans’ earnings (red below) stay at their lower, stable level.
Or, just looking at the differences between the two groups:
| Year | Difference in Earnings (California minus Nevada) |
Our “estimated treatment effect” (which is really just the negative shock wearing off over time) keeps getting larger. Naively, we might think that the program has larger long-term effects than short-term effects, but really this is a sign of the pre-existing differences between the two groups re-emerging over time.
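For anyone who wants to poke at this, the thought experiment above can be sketched as a small simulation. This is only a rough illustration, not the calculation behind the numbers in this post: the earnings levels, the noise, the size of the Year 2 shock, the 0.6 decay rate, and the function name `simulate_matched_gap` are all assumptions made up for the sketch. The only things taken from the setup are the roughly $10,000 pre-shock gap, matching on Year 2 earnings, and a program effect of exactly zero.

```python
import numpy as np

def simulate_matched_gap(n=2000, years=7, seed=0):
    """Mean earnings gap (California minus matched Nevada) by year.

    All parameter values are illustrative assumptions, not estimates.
    The program effect is zero by construction.
    """
    rng = np.random.default_rng(seed)

    # Human capital drives each person's long-run ("permanent") earnings.
    # Row 0 = Nevada (no program), row 1 = California (program).
    human_capital = rng.normal(0.0, 1.0, size=(2, n))
    base = np.array([30_000.0, 40_000.0])  # Californians earn ~$10,000 more
    permanent = base[:, None] + 5_000.0 * human_capital

    # Year-to-year noise around the permanent level.
    noise = rng.normal(0.0, 3_000.0, size=(2, n, years))
    earnings = permanent[:, :, None] + noise

    # Negative shock hits Californians in Year 2, then fades geometrically.
    t = np.arange(1, years + 1)
    shock = np.where(t >= 2, -10_000.0 * 0.6 ** (t - 2), 0.0)
    earnings[1] += shock

    # Match each Californian to the Nevadan with the closest Year 2 earnings
    # (nearest neighbor, with replacement).
    ca_year2 = earnings[1, :, 1]
    nv_year2 = earnings[0, :, 1]
    match = np.abs(ca_year2[:, None] - nv_year2[None, :]).argmin(axis=1)

    gap = (earnings[1] - earnings[0, match]).mean(axis=0)
    return {int(year): float(g) for year, g in zip(t, gap)}

gaps = simulate_matched_gap()
for year, gap in gaps.items():
    print(f"Year {year}: {gap:,.0f}")
```

Under these assumptions the matched gap is close to zero in Year 2, where the matching happens, and then widens every year after the "program" as the shock fades, even though the program does nothing at all.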
An obvious extension of this is to education programs: just because you can match on baseline test scores at the time the kids enter the program doesn’t mean that they are “really” the same kind of kids. This is particularly true since many cognitive characteristics won’t be stable and reliably observable until later in adolescence or even adulthood.
But what if intelligence in general is a kind of Ashenfelter’s Dip? That is, one of the ways humans are different from other animals is how very incompetent we are as children, and how much care and support we require over time. Our neoteny as a species, our carrying of fetal and infant characteristics long into development, appears to be intrinsically related to our capacity for general problem solving and creativity. This immediately suggests a problem with “matching” one group of humans to another at an early stage of development to determine the effects of later programs or experiences. Just because one group appears more precocious on one measure or another does not mean that they will end up more cognitively competent at a later date. Instead, they might simply be less neotenous and have less ground to make up later on.