Fade In, Fade Out

 Day in, day out, same old hoodoo follows me about. The same old pounding in my heart, whenever I think of you, and, darling, I think of you, day in and day out.

-Johnny Mercer

A foggy day, in London town
Had me low, had me down
I viewed the morning, with much alarm
The British Museum, had lost its charm

-Ira Gershwin

What happens to the effects of educational and early childhood interventions over time? Do they persist day in and day out, the same old impacts following you about?  Or do you find, one foggy day, that the British Museum has lost its charm?

For cognitive effects (i.e., effects on test scores) of early childhood programs, the consensus view is that impacts fade over time: rapidly at first and then gradually. Here, for example, is the average effect size over time from a meta-analysis of 49 impact studies of pre-K programs:

[Figure: fade-out curve showing average effect size by years since the end of treatment, across the 49 pre-K impact studies]
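To make the shape of that curve concrete, here is a minimal sketch of fitting a "fast at first, then gradual" decay to follow-up effect sizes. The numbers are hypothetical placeholders, not the meta-analysis estimates, and the exponential-decay-toward-a-floor form is just one convenient way to describe the pattern.

```python
# Minimal sketch: fit a fade-out curve to effect sizes measured at increasing
# years since the end of treatment. The data points are hypothetical, chosen
# only to mimic the shape described in the text.
import numpy as np
from scipy.optimize import curve_fit

years = np.array([0, 1, 2, 3, 5, 10])                     # years since end of treatment
effects = np.array([0.35, 0.20, 0.13, 0.10, 0.08, 0.07])  # effect sizes in SD units

def fade(t, initial, floor, rate):
    """Exponential decay from an end-of-treatment effect toward a long-run floor."""
    return floor + (initial - floor) * np.exp(-rate * t)

(initial, floor, rate), _ = curve_fit(fade, years, effects, p0=[0.3, 0.05, 1.0])
print(f"end-of-treatment effect: {initial:.2f} SD")
print(f"long-run floor:          {floor:.2f} SD")
print(f"decay rate per year:     {rate:.2f}")
```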

This meta-analysis included some of the earliest studies of pre-K programs, which tended to find larger effect sizes, perhaps because the counterfactual (where the low-income children who were not offered a slot in the program ended up) was worse, and perhaps because earlier analyses had a harder time dealing with the inevitable slings and arrows of outrageous fortune that beset any randomized controlled trial. More recent evaluations find much smaller effects at the end of treatment, even before fade-out. Here, for example, are the effect sizes at the end of treatment across evaluations conducted in different years:

[Figure: end-of-treatment effect sizes across pre-K evaluations, by year of evaluation]

Fade-out persists in recent evaluations with smaller impacts as well, however. Here, for example, are the test scores over time from the Head Start Impact Study:

[Figure: test scores over time for Head Start and control children in the Head Start Impact Study]

The small but measurable difference between the test scores of Head Start children and those of the control group in the spring of the Head Start year (Spring 2003) has faded to zero by the following year. The fade-out of cognitive effects extends to studies in developing countries as well, where educational interventions often show larger impacts than in the United States, but where a similarly large share of the impact is lost after a single year. In a study of private schools in Pakistan, a group of researchers finds that:

Low persistence may in fact be the norm rather than the exception, and a central feature of learning. Low persistence has broad implications. Commonly used empirical methods often make strong assumptions about persistence that can affect both short-run impact estimates and long-run extrapolations. Moreover, to the degree that low persistence is not an artifact of psychometric issues, such as measurement error, changing test content, or cheating, its extent and interpretation also matters. If low persistence arises from economic responses, such as parents and teachers devoting fewer resources to better performers, such substitution may lead to total costs falling enough that welfare actually rises, even when achievement gains fade out. However, if low persistence arises from biological factors, such as the inherent fragility of human memory, welfare likely suffers due to an unavoidable inefficiency in learning.
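To see why the persistence assumption matters, here is a toy sketch (my own illustrative numbers, not the paper's model or estimates) of projecting a one-time test-score gain forward under different persistence coefficients: a naive assumption of full persistence versus the low persistence the fade-out literature tends to find.

```python
# Toy sketch: project a one-time test-score gain forward under a persistence
# coefficient beta, where next year's retained gain = beta * this year's gain.
# All numbers are hypothetical and only illustrate the extrapolation logic.

def project_gain(initial_gain, beta, years):
    """Return the retained gain (in SD units) for year 0 through `years`."""
    path = [initial_gain]
    for _ in range(years):
        path.append(path[-1] * beta)
    return path

initial_gain = 0.30  # hypothetical end-of-treatment effect, in SD units

assumed_full = project_gain(initial_gain, beta=1.0, years=5)   # common naive assumption
estimated_low = project_gain(initial_gain, beta=0.3, years=5)  # closer to typical low-persistence estimates

for year, (full, low) in enumerate(zip(assumed_full, estimated_low)):
    print(f"year {year}: full persistence {full:.2f} SD | low persistence {low:.2f} SD")
```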

Imagine two models of the world:
* Human capital shocks of various kinds have short-term effects that then fade more or less monotonically. If large differences in outcomes suddenly emerge later in life, the fault lies with broken randomization, differential attrition, or other failures in study execution and design. This is true even if the initial samples were balanced on observable baseline characteristics: the later effects reflect some latent quality, not the intervention itself.

* Human capital shocks produce latent "hidden differences" that only emerge over time. In this model, an initial nudge (a coaching program for teachers, having a "great" kindergarten teacher, or even a 60-minute CBT-based pep talk at the beginning of college) starts someone off on a different path that leads to higher adult outcomes, even if the initial effects do not appear so large. A toy numeric contrast between the two patterns is sketched below.
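Here is that toy contrast, with entirely made-up treatment-control gaps: under the first model the gap shrinks more or less monotonically, while under the second a latent effect re-emerges in later outcomes.

```python
# Toy contrast between the two models, using made-up treatment-control gaps
# (in SD units) at each year of follow-up.
fade_out_model = [round(0.30 * 0.4 ** t, 2) for t in range(6)]   # model 1: monotone fade
hidden_differences_model = [0.30, 0.10, 0.05, 0.05, 0.15, 0.40]  # model 2: effect re-emerges later

for year, (fade, hidden) in enumerate(zip(fade_out_model, hidden_differences_model)):
    print(f"year {year}: fade-out {fade:.2f} SD | hidden differences {hidden:.2f} SD")
```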

While I can imagine situations in which the second model is correct, the first is, to me, the much more plausible story, and more in line with the available data. What would make me rethink this belief is a series of well-conducted RCTs in which the long-term effects are larger than the short-term effects. But every time I see an RCT that claims to show this… something is obviously wrong.

 
