The Coin Flip

The National Impact Study of Head Start in the early 2000s found slight, positive effects on test scores that faded out by 1st grade.

And in fact most studies of pre-K programs have found the same thing: a consistent, more-or-less exponential decay in the effect sizes of interventions. Here, for example, is the average effect size over time from a meta-analysis of 49 impact studies of pre-K programs:

In 2016, in a post about fade-out in pre-K programs, I proposed a general rule of thumb:

Human capital shocks of various kinds have short-term effects that then fade more or less monotonically. If large differences in outcomes suddenly emerge later in life, it is the fault of broken randomization, differential attrition, or other failures in study execution and design. This is true even if the initial samples were balanced on observable baseline characteristics– the later effects were the result of some latent quality, not the intervention itself.

More pithily, this is Toad’s Corollary:

If long-term impacts of a social intervention are larger than short-term impacts, in either the negative or positive direction, you’re probably missing something.

By which I meant:

We should expect fade-out. People are changing all the time, and kids are really changing all the time, mostly in biologically determined ways and mostly in ways that have little to do with which school they attend. If a social intervention seems to do nothing for the first couple of years and then totally screw up the kids several years down the line (when they haven’t had any interaction with the program in years), maybe the kids were different to begin with in ways we didn’t notice.

We can summarize this as the well-established finding that the heritability of most traits increases over the course of childhood: as children age, they become themselves; they are not just buffeted all over the place, ending up with attributes determined by a pure random number generator. Our lives may be in large part the results of coin flips, but many of the most important ones occurred before we were born.

See Figure 1, “Genetic and Environmental Influences on Cognition Across Development and Context”

In the last few years, the hype and hoopla over pre-K programs has mostly settled down, but a well-publicized study of long-term follow-ups from the statewide Tennessee voluntary pre-K program gives a chance to apply these principles and see how they fare. This study appeared to show that the pre-K program was mildly beneficial for the first year or two (as measured by various kinds of kindergarten readiness) and then steadily and inexorably became harmful in the long term, first in behavioral outcomes, and then in academic test scores in math, science, and reading, in major disciplinary infractions, and in special ed placement. Seven years later, the kids who were in the pre-school program were about a third more likely to be placed into special ed, have major disciplinary issues, and so on. The seemingly minor but positive short-term effects of the program appeared to give way to disastrous long-term effects.

It’s certainly possible that the average pre-K program introduced by a state now is worse than nothing; in fact, the secular trend seems to be towards negative impacts. But it’s important to remember just how many things have zero impact in the long term. For example, Dutch mothers and babies were literally starved during the Hunger Winter of 1944-1945, and this had zero effect on the babies’ IQs when they were measured as adults:

“Prenatal nutrition seems not related to mental performance at age 19”

So I think it helps to think through where these kinds of measured effects in the Tennessee study and similar ones come from, and what is actually happening to produce these estimates.

A mother of a three-year-old, looking for childcare, calls up her local pre-school office and asks if she can enroll. The program is oversubscribed, the person on the other end replies, but they are holding a lottery, and if she agrees to participate, her child could get in, either this year or the next. If she is selected by the lottery, her child can enroll this year; if she is not selected, she will have to wait another year until her child is four. She and her child will also be followed, taking surveys and sharing test scores and other information with some government researchers, over a period of several (or many) years.

She agrees, they check her eligibility, and an hour or a day later, they call her back. She got in or she didn’t; she can enroll her child or she has to wait a year; she’ll be in the treatment group or the control group of the study. They’ll follow up with her and her child and her child’s teacher, over and over, and on average, the difference between the treatment and control groups is the impact, the net effect, of a year in pre-school.

But even this example is not as clear-cut as it might appear. She might have decided that the study sounded too invasive or realized she didn’t want her child to be in pre-school after all; she could have declined the study or decided even after being a “winner” in the lottery that her child shouldn’t enroll and should stay at home for another year or go to a private or church pre-school. She could have died, or moved away. She could have such good connections with the local pre-school program director that they let her kid in even after he was a “loser” in the lottery. She could choose not to pick up the phone the following year when the survey firm the researchers hire calls again, and again, and again. Her child could be absent when the follow-up reading test is administered.
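To see why these departures matter, here is a toy simulation, a sketch rather than a model of any real study. The program has no effect at all by construction, the lottery is a fair coin flip, and every probability in it is a hypothetical number chosen only for illustration. The one thing that varies is whether follow-up depends on an unobserved trait of the child.

```python
import random
import statistics


def estimated_effect(n=50_000, selective_dropout=False, seed=0):
    """Difference in mean follow-up scores, treatment minus control.

    The true effect of the program is zero by construction.
    All probabilities below are hypothetical, chosen only for illustration.
    """
    rng = random.Random(seed)
    treated_scores, control_scores = [], []
    for _ in range(n):
        latent = rng.gauss(0, 1)          # unobserved trait, balanced by the lottery
        treated = rng.random() < 0.5      # the coin flip
        score = latent + rng.gauss(0, 1)  # later test score: no program effect at all

        # Follow-up: by default every child is tested.
        p_followed = 1.0
        if selective_dropout:
            if treated:
                p_followed = 0.3
            else:
                # Hypothetical rule: control children with a low latent
                # trait are harder to keep in the follow-up sample.
                p_followed = 0.5 if latent > 0 else 0.2
        if rng.random() < p_followed:
            (treated_scores if treated else control_scores).append(score)

    return statistics.mean(treated_scores) - statistics.mean(control_scores)


full = estimated_effect(selective_dropout=False)
attrited = estimated_effect(selective_dropout=True)
print(f"complete follow-up:  {full:+.2f}")
print(f"selective drop-out:  {attrited:+.2f}")
```

With complete follow-up the estimate should land near zero, as it ought to. With selective drop-out the same data should show a clearly negative "effect" of a program that does nothing, purely because the children who remain in the two groups are no longer comparable. Drop-out and crossover of the kinds listed above are enough to move the estimate.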

This is, in fact, what happened in the Tennessee study. In the supplementary materials, the authors provide this cohort diagram (which, they note, already excludes an additional 141 children who were dropped after randomization because they never enrolled in a Tennessee public school, an exclusion that could easily introduce some selection bias). 3,131 kids were randomized into either the treatment group (invited to enroll in pre-school) or the control group (not invited to enroll in pre-school); only 86% of the treatment group enrolled, and only 65% of the control group did *not* enroll:

Of the 1,931 kids randomized to the pre-school program, about 1,600 ever enrolled, and of the 1,200 kids randomized to not go into the pre-school program, about 400 ended up enrolling in spite of being assigned not to. At best the researchers can thus only reliably generate what is called an “intent to treat” estimate of the program’s effects (how much a child and parent’s outcomes change, on average, simply from being offered a spot in the program through the lottery), not how much participating in the program actually changes a child’s outcomes, and still less how pre-school changes outcomes in other times and places, or how the country would change if there were no pre-school programs at all.
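As a rough illustration of why the intent-to-treat number is not the effect of attending, the sketch below computes take-up in each arm from the rounded counts in the paragraph above and then applies the standard Wald (or Bloom-style) rescaling, which divides an intent-to-treat effect by the gap in take-up. This is a textbook adjustment, not necessarily what the study’s authors did, and the effect size fed into it (`hypothetical_itt`) is made up. Note that the rounded counts imply take-up rates a few points away from the percentages quoted earlier.

```python
# Counts as quoted in the text (approximate where the text says "about").
assigned_treatment = 1931
assigned_control = 1200
enrolled_treatment = 1600   # "about 1,600"
enrolled_control = 400      # "about 400"

takeup_treatment = enrolled_treatment / assigned_treatment
takeup_control = enrolled_control / assigned_control
takeup_gap = takeup_treatment - takeup_control


def effect_on_compliers(itt_effect, gap=takeup_gap):
    """Wald / Bloom-style rescaling: ITT effect divided by the take-up gap.

    Relies on extra assumptions (for example, that the offer affects
    outcomes only through enrollment) that the data cannot verify.
    """
    return itt_effect / gap


hypothetical_itt = -0.10  # made-up effect size, in standard deviations
print(f"take-up if offered a spot:     {takeup_treatment:.1%}")
print(f"take-up if not offered a spot: {takeup_control:.1%}")
print(f"gap in take-up:                {takeup_gap:.1%}")
print(f"ITT {hypothetical_itt:+.2f} -> effect on compliers "
      f"{effect_on_compliers(hypothetical_itt):+.2f}")
```

Because the lottery moved enrollment by only about half, any intent-to-treat effect roughly doubles when rescaled this way, and so does whatever bias it carries.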

In the case of the Tennessee study, even these kinds of approximations are probably overoptimistic for capturing long-run effects. The actual effects on 6th grade test scores, the longest-run outcomes they have available, are based on about 600 kids in the treatment group and 330 kids in the control group:

That’s 600 out of 1,931 randomized to the treatment group (31%) and 330 out of 1,200 randomized to the control group (27%). But remember, only 86% of the treatment group members actually enrolled in the pre-school, and only 68% of the control group members didn’t enroll. So in the end, only about 26% of the kids randomized to the treatment group actually did the pre-school program and did the 6th grade follow-up test, and only about 18% of kids randomized to the control group *didn’t* do the pre-school program and did the 6th grade follow-up test.

There are technical fixes and statistical patches to try to address non-response bias, cross-over, low take-up rates of the program, or differences in take-up rate by group, many of which the authors attempt to implement. But in the end, extrapolating from the differences between the 26% of the treatment group that was assigned to the pre-school program, actually enrolled in pre-school, and kept in touch through 6th grade, and the 18% of the control group that was assigned to not enroll in pre-school, did not in fact enroll, and kept in touch through 6th grade, seems like a fool’s errand. The extent of the attrition makes it overwhelmingly likely that the differences between kids, which seem to grow year by year, are something about the kids and not about the costs and benefits of a year in public pre-school most of a decade ago.

I haven’t seen any write-ups of the study that have picked out just how tiny the sample behind the authors’ inferences is; even Emily Oster, who mentions sample attrition, reads the paper as being based on 2,400 students followed through 6th grade, while the test score impacts are in fact based on only about 900 students out of 3,131 randomized.
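The arithmetic behind those shares can be written out as a short sketch. It uses the approximate counts and the 86% and 68% compliance rates quoted in this section, and it makes one simplifying assumption that the study does not confirm: that complying with the assignment and turning up for the 6th grade test are independent, so the joint share is just the product of the two rates.

```python
# Figures as quoted in the text; the two compliance rates are the ones
# used in this section (86% and 68%).
randomized = {"treatment": 1931, "control": 1200}
tested_6th_grade = {"treatment": 600, "control": 330}   # "about"
complied = {"treatment": 0.86, "control": 0.68}

share = {}
for arm in randomized:
    tested_share = tested_6th_grade[arm] / randomized[arm]
    # Simplifying assumption: compliance and follow-up are independent,
    # so the joint share is the product of the two rates.
    share[arm] = tested_share * complied[arm]
    print(f"{arm:9s} tested: {tested_share:.1%}  "
          f"tested and complied: {share[arm]:.1%}")

total_tested = sum(tested_6th_grade.values())
total_randomized = sum(randomized.values())
print(f"overall: {total_tested} of {total_randomized} tested "
      f"({total_tested / total_randomized:.1%})")
```

This reproduces the 26% and 18% figures to within a point of rounding. If compliers were more or less likely than non-compliers to stay in the follow-up sample, the true shares would differ, which is part of the problem.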

Whether or not the authors could have made clearer how tenuous the link between the experiment and their actual data is, there is an implicit problem in defining causality as the kind of thing you can learn from an experiment: any experiment becomes only a thing-in-itself, and not a clue to a broader order or understanding of the world. In education, there is a multiplicity of slogans posing as theories (“Using Our Understanding of How the Brain Works to Guide Instruction”) and a near infinitude of competing agendas and programs, but very little attempt in recent years to advance a coherent theory of how children develop and learn, in a way that acknowledges relevant facts instead of wishing them away. The Federal Government and private funders have recently put increased emphasis on RCTs as a way of separating ineffective programs from effective ones, but technique is only a tool and never a substitute for the business of trying to understand the world.

2 thoughts on “The Coin Flip”

  1. Great article! Thanks!

    Meanwhile, Econ Nobel Prize winner James Heckman is still making his absurd claims about how we’d get a huge ROI by having the government raise the children of America’s worst mothers, starting at birth, with the calculations based on the dubious results of the Perry Preschool and Abecedarian programs from more than 40 years ago.

    Invest in Early Childhood Development: Reduce Deficits, Strengthen the Economy

    Your article makes Heckman’s arguments look even more laughable. “Reduce Deficits! Strengthen the Economy!” LOL.
