The Iron Law of Evaluation (Rossi, 1987) is that the expected value of any net impact assessment of any large scale social program is zero.
- The Iron Law of Evaluation:
The expected value of any net impact assessment of any large scale social program is zero. The Iron Law arises from the experience that few impact assessments of large scale social programs have found that the programs in question had any net impact. The law also means that, based on the evaluation efforts of the last twenty years, the best a priori estimate of the net impact of any program is zero, i.e., that the program will have no effect.
- The Stainless Steel Law of Evaluation:
The better designed the impact assessment of a social program, the more likely is the resulting estimate of net impact to be zero.
This law means that the more technically rigorous the net impact assessment, the more likely its results are to be zero, i.e., no effect. In particular, this law implies that randomized controlled experiments, the avowedly best approach to estimating net impacts, are more likely to show zero effects than other, less rigorous approaches.
- The Brass Law of Evaluation:
The more social programs are designed to change individuals, the more likely the net impact of the program will be zero.
This law means that social programs designed to rehabilitate individuals by changing them in some way or another are more likely to fail. The Brass Law may appear redundant, since all programs, including those designed to deal with individuals, are covered by the Iron Law. The redundancy is intentional: it emphasizes how especially difficult it is to design and implement effective programs that rehabilitate individuals.
- The Zinc Law of Evaluation:
Only those programs that are likely to fail are evaluated.
Of course, if the expected value is zero, that could just mean that there are as many negatives as positives. This is more likely to occur, of course, in preregistered randomized trials, where the researchers have fewer degrees of freedom to tiptoe through the tulips in the garden of forking paths. (This is a version of the "Stainless Steel Law" of policy evaluation, above.) It is also more likely to happen when researchers' incentives are to appear (or to be) unbiased, rather than to support a particular intervention. Carol Dweck is never, ever going to discover that growth mindset is actively harmful, no matter what trial she runs or what outcome she examines, even though her interventions are almost comical examples of the "Brass Law" above: interventions trying to change individuals' behaviors and thoughts. But it does happen.
For example, the federal government's largest-scale randomized trial of labor programs for assisting low-income people in the 2000s, the Employment, Retention, and Advancement program, found statistically significant positive effects for three programs, statistically significant negative effects for two, and null effects for the other seven. The evaluation of Building Strong Families, the Bush Administration's $800+ million initiative to encourage marriage among low-income parents, found null effects in six programs, positive effects in one (Oklahoma), and strong negative effects in another (Baltimore), as well as statistically significant negative impacts on two outcomes when the eight programs were aggregated.
Why does this happen? The simple answer is that in a largely rich, largely free country, with many existing (if confusing) private and public supports for low-income people, it's just as easy to screw things up as to make things better, no matter how much you spend.