Metrics are often used to estimate how well an objective is being achieved when measuring the objective directly is difficult or impossible. For example, a school district’s objective may be to maximize the number of students proficient in reading. Since reading skill can’t be observed directly through brain scans, many school districts use standardized tests to measure it. Obviously, what people actually care about is how well the students can read, not how well they can fill out a test. But tests are used to try to measure how well schools are doing at teaching.

The problem arises when people (or whole institutions) start to target the metric itself instead of the real objective. You are probably aware of the phenomenon of ‘teaching to the test’: this is when teaching is at least partially directed towards getting students to perform better on the test, instead of solely towards developing the actual skills. Directly targeting the metric hurts the achievement of the actual objective, and furthermore the metric no longer accurately estimates how well the actual objective is being achieved. Goodhart’s law (in its most common wording) sums this up:

“When a measure becomes a target, it ceases to be a good measure.”

Here’s a visual example demonstrating this phenomenon. You can think of targeting an objective like throwing darts, aiming to get as close as possible to the green target. However, you don’t learn how close your dart actually lands to the objective; you only learn how close it landed to this other blue target, i.e. the value of the metric. Here the red X marks where your dart happened to land, and it’s a distance of 20 away from the blue target and 30 from the green target.

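[Image: a red X marks where the dart landed, 20 away from the blue target and 30 away from the green target.]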

Now, as long as you continue to aim only for the green target, the distance to the blue target will give an imprecise but probably still helpful estimate of how close you are. But if you start aiming directly for the blue target, this causes:

  1. Your distance to the green target to probably increase
  2. Your distance to the blue target to be a less accurate estimate of distance to the green target
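
The dart picture can be turned into a small simulation. What follows is only a rough sketch under assumptions that are mine, not part of the example above: the two targets sit 25 units apart, each throw scatters around the aim point with a standard deviation of 10 per axis, and `aim_shift` is a made-up knob for how far the aim has drifted from the green target (0.0) towards the blue one (1.0).

```python
import math
import random

GREEN = (0.0, 0.0)   # the real objective
BLUE = (25.0, 0.0)   # the metric; the 25-unit gap is an assumed value
SPREAD = 10.0        # assumed standard deviation of a throw, per axis


def mean_distances(aim_shift, throws=20000, seed=0):
    """Return (average distance to green, average distance to blue).

    aim_shift = 0.0 means aiming at the green target,
    aim_shift = 1.0 means aiming at the blue target.
    """
    rng = random.Random(seed)
    # The aim point lies on the straight line between the two targets.
    aim_x = GREEN[0] + aim_shift * (BLUE[0] - GREEN[0])
    aim_y = GREEN[1] + aim_shift * (BLUE[1] - GREEN[1])
    to_green = 0.0
    to_blue = 0.0
    for _ in range(throws):
        # Each dart lands near the aim point, with Gaussian scatter.
        x = rng.gauss(aim_x, SPREAD)
        y = rng.gauss(aim_y, SPREAD)
        to_green += math.hypot(x - GREEN[0], y - GREEN[1])
        to_blue += math.hypot(x - BLUE[0], y - BLUE[1])
    return to_green / throws, to_blue / throws


if __name__ == "__main__":
    for shift in (0.0, 0.5, 1.0):
        green, blue = mean_distances(shift)
        print(f"aim shift {shift:.1f}: green {green:5.1f}, blue {blue:5.1f}")
```

In this toy setup, as the aim drifts towards the blue target the average distance to blue falls while the average distance to green rises. The number you get to see keeps improving exactly when the thing you care about is getting worse, so a small blue distance stops being evidence of a good throw.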

Why is this general problem so common? The first reason, and this is my own speculation, is just human nature. When people work on something for a long time and receive only limited feedback on their performance (in the form of a metric such as their students’ test scores), the line between the metric and the actual objective can slowly blur. If you get a hit of psychological reward whenever you land close to the blue target, then you can slowly drift towards aiming right for the blue target. That’s why it’s helpful to take a mental step back from time to time and reorient yourself towards whatever your real objective is.

The second and probably more important reason is that people (or institutions) are often incentivized by poorly designed rules to target the metric instead of the objective. Imagine if you were told to aim for the green target, but paid based on how close you got to the blue target. As a real-world example, when school funding depends on test scores, it shouldn’t be a surprise when schools start teaching to the test. This is just another example of how important it is for incentives to be well designed. Keep in mind that this problem cannot always be avoided entirely with smarter incentives: if there aren’t any perfect metrics then there may be no perfect way to reward performance. But we don’t aim for perfect, we aim for as good as possible.
