Value-Added Measurement Failed — And Here's a Fundamental Reason Why
In this video, Dr. Justin Baeder discusses why Value-Added Measurement (VAM) for teacher evaluation failed, focusing on the fundamental problem that we can't isolate teacher impact from other variables.
Key Takeaways
- Teacher impact can't be isolated - Student achievement reflects too many factors to attribute gains or losses to a single teacher
- VAM was built on a flawed premise - The assumption that statistical models could separate teacher effects from everything else was never sound
- We need better evaluation approaches - Evaluating what teachers actually do in classrooms is more reliable than trying to measure their statistical impact
Transcript
Teacher performance is not a real thing that can be measured, and I want to talk about some of the science behind this. About 15 years ago, the Gates Foundation started putting a lot of money into what's now called value-added assessment of teachers: looking at student test scores and saying, okay, can we evaluate the teacher based on student test scores? And what they found out, they really didn't want to say this directly, but they released a report in 2013 that basically admitted: no, we cannot measure teacher performance through student test scores.
And there are a couple of reasons for that.
One is that it's not a stable construct, right?
You take the same teacher and look at different classes during the same year, or look at the same teacher over time, and you don't get any kind of stable number that allows you to say with confidence, yes, this number represents this teacher's performance.
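The instability claim can be illustrated with a minimal simulation sketch (this is not from the video; the teacher effect, noise level, and class size are hypothetical numbers chosen only to show the statistical point). A teacher with a fixed "true" effect still produces very different class-average gains from year to year, because student-level noise swamps the effect at typical class sizes.

```python
import random

random.seed(0)

TRUE_TEACHER_EFFECT = 2.0  # hypothetical: the same teacher every year
CLASS_SIZE = 25
NOISE_SD = 10.0  # hypothetical: student-level variation in score gains

# One "VAM score" per year: the class-average gain for the same teacher.
estimates = []
for year in range(5):
    gains = [TRUE_TEACHER_EFFECT + random.gauss(0, NOISE_SD)
             for _ in range(CLASS_SIZE)]
    estimates.append(sum(gains) / CLASS_SIZE)

print([round(e, 2) for e in estimates])
```

With these numbers, the standard error of a class average is about `NOISE_SD / sqrt(CLASS_SIZE)` = 2.0, the same size as the teacher effect itself, so the year-to-year estimates bounce around and can even go negative for a genuinely effective teacher.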
I do believe that we can rate and evaluate any professional.
I mean, we have to do that.
But the idea that there's some sort of number that you can derive from student test scores and use that to rate teachers just does not hold up.
It's not a property of teachers.
Like imagine if someone said, we're going to measure your height.
And you said, oh, how are you going to measure my height?
Are you going to get a tape measure out or what are you going to do?
And they said, well, no, we're going to measure the height and favorite food of all of your students.
And then we're going to make an average based on that.
You would say, what are you talking about? That's not a measure of my height. That's something about students, and it doesn't even make sense.
Well, that's kind of how VAM works.
That's how value added assessment works.
And there were even some cases where PE teachers would be evaluated with a VAM score that was based on how the kids did in their language arts class. All these things didn't make any sense. But the fundamental reality underlying all of these failures was that, at its heart, teacher performance is not a real thing.
And the main scientific reason that teacher performance is not a real thing is the lack of random assignment.
And that happens at two levels.
First, students are not assigned randomly to teachers within a school.
And you know this if you've ever participated in student placement.
You have to balance every class, right?
If you do random assignment in schools, you will get some classes that are just lopsided because of the randomness.
And you think, I would never want to teach that class.
We're going to move people around so we have good classes.
So every school does that.
Random assignment is not the reality.
So any kind of sampling or averaging that we do has to keep in mind that these are not real averages, because there's no random assignment.
The second level at which we don't have random assignment is to schools.
So this is not even a problem that schools can fix by doing random assignment within their school.
Your students were not randomly assigned to your school.
So the idea that there is any kind of fairness to how students are assigned just goes out the window.
We cannot average student test scores and say anything about the teacher if we don't have random assignment.
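The non-random-assignment point can also be sketched in a few lines of Python (again, not from the video; the effect sizes and the "background" term are hypothetical). Two teachers with identical true effects get very different class-average "value-added" scores when their classes are composed non-randomly, e.g. one class draws students with more outside-of-school support.

```python
import random

random.seed(1)

TEACHER_EFFECT = 2.0  # hypothetical: identical for both teachers
CLASS_SIZE = 25

# Non-random placement: each student's gain mixes the teacher effect with a
# background factor (support, resources, prior preparation) that differs
# systematically between the two classes.
class_a = [TEACHER_EFFECT + random.gauss(+5, 3) for _ in range(CLASS_SIZE)]
class_b = [TEACHER_EFFECT + random.gauss(-5, 3) for _ in range(CLASS_SIZE)]

vam_a = sum(class_a) / CLASS_SIZE
vam_b = sum(class_b) / CLASS_SIZE

# Identical teachers, very different "value-added" scores.
print(round(vam_a, 2), round(vam_b, 2))
```

The averages differ by roughly the gap in student background, not by anything the teachers did; only random assignment would make the background term wash out in expectation.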
So let me know what you think and what you're hearing these days about that kind of approach to rating and evaluating teachers.