The Observability Bias: A Crisis in Instructional Leadership

Our profession is facing a crisis of credibility:

We often don't know good practice when we see it.

Two observers can watch the same lesson and draw very different conclusions. Yet we mischaracterize the nature of the problem.

We think this is a problem of inter-rater reliability. We define it as a calibration issue.

But it's not.

Calibration training (getting administrators to rate the same video clip the same way) won't fix this problem. The crisis runs deeper, because it stems from a fundamental misunderstanding of the nature of teaching.

See, we have an “observability bias” crisis in our profession.

I don't mean that observers are biased. I mean that we've warped our understanding of teacher practice, so that we pay a great deal of attention to those aspects of teaching that are easily observed and assessed…

…while undervaluing and overlooking the harder-to-observe aspects of teacher practice, like exercising professional judgment.

We pay a great deal of attention to surface-level features of teaching, like whether the objective is written on the board… Yet we don't even bother to ask deeper questions, like “How is this lesson based on what the teacher discovered from students' work yesterday?”

The Danielson Framework is easily the best rubric for understanding teacher practice, because it avoids this bias toward the observable, and doesn't shy away from prioritizing hard-to-observe aspects of practice.

Charlotte Danielson writes:

“Teaching entails expertise; like other professions, professionalism in teaching requires complex decision making in conditions of uncertainty.
…
If one acknowledges, as one must, the cognitive nature of teaching, then conversations about teaching must be about the cognition.”

—Talk About Teaching, pp. 6-7, emphasis in original

When we forget that teaching is, fundamentally, cognition—not a song and dance at the front of the room—we can distort teaching by emphasizing the wrong “look-fors” in our instructional leadership work.

It's exceptionally easy to see this problem in the case of questioning strategies, vis-à-vis Bloom's Taxonomy and Webb's Depth of Knowledge (DoK).

I like Bloom's Taxonomy and DoK. They're great ways to think about the variety of questions we're asking, and to make sure we're asking students to do the right type of intellectual work given our instructional purpose.

But the pervasive bias toward the easily observable has resulted in what we might call “Rigor for Dummies.”

Rigor for Dummies works like this:

If you're asking higher-order questions, you're providing rigorous instruction.
If you're asking factual recall or other lower-level questions, that's not rigorous.

Now, to some readers, this will sound too stupid to be true, but I promise, this is what administrators are telling teachers.

Observability bias at work. It's happening every day, all around the US: Administrators are giving teachers feedback that they need to make their questioning more “rigorous” by asking more higher-order questions, and avoiding DoK-1 questions.

Never mind that neither Bloom nor Webb ever said we should avoid factual-level questions. Never mind that no rigor expert believes factual knowledge is unimportant.

We want rigor, so we ask ourselves “What does rigor look like?” Then, we come up with the most reductive, oversimplified definition of rigor, so we can assess it without ever talking to the teacher.

My friend, this will never work.

We simply cannot understand a teacher's practice without talking with the teacher. Observation alone can't give us true insight into teacher practice.

Why?

Back to Danielson: Because teaching is cognitive work.

It's not just behavior.

It can't be reduced to “look-fors” that you can assess in a drive-by observation and check off on a feedback form.

The Danielson Framework gives us another great example.

Component 1c, in Domain 1 (Planning and Preparation), is “Setting Instructional Outcomes.”

(This is a teacher evaluation criterion for at least 40% of teachers in the US.)

How well a teacher sets instructional outcomes is fairly hard to assess based on a single direct observation.

Danielson describes “Proficient” practice in this area as follows:

“Most outcomes represent rigorous and important learning in the discipline and are clear, are written in the form of student learning, and suggest viable methods of assessment. Outcomes reflect several different types of learning and opportunities for coordination, and they are differentiated, in whatever way is needed, for different groups of students.” (Danielson, Framework for Teaching, 2013)

Is that a great definition? Yes!

But it's hard to observe, so we reduce it to something that's easier to document. We reduce it to “Is the learning target written on the board?”

(And if we're really serious, we might also ask that the teacher cite the standards the lesson addresses, and word the objective in student-friendly “I can…” or “We will…” language.)

Don't get me wrong—clarity is great. Letting teachers know exactly what good practice looks like is incredibly helpful—especially if they're struggling.

And for solid teachers to move from good to great, they need a clearly defined growth pathway, describing the next level of excellence.

But let's not be reductive. Let's not squeeze out all the critical cognitive aspects of teaching, just because they're harder for us to observe.

Let's embrace the fact that teaching is complex intellectual work.

Let's accept the reality that to give teachers useful feedback, we can't just observe and fill out a form.

We must have a conversation. We must listen. We must inquire about teachers' invisible thinking, not just their observable behavior.

What do you think?

Are you seeing the same reductive “observability bias” at work in instructional leadership practice?

In what areas of teacher practice? Leave a comment and let me know.
