
September 11, 2008

value-judgments in testing

It would be possible to create a valid and reliable test of the 10 greatest virtues of Saddam Hussein. Those virtues could even be facts about him: for example, that he was unafraid to die. Such a test would be morally worse--really worse, not just worse in my opinion--than a test of students' understanding of the First Amendment.

I write this to try to shake people's confidence in a prevalent theory about research, evaluation, assessment, testing, and accountability. This theory holds that measurement should be scientific. Everyone knows that evaluators always hold opinions and make value-judgments. But their values are often treated as problematic, as evidence of bias or subjectivity or political agendas. On this view, values should be disclosed, investigated, and minimized. The hold of that positivist theory remains strong even decades after it was rejected in philosophy.

The alternative, of course, is to say that when we evaluate, we make value-judgments. Some judgments are better than others. Our most important responsibility is to hold good values. Since our value-judgments differ, we'd better discuss them--not just to disclose them and acknowledge our differences, but to reason together about what is right.

I currently serve on a federal test committee. We receive "items" (test questions) written by consultants. We reject some proposed questions on scientific grounds. For instance, when tested in a lab, some items prove to be confusing for reasons unexpected by the writers. That is an empirical finding that should matter. We also rely on scientific expertise to tell us how many questions we need to obtain a reliable measure, how many kids need to be tested to make estimates about populations, and so on.

But ultimately the item-writers choose questions because of their beliefs about what kids should know. They are guided by written standards, which are themselves statements of moral value, albeit rather vague ones. When we on the committee reject questions, it is usually because of our values. For instance, we may say that a topic is trivial. We have expertise, but that really means that we have clawed our way into jobs that allow us to express opinions about what is important. We also decide how difficult each question is. That depends somewhat on empirical evidence about what average kids actually know. But it also essentially depends on what we think they should know.

I don't believe that the irreducibly moral nature of testing and evaluation is a problem. It reflects the irreducibly moral nature of everything that matters in life. Nor is it necessarily a mistake to hire experts and consultants to write tests. We need reasonably independent, experienced, committed referees who can focus intensely on the task of evaluating kids. What is a mistake is to interpret the results of a test as "scientific" or to regard the intrusion of values as "bias" or as "politics." The only alternatives to "politics" are boring homogeneity, spurious objectivity, utter thoughtlessness, or a dictatorship.

September 11, 2008 8:55 AM | category: education policy

