If you have not heard, the NYC Dept. of Education released a report card assessing all of its 18,000 teachers. It is making a lot of noise, particularly here in the city. Read the story regardless of where you live; it is interesting, and it is a policy exercise that is, no doubt, coming to a theater near you.
What is striking, as I allude to in my title, is how closely the arguments here resemble those made by the physician community. The CMS ratings site will ramp up with substantive data in the upcoming decade, and the same fears of inadequate adjustment, sample size, and inaccurate information are front and center:
In simple terms, value-added models use mathematical formulas to predict how a group of students will do on each year’s tests based on their scores from the previous year, while accounting for factors that include race, gender, income level and other test results. If the students surpass expectations, their teacher is rated well — “above average” or “high” under New York’s models. If they fall short, the teacher is rated “below average” or “low.”
What many teachers point out is that the scores cannot account for many other factors: distractions on test day; supportive parents or tutors; allergies; a dog continually barking near the test site. There are also schools where students are taught by more than one teacher, making it hard to discern individual contributions. (The reports released by the city gave the same rankings to those teachers.)
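To make the quoted description concrete, here is a minimal, purely illustrative sketch of a value-added calculation: fit a prediction line from last year's scores to this year's, then score a teacher by how far the class beat (or missed) the prediction. All numbers and names are invented, and real models fold in race, gender, income level, and other covariates that this toy single-predictor version omits.

```python
# Toy value-added sketch (hypothetical data; prior score is the only
# predictor, whereas real models adjust for many more covariates).

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b

def value_added(prior, current, a, b):
    """Mean residual for one teacher's class: actual minus predicted."""
    residuals = [y - (a + b * x) for x, y in zip(prior, current)]
    return sum(residuals) / len(residuals)

# District-wide data (all students) used to fit the prediction line.
prior_all = [60, 65, 70, 75, 80, 85, 90]
curr_all = [62, 66, 71, 77, 81, 86, 92]
a, b = fit_line(prior_all, curr_all)

# One teacher's class: a positive score means the class beat the
# prediction ("above average"); a negative one means it fell short.
score = value_added([65, 75, 85], [70, 80, 88], a, b)
```

Note that everything the teachers complain about (barking dogs, tutors, shared classrooms) lands in that residual, which is exactly the objection: the residual is read as teacher effect, but it absorbs every unmodeled factor too.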
Additionally, the curve and the presentation do not differ much from hospitalcompare.gov, oddly enough:
In New York, the ratings cover teachers in fourth through eighth grades, because of when state tests are given. They are distributed on a curve, so that for 2009-10, 50 percent of teachers were ranked “average”; 20 percent each “above average” and “below average”; and 5 percent each “high” and “low.”
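The forced curve above can be sketched as a simple percentile-to-band lookup. This is my own illustrative reconstruction, not the city's actual method: teachers are ranked by score, and the cumulative cutoffs (5/25/75/95) carve out the 5-20-50-20-5 distribution the article describes.

```python
# Hypothetical sketch of the NYC-style forced curve: 5% "high",
# 20% "above average", 50% "average", 20% "below average", 5% "low".

def curve_label(percentile):
    """Map a percentile rank (0-100) to its band on the curve."""
    if percentile >= 95:
        return "high"
    if percentile >= 75:
        return "above average"
    if percentile >= 25:
        return "average"
    if percentile >= 5:
        return "below average"
    return "low"

def rank_teachers(scores):
    """Return {teacher: band} by percentile rank of value-added score."""
    n = len(scores)
    order = sorted(scores, key=scores.get)
    return {t: curve_label(100.0 * i / n) for i, t in enumerate(order)}

# Invented scores for five hypothetical teachers.
labels = rank_teachers({"A": -2.1, "B": 0.0, "C": 0.4, "D": 1.8, "E": 3.0})
```

The design point worth noticing: a forced curve guarantees that 25% of teachers are labeled below average or low every year, no matter how good the cohort is, which is part of why the outlier bands draw the most scrutiny.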
The question is what to make of this. On one hand, transparency is ideal, and on its face, the public clamors for it. On the other, in a rush to judgment, are we promoting tools that are not ready for prime time? My gut tells me the top and bottom 5% "outliers" are there for a reason; assess these folks with rigor and make the appropriate decisions (I assume this happens as a matter of course). Otherwise, be damn careful. It does seem ironic that a city official would offer the quote below in the context of this release. It is cautionary, and a good conclusion for the post:
“The purpose of these reports is not to look at any individual score in isolation, ever,” Shael Polakow-Suransky, the No. 2 official in the city’s Education Department, said Friday. “No principal would ever make a decision on this score alone, and we would never invite anyone — parents, reporters, principals, teachers — to draw a conclusion based on this score alone.”
Do you find that credible? Are you watching the news, the primaries, or popular culture play out in the media? Call me a cynic, but I do not believe it. The public, myself included, loves ratings, but with the imprimatur of a sanctioned distribution, these data just received the "Consumer Reports" stamp of approval. Let's see if teachers (and doctors) perform like toaster ovens.
UPDATE: And here is the blowback response from the UFT president. A future AMA president is taking notes. 🙂