In today’s Annals of Internal Medicine, my colleagues and I describe the saga of the four-hour measure of door-to-antibiotics time for pneumonia – the first truly dangerous measure in the era of public quality reporting. It is an important cautionary tale.
As I’ve discussed previously, the biggest surprise of the last decade in the quality field has been this: public reporting alone (even without pay-for-performance) leads to huge changes in the behavior of providers and healthcare organizations… even though there is virtually no evidence that patients are reading or acting on the reports.
In other words, shame and pride are powerful forces for change.
That’s great, but it means that the stakes are high – those developing and promoting publicly reported quality measures have to get it right, since front line folks are likely to respond to them by changing their clinical practice.
I won’t go through all the flaws in the Time to First Antibiotics Dose (TFAD) measure, or how it came into being – I hope you’ll take a few minutes to read the article. But the bottom line is that a measure with weak evidence was implemented without any field-testing or input from the impacted physicians (mainly ED docs). As we (my coauthors are Chris Fee, an ED faculty member at UCSF, Scott Flanders, my old colleague who now runs the University of Michigan’s hospitalist program and is an expert on pneumonia, and the ubiquitous Peter Pronovost) wrote in the Annals article,
In the days before measurement of TFAD, patients with uncertain diagnoses would continue to be evaluated until the diagnosis was clarified. However, the TFAD standard completely transformed the dynamic: Faced with a patient who might have pneumonia, the emergency medicine physician now has a strong incentive (almost always buttressed by social pressure and sometimes by financial incentives) to give antibiotics before 4 hours have passed, even when he or she is still unsure of the diagnosis.
We go on to highlight the lessons to be learned from this failed measure – I say failed because multiple studies (for example, here and here) have documented that thousands of patients have received antibiotics for what ultimately proved to be heart failure, pulmonary embolism, or other non-infectious illnesses. These lessons include:
- First, results from studies of patients with known diagnoses should be extrapolated cautiously, if at all, to patients who lack a diagnosis.
- Second, for some measures, “bands” of performance (e.g., 80-100% adherence) may make more sense than “all-or-nothing” expectations.
- Third, representative end users of quality measures (in this case, ED docs and hospitalists) should participate in measure development.
- Fourth, quality measurement and reporting programs should build in mechanisms to reassess measures over time. In this case, CMS and the Joint Commission are to be praised for listening to the chorus of criticism: in response, the measure has been revised from a 4-hour to a 6-hour standard. Even though a 6-hour TFAD rule is still not evidence-based, it should cause less harm.
- Finally, biases, both financial and intellectual, that may influence quality measure development should be minimized. The TFAD measure was proposed and endorsed by many of the same people who conducted the foundational studies. None of us can be completely unbiased when evaluating our own research results.
The problems with the TFAD measure bring us back to the larger debate about the role of evidence in quality and safety measures. I believe the TFAD story supports our contention that “commonsensical” but unproven measures can actually cause harm.
Now, before I get toooo critical, it is worth remembering that a generation ago, we didn’t believe that peptic ulcer disease was caused by an infection; now we know it to be so. Perhaps a generation from now we’ll learn that antibiotics really were appropriate treatment for all those patients with heart failure and PE.
But I don’t think so. Instead, a flawed quality measure led to a lot of unnecessary antibiotic treatment (along with its accompanying risk of side effects, promotion of drug resistance, and C. diff). That’s a bad thing.
Just as clinicians need to learn from their mistakes in their patient care, so too should policy makers. In that spirit, I like to think of the TFAD measure as a patient safety policy Sentinel Event. The Annals article is our attempt at a root cause analysis and action plan.
I’d welcome any feedback on the article or the issues it addresses.
Bob,
Congratulations on your presentation and publication. Such treatment guidelines, measures, and rules result in cognitive disruption and paralysis similar to that attributed to non-evidence based HIT by Hartzband and Groopman (NEJM April, 2008). They also promote the erroneous diagnosis and vindicate the diagnostic errors you described in your post of June 2. It is widely assumed by the politicians, bureaucrats, HIT device makers, and the C-suite folks that doctors have lost their ability to think and need the rigidity of non-evidence based CPOE and the decision support of HIT to “guide” (control) their clinical judgment. By virtue of non-evidence based requirements such as TFAD, physician intelligence is truly becoming artificial.
Best regards,
Menoalittle
Dear Dr. Wachter, Extremely well said, and I would second the comments of Menoalittle. It is most distressing that over the last few decades the integrity of our clinical care has lost the confidence of the public, including voters and legislators. To regain their trust (and more cynically, their funding) we adhere to measures that, as you point out, may not help more than harm; furthermore, the processes of collecting such data cost so much more money and require so much time that it makes care more costly and less efficient, and clinicians less directly involved with patients. The resulting image of an uncaring and expensive care system only makes the public that much more skeptical of our motivation to truly provide good care, and makes them that much more likely to pursue the (oxymoronic) goal of quantifying the quality of what we do.
The retrospectoscope is, of course, a near-perfect instrument, but it still seems to me that this chain of events should have been more obvious to those who sat on the committees that implemented the pathway. The diagnosis of pneumonia has always been something folks argued about, and it sometimes takes hours or even days to become obvious. Every clinician knows this. So Bob, how did this one get by? Do you have any more insider’s viewpoint on this? Was it only, as you point out in the article, that there were no people on the decision committees who work in the front lines seeing these patients as they come through the door?
Thanks, Chris. I know a few folks on the committee: highly qualified, competent people, by and large. I just think that the world looks very different to ID or pulmonary specialists, who are only called to see pneumonia patients with obvious diagnoses (and often complex courses that sometimes appear to be attributable to late initiation of therapy), than to a “first responder.” And, to a clinical researcher, the diagnosis of pneumonia probably seems like a no-brainer; I’m not sure the diagnostic challenges faced by the ED physician would be intuitively obvious to a non-clinician.
Live and learn…
Hi Bob Wachter,
Thank you for sharing this article. I really love the content of it.