ICU Glycemic Control: Another Can’t Miss Quality Measure Bites the Dust

A disconcerting pattern has emerged: a blockbuster study finds that a certain practice leads to improved outcomes. Large national organizations codify the practice into a quality measure, forcing widespread adoption. Later studies prove the practice to be unhelpful, perhaps even dangerous.


Think about it – we’ve now seen quality measures that prompted the use of boatloads of unnecessary antibiotics (“door-to-antibiotic time” in patients with suspected community-acquired pneumonia), “can’t miss” quality measures that proved wrong (giving beta blockers to every perioperative patient), quality measures that promote gamesmanship and box-checking as a surrogate for meaningful action (smoking cessation counseling), and quality measures that trade efforts to prevent one kind of harm (preventing falls) for another (tethering some elderly hospitalized patients to their beds, leading to deconditioning and pressure ulcers).

Let’s now add tight glucose control in critically ill patients to the Hall of Hiccups.

A multicenter study of ICU patients in Australia and New Zealand (NICE-SUGAR) in this week’s NEJM demonstrates that tight glucose control – as pathophysiologically sensible as can be (remember those vivid movies of drunken white blood cells vainly trying to breaststroke their way through sugary serum to reach sites of infection) – resulted in significant harm. The study found that driving glucoses down to the range of 81-108 mg/dL led to a 90-day mortality rate of 27.5% (vs. 24.9% in controls). In other words, intensive insulin therapy was associated with a nearly 15% higher odds of death. The exact mechanism of harm is unclear; some of it was undoubtedly from severe hypoglycemia (272/3016 episodes in the tight control group, vs. 16/3014 in the controls), but there may have been other undefined detrimental effects.
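For readers who want to check the "nearly 15% higher odds" figure, it follows from the two mortality rates quoted above. Here is a minimal sketch in Python, assuming only the rounded percentages given in the text (not the trial's raw counts or any adjusted estimate); the function and variable names are illustrative.

```python
def odds(probability):
    """Convert a probability (0-1) into odds."""
    return probability / (1 - probability)

# Rounded 90-day mortality rates quoted in the post (assumption: no raw counts used)
intensive_mortality = 0.275
control_mortality = 0.249

# Absolute difference in mortality, in percentage points
absolute_difference = intensive_mortality - control_mortality

# Relative risk compares the probabilities directly
relative_risk = intensive_mortality / control_mortality

# Odds ratio compares the odds, which is what "higher odds of death" refers to
odds_ratio = odds(intensive_mortality) / odds(control_mortality)

print(f"Absolute difference: {absolute_difference * 100:.1f} percentage points")
print(f"Relative risk: {relative_risk:.2f}")
print(f"Odds ratio: {odds_ratio:.2f}")  # about 1.14, i.e. roughly 14% higher odds
```

Note that the odds ratio (about 1.14) is a little larger than the relative risk (about 1.10); the two measures diverge as the underlying event rate rises.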

Tight glucose control first hit the charts in 2001, when a single-site, unblinded study of surgical ICU patients in Leuven, Belgium, reported a whopping 34% reduction in mortality. The study setting was atypical in many ways: two-thirds of the patients were recovering from CT surgery (and the control post-op patients had an unusually high mortality rate), the nurse-to-patient ratio was 1:1 (vs. 1:2 in most U.S. ICUs), and the nurses were often backed up by a study physician.

Nevertheless, based nearly entirely on this single (and singular) result, tight glucose control became a standard-of-care practice for all ICU patients, and even some non-ICU patients. It was also integrated into “bundles” in other quality initiatives, such as the Surviving Sepsis campaign and the Surgical Care Improvement Project (SCIP).

Perhaps unsurprisingly (given Leuven’s rarefied conditions), subsequent studies in other populations (medical ICU patients, for example) were less impressive; most [two nice reviews are here and here] found that tight control yielded no benefit. Two studies [here and here] were even stopped early because of unacceptably high rates of hypoglycemia. And now there’s NICE-SUGAR, whose results, if you’re an intensive insulin fan, are anything but “nice.”

Hypoglycemia isn’t the only risk of intensive insulin therapy; it is also a resource hog. Observing efforts to administer intensive insulin therapy in our UCSF critical care units, our ICU director Mike Gropper noted that

Each glucose determination required 7 minutes of nursing time; a nurse caring for 2 patients on the insulin protocol would spend approximately 2 hours of a 12-hour shift monitoring the patient, obtaining samples, performing tests, and intervening.

In other words, in a 16-bed ICU in which nearly half the patients might have “indications” for intensive insulin therapy, approximately 1 FTE of nursing time is focused on glucose control, around-the-clock. For our 80 ICU beds, that would sum up to about 4-5 FTEs. The yearly cost of this would run into the millions of dollars (if the nurse time were paid for); alternatively, the cost might be exacted in patient care if the nurse-to-patient ratio went unchanged and the nurses found the time by doing fewer of their other important tasks.
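The staffing estimate can be reproduced from the quoted figures. Here is a rough sketch in Python, assuming exactly half the beds are on the protocol and taking "2 hours of a 12-hour shift for 2 patients" at face value; the function name and the 50% default are illustrative assumptions, not part of the original estimate.

```python
HOURS_PER_SHIFT = 12
# From the quote: a nurse with 2 protocol patients spends about 2 hours per shift on glucose control
GLUCOSE_HOURS_PER_PATIENT_PER_SHIFT = 2 / 2

def round_the_clock_nurses(beds, fraction_on_protocol=0.5):
    """Continuously staffed nursing positions consumed by glucose monitoring."""
    patients_on_protocol = beds * fraction_on_protocol
    nurse_hours_per_shift = patients_on_protocol * GLUCOSE_HOURS_PER_PATIENT_PER_SHIFT
    return nurse_hours_per_shift / HOURS_PER_SHIFT

print(f"16-bed ICU: {round_the_clock_nurses(16):.2f} positions")
print(f"80 ICU beds: {round_the_clock_nurses(80):.2f} positions")
```

Taken literally, these inputs give roughly two-thirds of a position for a 16-bed unit and a bit over three for 80 beds, somewhat below the rounded "1 FTE" and "4-5 FTEs" in the text. Either way, the order of magnitude is the same: several nurses' worth of time, around the clock.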

The bottom line is that when we chisel a practice into the Tablet of Quality Measures and bring it down the Mountain, it’s a very big deal – when we get one wrong, we can do an awful lot of harm. And it seems like we’ve gotten quite a few wrong lately.

Although many, including me, have long lamented the glacial pace of adoption of evidence-based practices in American medicine (often 5-15 years after the emergence of truly robust evidence supporting a practice), this traditional time lag did have one virtue: it allowed the literature to mature. Not infrequently, after an early trial showed benefit, later studies – performed under less controlled circumstances by less committed investigators with fewer dedicated resources – found no benefit or even harm. By the time the laggards were ready to consider changing their practice, the practice had been debunked by new evidence.

My friend and colleague Kaveh Shojania wrote a terrific piece on this phenomenon in our web-based safety journal, AHRQ WebM&M. Kaveh described his approach to the single blockbuster study that shows staggering benefit:

No single principle can encapsulate all of the interpretive issues for a body of literature, but “nothing works that well” comes close. To allow some wiggle room for the discovery of penicillin and other occasional quantum leaps in medical care, we can tone it down a little: “most things don’t work that well.” The relevant corollary is that any study reporting dramatic improvements in any major clinical outcome is probably flawed. When clinical interventions do work, they tend to bring very modest gains: relative improvements of 20% to 40% are often cause for celebration, and absolute improvements in the 5% to 10% range represent major advances in care. If an article reports improvements in these ranges, scrutinize it closely. If the improvements exceed these ranges, expect subsequent studies to show less impressive effects, or even no benefit.

Today’s environment of quality metrics, public reporting, and P4P has turbo-charged the adoption curve. Just as the 24-hour news cycle sometimes makes us too impatient to give policy initiatives a chance to ripen (see Joe Klein’s thoughtful essay on this vis-à-vis Obama’s economic policies in last week’s Time magazine), this shorter adoption cycle – taking a single positive study and mainlining it into a quality standard – risks disseminating some practices that will prove to be unhelpful, or even harmful.

Yes, we need more quality and safety measures to promote improvement. But we don’t need them so badly that we can’t afford to wait until the practices that look good in early studies, in atypical environments, are road tested in more typical settings to ensure that they really work.


  1. menoalittle on March 30, 2009 at 1:13 pm


    Another timely report. The same concept of chiseling “a practice into the Tablet of Quality Measures and bring it down the Mountain,…we can do an awful lot of harm.”
    In the case of HIT and CPOE devices, the harm will be measured economically ($billions) and in morbidity and mortality. The safety and efficacy of these care-altering devices is unknown but dreamed to be the panacea. AHRQ ought to devote resources to establish the truth. The Koppel Commentary in JAMA (linked here) begins to expose how the HIT vendors collude with the HIT happy corner suite inhabitants with contractual nondisclosure (of patient injury due to HIT) and “hold harmless” (the HIT maker), and blame all patient endangerment, injury, or death on the healthcare professional user (the “learned intermediary”).

    And the government edict is to use these devices to measure results to establish comparative effectiveness policies when the actual impact on care and outcomes of these devices has not been subjected to methodological study…but the industry supported trade group CCHIT offers a “certification”, which is essentially meaningless when it comes to clinical care. Let’s spend $20 billion on nurses instead.

    Is intense glucose control more comparatively effective than doctor directed glucose control individualized for the clinical setting? Will glucose control directed by a CPOE device beget better results and which company’s device will enable better outcomes, if any?

    Best regards,


  2. jnmed on March 30, 2009 at 6:39 pm

    I like the idea of a mandatory “waiting period” for the adoption of new outcomes research into standard practices- it would put medicine on par with the Vatican (canonization) or the Baseball Hall of Fame.

  3. Annie on April 2, 2009 at 4:22 pm

    You touched upon the multidisciplinary aspect of this, and I wonder:

    Are there any formalized mechanisms of pursuing evidence-based practice to integrate the practices of nursing and medicine so that the clinical problems are researched, developed, applied and evaluated using both nursing and medical research?

    Many of the patient outcomes are dependent on the professional judgment, decision-making and patient education, coaching and care coordination of nurses. It seems to me that to look at evidence-based practice and P4P in medicine by segregating out nursing, skews the data and outcomes and may very well punish and reward physicians for aspects of care they do not directly control and influence.

    Likewise, there is a lot of important nursing research which isn’t being applied to patients’ and nurses’ benefit, as joint appointments are rare in nursing, and bench to bedside research application isn’t running at lightning speed. The majority of clinical nurses hold AA degrees (only about 1/3 of nurses are educated at the bacc. level or higher), most aren’t well read in nursing research or have any contact at all with nurse researchers and academicians, and even fewer are adept at applying nursing research to particular patient populations.

    This doesn’t seem to be a sustainable model for evidence-based practice.

    The NYT has a piece about this today. Link at my name.

  4. bratzler on April 10, 2009 at 8:49 pm

    There is no doubt that the implementation of some quality initiatives results in unintended consequences and the potential for patient harm. Despite the fact that performance measures such as those used for public accountability by CMS and The Joint Commission are required to go through the consensus development process at the National Quality Forum, the level of evidence to support individual measures varies. Clearly the implementation of performance measures is even more variable and there is always the potential for unintended consequences – either direct harm to patients through inappropriate application of the measure construct at the patient level, or the indirect harm of focusing on measurement rather than other and perhaps more important quality initiatives. I do know that some private insurers have pushed agendas and measurement strategies that go far beyond some of the national initiatives and some have been driving the inappropriate goal to achieve 100% performance (“perfect care”) on measures that were not designed to exclude all clinical exceptions to the measure specifications.

    That said, I am tiring of some of the criticisms related to quality initiatives because the authors of those criticisms often fall victim to the same practices that they criticize. It seems to be increasingly common for opinion pieces, editorials, anecdotal reports, underpowered studies, and single-institution studies to be used to suggest that quality initiatives are resulting in widespread patient harm. Frankly, I have not seen systematic evidence of that for most national quality initiatives and, in some cases, have data to suggest that for many of the conditions targeted in those initiatives, patient outcomes are slowly but progressively improving. I am always cautious not to attribute improvements in outcomes directly to the quality initiatives… but am looking for systematic evidence that an initiative is actually causing harm on a wide scale.

    Statements such as “quality measures that prompted the use of boatloads of unnecessary antibiotics” or measures that require “giving beta blockers to every perioperative patient” are increasingly common and popular in the medical and lay press right now. However there is little evidence to support either of these statements and many others that are made. For example, the only systematic study of antibiotic use for pneumonia patients across hundreds of emergency departments showed no change in the pattern of pneumonia diagnosis and no change in rates of antibiotic prescribing for non-pneumonia cases during the time of measure implementation; and the only national measure that I know of related to use of perioperative beta-blockers never required that “all” perioperative patients receive the agents.

    I am now dealing with pushback on the SCIP glucose control measure because of the NICE-SUGAR trial and the commentaries such as this blog. In reality that trial was not relevant to what we measure in the Surgical Care Improvement Project (SCIP). The only “national” performance measure that focuses on glucose control to my knowledge is the SCIP measure that targets severe hyperglycemia in patients who undergo cardiac surgery. Note that the target glucose for the performance measure is defined as 200 mg/dL or less. This level of glucose control is even more liberal than the control group (target 180 mg/dL or less) in the NICE-SUGAR trial. As pointed out in the NICE-SUGAR article, and in the accompanying editorial, there is ample published evidence that severe hyperglycemia is associated with increased morbidity and mortality in a variety of patient groups. As pointed out in the editorial by Inzucchi and Siegel, “it would seem reasonable to continue our attempts to optimize the management of blood glucose in our hospitalized patients, especially to avert the extremes of hyperglycemia….” Again, beyond a blood sugar of 200 mg/dL or less, there are no other requirements for intensive insulin therapy or glucose control in SCIP.

    Some hospitals undoubtedly go beyond the requirements of the SCIP measure and that could result in harm. It is possible that some institutions made the decision on their own to implement protocols of intensive blood sugar control for their intensive care patients. But on a national basis, surgical outcomes actually are improving over time and there is no national requirement to implement programs of intensive blood sugar control. For now, the SCIP performance measure is not one of those quality measures that “bites the dust.”

  5. Bob Wachter on April 16, 2009 at 4:24 pm

    I appreciate all these comments, particularly the one by Dr. Bratzler, one of the nation’s leaders in the development and vetting of quality measures. Rather than addressing the issue here, I’ve devoted a new post to my thoughts on the issues he raises (in the context of several other recent articles attacking the current state of quality measurement).

  6. Charles on January 26, 2010 at 4:49 pm

    I scratched my head when NICE-SUGAR was released because our unit was targeting 90-130, but unlike the study, which showed hypo rates of 6-7% in the tight group, our unit shows less than 1%. Now, I must admit that we have a Computerized IV System we brought in a year ago, but how does a study that shows 6-7% hypo qualify as a legit trial when most hospitals don’t even come close to that? This study did prove that hypoglycemia is harmful for patients, but keeping patients floating around 200 is harmful as well. It may be time for a trial on results where the patient is kept “in good control” without the hypo, versus NICE-SUGAR, which only proved that hypo increases mortality…duh!
