Some in our profession have begun to think hard about our future pay and incentives, especially given the vagueness of the recently passed MACRA legislation.
MACRA replaced the SGR and put in place a model of compensation that in theory balances physician accountability with just rewards for hard work. The two MACRA tracks, the Merit-Based Incentive Payment System (MIPS) and the Alternative Payment Model (APM), are long on promise but short on substance. Read here for more (a brief and outstanding NEJM commentary).
So much needs working out, and first among the tasks is devising an appropriate means of assessing physician achievement. We have no functional framework to build on and no history as a guide. How do you evaluate a provider when few valid instruments to measure performance exist? How do you reward a doctor for value when your metrics do not deliver?
With the above in mind, I jumped on a scorecard recently released by ProPublica. The journalists took on the arduous task of appraising 17,000 U.S. surgeons and ranking them by their adjusted surgical complication rates. The project was broad in scope; the investigators also assembled a knowledgeable panel of consulting providers to assist, and they made their methods public. The publication raised quite a stir, and as you would expect, folks had a lot to say–and some of it not too nice. Docs pushed back.
Patient advocacy groups and spokespeople, of course, responded with a unified message: “You have had a long time to get your house in order. You did not and here is your report card. Now go fix.” Even with the criticism and alleged data flaws, notable physicians also defended the effort:
Because the choice wasn’t between building the perfect report card and building the one they did. The choice was between building their imperfect report card and leaving folks like Bobby with nothing. In that light, the report card looks pretty good. Maybe not for experts, but for Bobby.
I bring up the above for you to consider because parties both inside and outside of medicine have begun to move. Whatever data we have, suboptimal or not, will get aggregated, analyzed, and coiffed, and like it or not, will become the standard against which we measure ourselves and adapt.
But then something caught my attention. RAND just released the following: A Methodological Critique of the ProPublica Surgeon Scorecard. Some might call it a critique, but in my book, it’s close to an evisceration:
The study authors found problems with data definitions, risk adjustment, and statistical methods. A flawed but noble effort, and not one on which we want to base our surgical choices, and certainly not one on which to judge doctors for performance and compensation.
I am glad to see the appraisal, and not out of spite. I welcome the ProPublica effort (in fact, I contribute to ProPublica). However, for those of us who receive ratings, and who know what the measurements mean and how they translate, the RAND publication serves as a valuable reminder. We want it all right now. But in our haste to implement a system to grade docs and consume data, our eyes might be a bit larger than our stomachs. For the moment, we need a more generous serving of caveats, buyer bewares, and speed bumps–and far fewer top ten lists and red, yellow & green flags.
This is nice to see. Of course, those who are broadly critical of our profession won’t care. The goals are indeed laudable, but I strongly doubted the validity of the results when I found myself listed as performing a number of TKA’s and lap choly’s.
The whole quality measurement/performance enterprise is somewhat backwards from where I sit, but I’ll spare you the long harangue. I think most of us are open to constructive criticism, but statistics based on who-knows-what data offer little opportunity to improve one’s practice or examine our weaknesses.
I agree 100% that ProPublica’s Surgeon Scorecard is, in short, kind of a mess. It is, however, at least a way to start a conversation about measuring outcomes in a way that patients, not just number-crunchers in medical administration, can see and understand.
This is why I do what I do, when I’m not covering hospital medicine as a journo – I’m working my fingers to the bone to try to build literacy on the part of what I call gen-pop (the average human) on how to collaborate with their clinical team(s) on their healthcare, and how to understand if they’re hearing science or woo-woo when assessing treatment options.
Together – smart clinicians and savvy e-patients – we just might have a chance at building a better healthcare delivery system. Including working together to improve PP’s scorecard idea.