Diagnostic Errors: Central to Patient Safety, Yet Still In the Periphery of Safety’s Radar Screen

In 2008, I gave the keynote address at the first “Diagnostic Errors in Medicine” conference, sponsored by the Agency for Healthcare Research and Quality (AHRQ). The meeting was filled with people from a wide variety of disciplines, including clinical medicine, education, risk management, cognitive science, and informatics, all passionate about making diagnosis safer. The atmosphere was electric. My lecture was entitled “Why diagnostic errors don’t get any respect” (I wrote up the speech in my blog and in a Health Affairs article).

My talk was, admittedly, a downer. Highlighting the fact that diagnostic errors are arguably the most important patient safety hazard (they accounted for 17% of the adverse events in the famous Harvard Medical Practice Study and are usually the number one cause of harm in malpractice cases), I pointed out that from the very start of the patient safety field, relatively little attention had been paid to them. One tangible manifestation: the term “medication errors” is mentioned 70 times in the Institute of Medicine’s seminal “To Err is Human” report, while the term “diagnostic errors” comes up only twice.

I was pleased to be invited back to give a keynote at this year’s sixth annual conference, which took place in Chicago in mid-September. The landscape has changed significantly since that first meeting. Leaders have emerged: Mark Graber, Hardeep Singh, Pat Croskerry, Gordy Schiff, Eta Berner, David Newman-Toker, and others. Graber and colleagues launched the Society to Improve Diagnosis in Medicine (SIDM), which will soon premiere a new journal, Diagnosis. There have been many publications in both the medical and lay (such as here and here) literature related to topics like heuristics and metacognition, subjects that were previously deemed wonky and arcane. I knew we were making progress when, about two years ago, in our UCSF Department of Medicine M&M conference, one of the residents began discussing a complex patient admitted through the emergency department and said “I’d be worried about this being pulmonary embolism, but I’d also be concerned that I’d be falling into the trap of an anchoring error.” I nearly applauded.

And there’s more progress to celebrate. Several promising papers have described innovations such as using diagnostic trigger tools and patient-reported outcomes to measure the frequency of diagnostic errors. AHRQ has encouraged research in this area, and a search on AHRQ PSNet shows that 471 studies have addressed the question of diagnostic errors, a marked uptick from the early days of the patient safety field. New computer tools, such as IBM’s Watson for Health and Isabel, are getting better; it is no longer a pipe dream to believe that computers will help doctors be better diagnosticians in the next couple of years, and may even replace doctors as diagnosticians, at least in straightforward cases, within a decade. Studies by Singh and others have reminded us that while a Watson may be the sexiest use of IT to improve diagnosis, computers are already helping improve diagnostic accuracy in more mundane ways: by making key information, such as laboratory, x-ray, or pathology results, available to the clinician who needs them at the diagnostic moment of truth.

In other words, the issue of diagnostic errors is beginning to get the attention it deserves. And yet, with all of this progress, I can’t honestly report that my talk was much more optimistic than the one I delivered six years earlier. Yes, diagnostic errors have climbed onto the patient safety radar screen, but they’re out in the periphery, blinking a pale glow compared to the more centrally located shining stars (like checklists and CPOE) that capture everyone’s attention.

In my talk, I traced the timeline of the patient safety field since the IOM report’s publication in 2000, highlighting some of the key policy advances such as residency duty hours limits, the CLABSI and surgical checklist movements, the National Quality Forum’s “Never Events” list, and Medicare’s public reporting of safety-related processes and outcomes and its recent launch of value-based purchasing. I pointed out that virtually none of these policy initiatives – which have finally created a business case for safety, at least in hospitals – have focused on diagnostic errors. For example, none of the 29 serious preventable events on the NQF list – events that must now be reported to the majority of U.S. states – relate to diagnostic errors. Similarly, none of the publicly reported measures on Medicare’s Hospital Compare website, nor any of the components of value-based purchasing, relate to diagnostic accuracy. Here’s how I ended my 2010 Health Affairs article:

As one vivid example of how far we need to go, a hospital today could meet the standards of a high-quality organization and be rewarded through public reporting and pay-for-performance initiatives for giving all of its patients diagnosed with heart failure, pneumonia, and heart attack the correct, evidence-based, and prompt care – even if every one of the diagnoses was wrong.

Sadly, this statement remains true today.

This might not make me feel so badly if I were a proceduralist. But as a general internist and hospitalist, most of what I do for a living is to try to diagnose patients correctly. The healthcare world has only so much time, money, and attention. To the degree that the safety and quality fields turn their back on diagnostic accuracy, so too will healthcare system leaders, deans and program directors, and practicing physicians.

Of course, one of the main problems remains the absence of a feasible, credible measure of diagnostic accuracy – something that could go toe to toe with measures such as rates of readmissions, central line infections, hand hygiene, or pressure ulcers. During an early morning brainstorming session in Chicago with many of the field’s leaders, I sensed a passionate, nearly frenetic, interest in trying to find even a single plausible measure of diagnostic expertise that could be pitched to the National Quality Forum for endorsement and Medicare for public reporting and payment policy. Among the ideas floated: documenting whether a differential diagnosis was recorded, whether patients’ admitting and discharge diagnoses were different, or asking patients whether they had been victims of a diagnostic error on their post-hospital or -clinic survey.

While I understand the desperation, I counseled, both during that morning session and in my keynote speech, that placing a bad diagnosis measure in the public reporting and pay-for-performance worlds would be worse than having no measure at all. While the desire to be on the Centers for Medicare & Medicaid Services’ (CMS) radar screen is understandable, until diagnostic errors have a credible structure, process, or outcome measure, I believe that Medicare should not be the first place to look – it should be the last.

There may well come a day when a tool such as Isabel has been proven sufficiently beneficial that having it as a structural proxy for diagnostic accuracy (or at least for the commitment to improve diagnosis) would be a good idea. Similarly, we may ultimately find that certain triggers (perhaps a change in admission to discharge, or preop to postoperative, diagnosis) are useful measures of diagnostic accuracy. Or that other triggers, such as readmissions or deaths in patients with low predicted mortality, can lead to chart reviews that reveal diagnostic errors.
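To make the trigger idea concrete, here is a minimal sketch of how such a screen might work, flagging charts for manual review rather than scoring anyone. Everything here is illustrative: the record fields, the ICD-category comparison, and the 2% predicted-mortality threshold are hypothetical simplifications, not a validated tool or any real institution’s method.

```python
# Illustrative diagnostic-error trigger screen (hypothetical fields and thresholds).
# Flags encounters for chart review when (a) the admitting and discharge
# diagnoses fall into different ICD categories, (b) a patient with very low
# predicted mortality died, or (c) the patient was readmitted within 30 days.

def icd_category(code: str) -> str:
    """Compare ICD codes at the 3-character category level (e.g. 'I50.9' -> 'I50')."""
    return code.replace(".", "")[:3].upper()

def trigger_flags(encounter: dict) -> list:
    """Return a list of reasons this encounter should get a manual chart review."""
    flags = []
    if icd_category(encounter["admit_dx"]) != icd_category(encounter["discharge_dx"]):
        flags.append("diagnosis changed during stay")
    if encounter["died"] and encounter["predicted_mortality"] < 0.02:
        flags.append("death despite low predicted mortality")
    if encounter.get("readmitted_within_30d"):
        flags.append("30-day readmission")
    return flags

if __name__ == "__main__":
    encounters = [
        # Admitted as heart failure, discharged as pulmonary embolism, then readmitted.
        {"admit_dx": "I50.9", "discharge_dx": "I26.99",
         "died": False, "predicted_mortality": 0.01, "readmitted_within_30d": True},
        # Pneumonia subtype refined within the same ICD category: no trigger fires.
        {"admit_dx": "J18.9", "discharge_dx": "J18.1",
         "died": False, "predicted_mortality": 0.03},
    ]
    for e in encounters:
        flags = trigger_flags(e)
        if flags:
            print("review chart:", ", ".join(flags))
```

Note that a trigger firing is not itself evidence of a diagnostic error; it only selects charts whose review might reveal one, which is exactly why such measures belong in learning systems before they belong anywhere near public reporting.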

But until that day arrives, I would be looking to other organizations to promote the diagnosis agenda. Obviously I’m biased here (as last year’s ABIM chair), but it seems to me that the soon-to-be-launched program of continuous Maintenance of Certification (MOC) holds great promise as a way of measuring whether physicians are keeping up with the literature and are capable of the analytic work needed to diagnose patients correctly; MOC also has the advantage of being specialty-specific. In addition, accrediting organizations such as the Accreditation Council for Graduate Medical Education (ACGME) or the Joint Commission (TJC) could build into their assessments a review of whether hospitals and training programs are putting sufficient energy into the problem of diagnostic errors. For example, what if TJC required hospitals, and ACGME required residency programs, to prove that clinicians receive feedback regarding their patients who were later found to have different diagnoses? (Very few programs have systematized this, so clinicians who misdiagnose patients who end up returning to the hospital or ED often never hear about it.) Or that physicians participate in discussions of diagnostic errors at M&Ms or another appropriate forum? Or that healthcare organizations demonstrate that their information technology systems include modalities to support diagnosis (perhaps electronic textbooks like UpToDate or AccessMedicine, or decision support tools like Isabel)? Of course, in the absence of a hard endpoint, there is a possibility that such measures will be applied arbitrarily by accreditors or be “gamed” by clinicians or leaders. But I think that risk is outweighed by the benefit of pushing institutions to focus on diagnosis and to innovate on both measures and solutions.

Of course, we need better measures of diagnostic accuracy, and evidence-based interventions proven to help us reach the right diagnoses. To have any hope of cracking these nuts, we need far more research, and a more secure research funding stream. But until we have these things, let’s focus on the leverage we do have – through hospital and training program accreditors, and the MOC process.

Three years ago, I wrote a New England Journal article with TJC president Mark Chassin and other TJC staffers in which we called for a high bar for “accountability measures,” ones used in public reporting and P4P. Let’s not let our passion for promoting accurate diagnosis, or our impatience, cause us to lower that bar. Doing so would be a short-term win but a long-term loss.

14 Responses to “Diagnostic Errors: Central to Patient Safety, Yet Still In the Periphery of Safety’s Radar Screen”

  1. wrs October 9, 2013 at 2:53 pm #

    You can either have excuses or get results–but you can’t have both.

    Several years ago, our integrated system recognized the limitations of the NQF’s Serious Reportable Events and The Joint Commission’s Sentinel Events. Some of our leaders added a few additional events, one of which is: “Death or Serious Disability After a Missed Diagnosis”. This is a Board level report that triggers a full event response: immediate actions including a full Root Cause Analysis, identification of causative factors, action plans, and sharing across the system.

    Next week, I will be presenting our monthly events to the Board. It’s a solemn and somber experience. This month, half of our new events are diagnostic errors.

    We certainly don’t have all the answers when it comes to diagnostic error, nor do we have all of the solutions. We have changed the culture by identifying Diagnostic Error as a system failure until proven otherwise. In doing so, we’ve provided visibility to a closeted issue, begun the long journey of learning, and most important, started to heal our patients and staff for many events that seem frankly incomprehensible.

    You can wait around for Superman or Godot or you can control your own destiny. I strongly advocate the creation of a structure within your own healthcare systems to learn, share, prevent harm, and heal.

  2. Bob Wachter October 9, 2013 at 3:05 pm #

    I think the above comment is incredibly thoughtful and important — and represents the kind of bottom up innovation that we need in this area. It will be from just this kind of work that appropriate measures and solutions are ultimately identified. Thanks.

    – Bob

  3. Eric Thomas October 9, 2013 at 4:18 pm #

    Bob,

    Very nice overview of the state of diagnostic error research and policy. I especially commend your concluding paragraph and suggest that everyone read – and re-read – the NEJM paper on accountability measures. It will be a waste of time and harmful to aim for anything less for diagnostic errors. What to do in the meantime? More research, and healthcare organizations need to work on cultural change to facilitate learning and improvement led by frontline providers.

    Best,

    Eric Thomas

  4. Stephen Borstelmann October 9, 2013 at 9:06 pm #

    This is a difficult subject, which you have handled well.

    As we are in our infancy in quality measurement, substituting proxies for well-designed quality measures, we are probably even further behind when it comes to measuring diagnostic accuracy.

    Perhaps the largest barrier to diagnostic accuracy is the poor inter-rater agreement (low kappa values) between individual practitioners, reflecting biases from training, regionality, and individual practice experience.

    But equally vexing (which you address) is the learning opportunity lost to incomplete follow-up: when a patient presents to another physician/clinic/hospital/ER for the same problem they were treated for elsewhere, the original physician will rarely have the opportunity to learn from the mistake, outside of a well-designed regional EHR with proactive triggers or a chance encounter with a colleague.

    I do agree with you that no measurement is better than a bad measurement.

  5. Byron Berry October 9, 2013 at 10:27 pm #

    In-depth and well-addressed. I think that you hit it–mentioning the importance of correct diagnoses to…EVERYONE. If our doctors don’t accept the responsibility of admitting mistakes or incorrect diagnosis to a patient, we won’t move forward in finding out useful tips for diagnosis in new patients…and the cycle will only correct itself after a few brave doctors admit having made a mistake based on inaccurate, but widely-assumed information. I just read a blog post on 33charts.com (I believe) that addressed the importance of “translating” the best descriptions so that all doctors could pool information in one location and pull it when necessary–for clarification purposes, of course. Great article, Bob!

  6. Menoalittle October 10, 2013 at 3:15 am #

    Bob,

    Superb accounting of the history and current significance of diagnostic accuracy. You state that HIT may improve diagnostic accuracy, but with clinical decision support and electronic ordering come new errors and patient neglect. There has been an uptick in unexpected deaths from HIT-run care, but since there is no aftermarket surveillance in place, and no one wants to admit the dangers of these devices, this goes unnoticed.

    How is it possible, if not for the distraction of health care professionals hell-bent on charting in the grids and entering orders on clunky EHRs, for a patient to disappear unnoticed?

    http://www.sfgate.com/bayarea/article/Body-in-SF-General-stairwell-IDd-as-missing-4881978.php

    This happened at UPMC in Pittsburgh when a woman wandered to the roof and froze to death in December 2008, shortly after they went live with user-unfriendly CPOE machines:

    http://www.questia.com/library/1P2-19587305/upmc-montefiore-patient-found-dead-on-roof

    I do not want to sound facetious but is it not shocking that with all of the $ millions you spent on HIT, the diagnosis that a patient was missing was missed?

    Best regards,

    Menoalittle

  7. Nancy, RN October 10, 2013 at 1:00 pm #

    Working as a clinical adjunct, I see diagnostic errors come from data sliding through the “proverbial cracks”.

    The chain of diagnoses emanating from a low transferrin saturation or proteinuria is large, yet clinicians are missing these data points.

    Sad to say, but I find the monotony of the presentation of data on EHRs to be most responsible for data being missed, along with the failure of these systems to notify users, with a beep or something, that new data has arrived on the device.

    The links above are particularly disconcerting. They show a gross failure of the hospitals’ infrastructure and what happens when there is 100% reliance on electronic systems of care, communication, and surveillance that cannot be trusted.

    Good luck with the investigations.

  8. Arvind Cavale October 10, 2013 at 4:49 pm #

    To Err is Human… is an age-old saying, not without merit. However, it should be every clinician’s goal to minimize error in practice. It is also important to realize that the cause of error is different for different people. So, to say that a systemic change will somehow magically erase error is simplistic and erroneous.

    Similarly, connecting MOC or computer-based decision support to reducing errors is also a pie-in-the-sky approach. The best remedy for a clinician to reduce error is the availability of adequate time to assess a patient and get to know the individual. Both of these have been eroded as a result of being forced to see more patients in less time, and having to pay attention to extraneous activity such as checking boxes in an EMR chart in order to satisfy “Meaningful Use” and other criteria. Similarly, the advent of consolidation, resulting in employed physicians, has disrupted the one-to-one patient-physician connection. It is no use writing blogs without addressing these fundamental issues. No amount of test-taking will remedy these deficiencies.

    • Jason Maude October 11, 2013 at 9:35 am #

      Lack of time may be a contributory factor and that is an interesting and important research topic. However, a recent excellent study by Hardeep Singh and others in JAMA looked at physicians’ diagnostic accuracy and confidence and did not find a lack of time to be an issue: http://archinte.jamanetwork.com/article.aspx?articleid=1731967

      The results were rather shocking, with an average accuracy rate of just 31%: 55% for the easy cases and just 6% for the hard ones. However, lack of time was NOT a factor; the physicians did not request additional resources because they were reasonably confident (6.4/10 for the hard cases) that they were correct.

      Computer based decision support tools are not ‘pie in the sky’ as they are already being used by many high profile institutions and thousands of physicians. Even master diagnosticians such as Gurpreet Dhaliwal use them as a second check as revealed in the New York Times http://www.nytimes.com/2012/12/04/health/quest-to-eliminate-diagnostic-lapses.html?_r=0

      Rather, these decision support tools buy you time to work up a comprehensive differential diagnosis in a matter of seconds or minutes.

  9. Zebra Doc October 12, 2013 at 1:08 pm #

    These CDS tools tend to run up the bill by directing docs to order tests that are superfluous. As they say, do not look for zebras when you see and hear a herd of horses.

    Diagnostic expertise requires a vivid imagination and creativity. I find that both of these traits are compromised by the rigidity and control of the narrative by these new electronic care record and ordering systems that doctors are forced to use.

    I find it disingenuous for the author of this blog to push for interventions in work flow when he does not use them. For instance, BOB, when did you last click click click click and enter all of the orders for a case of multi organ failure due to sepsis?

    If you have not done this, your perspective is skewed.

    • Jason Maude October 14, 2013 at 5:37 pm #

      I am not sure which CDS tools Zebra Doc has used but these comments certainly do not relate to the Isabel diagnosis decision support tool which is mentioned in the blog and of which I am the founder.

      How these tools are introduced is clearly important. If users are told to order tests for each diagnosis suggested in a list of possibilities then they will order more tests.
      However, Isabel acts as a checklist and helps quickly focus the clinician’s thinking BEFORE ordering tests so that tests can be ordered only for what is strongly suspected first.

      In the 12 years that Isabel has been available no institution has reported back to us that test ordering has increased as a result and certainly no institution has stopped using it for this reason. Early validation studies did look at this as a potential unintended consequence but only showed a very small and appropriate increase in tests ordered.

      Diagnostic decision support tools such as Isabel are specifically designed to stimulate thinking and not force clinicians down a certain path too early. Reliable and repeatable diagnostic expertise depends on the clinician being able to draw on a deep knowledge base in order to build a good differential. Tools like Isabel help all clinicians do this quickly rather than just those master diagnosticians with an encyclopaedic memory.

  10. Doron Weber November 17, 2013 at 8:15 pm #

    This is one of the best articles I’ve read from a physician on the problem of diagnostic medical errors, one of which claimed my 16-year-old son’s life, a story I recount in Immortal Bird: A Family Memoir. While my book is being used as a study guide by physicians who wish to avoid similar tragedies (http://www.psychologytoday.com/blog/the-guest-room/201305/can-physicians-learn-their-mistakes-and-self-correct) and while many of the problems that need to be addressed are systemic (http://immortalbirdpostscript.wordpress.com), the most difficult part for doctors is to admit when one of their own is sub-standard and repeatedly fails in his/her professional duty. This is not the same as a one-time slip-up, which can happen to anyone. Until the medical profession is ready to identify and disqualify physicians who lack the requisite competence – to track performance, assess outcomes, and apportion blame where blame is warranted – yes, to throw out the bad apples – it will never effectively deal with the growing epidemic of preventable medical errors. This is one more reason why we need to hear more from the patient voice, and not just from the physicians, no matter how enlightened the physicians are – and this is a case of near-ideal physician enlightenment, so kudos to Dr. Wachter!

  11. David Shin December 14, 2013 at 8:44 am #

    There are 2 main factors at play here, gorillas in our midst that we refuse to acknowledge.

    One is Financial Interest and Professional Jealousy. The other is the dependence on parameters that are NOT pertinent to EVERYDAY MEDICINE, but only to MORBIDITY AND MORTALITY, which ARE CONSEQUENCES OF BAD MEDICINE, not good medicine. A post-mortem, as it were, instead of Preventative Measures.

    Due to time constraints, an adequate history is NOT taken. The information is often TOTALLY or PARTIALLY culled from the first person encountering the patient, the MA or LVN or RN, and sometimes from Old Medical Records.

    I have seen Totally wrong Diagnoses and even wrong Systems being treated. IF YOU MAKE THE WRONG DIAGNOSIS, YOU TREAT THE WRONG CONDITION AND GET ADVERSE OUTCOMES, immediately or later. Readmissions, and deaths of patients leaving the Parking Lot after premature discharge, are more frequent than acceptable.

    The emphasis on Patient outcomes, especially by Medicare, HEDIS, and the Joint Commission, is on ICD Codes and Protocols. TOTALLY IRRELEVANT AND MISLEADING if the Diagnosis itself is wrong.

    As in Residency days, what’s wrong with grand rounds by Physicians unrelated to the case, presented by the Primary Care giver? The main constraint is Professional Jealousy or Malice, a fear that a competing Specialist might take that opportunity to deliberately show up a colleague. Cost could be easily overcome.

    Teaching Hospitals have Professors that are already paid by Medicare subsidies. Discharge Planning Nurses accompanied by a Physician Chairman are already reviewing the charts.

    Have a formal system of Grand Rounds by Tertiary Facility Staff at Tertiary and Primary facilities, even across Cities or different towns, to Review Charts and confirm the Initial Diagnoses.

    Why bury our mistakes or discover them at autopsies?

    Confirm CORRECT DIAGNOSES ON ADMISSION OR FIRST ASSESSMENT, not Post Discharge or Post Mortem, which is how it is practiced now.

  12. David Shin December 14, 2013 at 9:12 am #

    EMRs and ICD Codes themselves compound the problem. In a bid to stratify reimbursement, emphasis is placed on Diagnoses and NOT Symptoms, which is really what the patient comes in for. Shortness of breath could be Cardiac, Respiratory, or Psychogenic. A choice of CHF from a list of prepopulated Diagnoses and Codes to the 5th digit is forced on the Clinician, who cannot find an appropriate code, a step that was unnecessary in the past, when a worded diagnosis in free form could have been chosen. But that code would be totally wrong if a borderline BNP was found and the patient’s problems were a Pulmonary Embolus or Respiratory in origin.

    CHF Protocols, diuretics, and digitalis are given, and the patient is started on an ACEI as defined by Medicare CHF Protocols. Looks great on Paper and in Outcomes Data, but the patient either recovered spontaneously from an Asthmatic Attack or, worse, collapses at home after discharge from a PE.

    Patient-centered Medicine turned into Lab Medicine and now into Protocol and Code Medicine. Computer Professionals are fully aware of GIGO: Garbage In, Garbage Out. A beautiful program that fails its purpose, because the initial data was garbage.
