Changed
Thu, 02/08/2024 - 16:20

When clinicians in a large-scale study viewed a series of digital images that showed skin diseases across skin tones and were asked to make a diagnosis, the accuracy was 38% among dermatologists and 19% among primary care physicians (PCPs). But when decision support from a deep learning system (DLS) was introduced, diagnostic accuracy increased by 33% among dermatologists and 69% among PCPs, results from a multicenter study showed.

However, the researchers found that across all images, diseases in dark skin (Fitzpatrick skin types 5 and 6) were diagnosed less accurately than diseases in light skin (Fitzpatrick skin types 1-4).

“These results contribute to an emerging literature on diagnostic accuracy disparities across patient skin tones and present evidence that the diagnostic accuracy of medical professionals on images of dark skin is lower than on images of light skin,” researchers led by Matthew Groh, PhD, of Northwestern University’s Kellogg School of Management, wrote in their study, published online in Nature Medicine.



For the study, 389 board-certified dermatologists and 450 PCPs in 39 countries were shown 364 images spanning 46 skin diseases and asked to submit up to four differential diagnoses. Nearly 80% of the images showed one of eight diseases: atopic dermatitis, cutaneous T-cell lymphoma (CTCL), dermatomyositis, lichen planus, Lyme disease, pityriasis rosea, pityriasis rubra pilaris, and secondary syphilis.

Dermatologists and PCPs achieved a diagnostic accuracy of 38% and 19%, respectively, but both groups were 4 percentage points less accurate on images of dark skin than on images of light skin. With assistance from DLS decision support, diagnostic accuracy increased by 33% among dermatologists and 69% among PCPs. Among dermatologists, DLS support generally increased diagnostic accuracy evenly across skin tones. Among PCPs, however, DLS support increased diagnostic accuracy more in light skin tones than in dark ones.

In the survey component of the study, when the participants were asked, “Do you feel you received sufficient training for diagnosing skin diseases in patients with skin of color (non-white patients)?” 67% of all PCPs and 33% of all dermatologists responded no. “Furthermore, we have found differences in how often BCDs [board-certified dermatologists] and PCPs refer patients with light and dark skin for biopsy,” the authors wrote. “Specifically, for CTCL (a life-threatening disease), we found that both BCDs and PCPs report that they would refer patients for biopsy significantly more often in light skin than dark skin. Moreover, for the common skin diseases atopic dermatitis and pityriasis rosea, we found that BCDs report they would refer patients for biopsy more often in dark skin than light skin, which creates an unnecessary overburden on patients with dark skin.”

In a press release about the study, Dr. Groh emphasized that he and other scientists who investigate human-computer interaction “have to find a way to incorporate underrepresented demographics in our research. That way we will be ready to accurately implement these models in the real world and build AI systems that serve as tools that are designed to avoid the kind of systematic errors we know humans and machines are prone to. Then you can update curricula, you can change norms in different fields and hopefully everyone gets better.”

Ronald Moy, MD, a dermatologist practicing in Beverly Hills, Calif., who was asked to comment on the work, said that the study contributes insights into physician-AI interaction and highlights the need for further training on diagnosing skin diseases in people with darker skin tones. “The strengths of this study include its large sample size of dermatologists and primary care physicians, use of quality-controlled images across skin tones, and thorough evaluation of diagnostic accuracy with and without AI assistance,” said Dr. Moy, who is a past president of the American Academy of Dermatology, the American Society for Dermatologic Surgery, and the American Board of Facial Cosmetic Surgery.

“The study is limited to diagnosis and skin tone estimation based purely on a single image, which does not fully represent a clinical evaluation,” he added. However, “it does provide important benchmark data on diagnostic accuracy disparities across skin tones, but also demonstrates that while AI assistance can improve overall diagnostic accuracy, it may exacerbate disparities for non-specialists.”

Funding for the study was provided by MIT Media Lab consortium members and the Harold Horowitz Student Research Fund. One of the study authors, P. Murali Doraiswamy, MBBS, disclosed that he has received grants, advisory fees, and/or stock from several biotechnology companies outside the scope of this work and that he is a co-inventor on several patents through Duke University. The remaining authors reported having no disclosures. Dr. Moy reported having no disclosures.

FROM NATURE MEDICINE
