GI Docs Will Need to Forge a ‘Human-Computer Cooperative’

Changed

Sun, 10/13/2024 - 22:52

Several artificial intelligence (AI) technologies are emerging that will change the management of gastrointestinal (GI) diseases sooner rather than later. One of the leading researchers working toward that AI-driven future is Ryan W. Stidham, MD, MS, AGAF, associate professor of gastroenterology and computational medicine and bioinformatics at the University of Michigan, Ann Arbor.

Stidham’s work focuses on leveraging AI to develop automated systems that better quantify disease activity and aid gastroenterologists in their decision-making. He also serves as a member of AGA’s AI Task Force. He spoke with this news organization about his efforts to shape AI into a tool with practical applications in gastroenterology, what the technology may do to improve physician efficiency, and why gastroenterologists shouldn’t be worried about being replaced by machines any time soon.

How did you first become involved in studying AI applications for GI conditions?

My medical training coincided with the emergence of electronic health records (EHRs), which made enormous amounts of data, ranging from laboratory results to diagnostic codes and billing records, readily accessible.

I quickly contracted data analytics fever, but a major problem became apparent: EHRs and medical claims data alone only weakly describe a patient. Researchers in the field were excited to use machine learning for personalizing treatment decisions for GI conditions, including inflammatory bowel disease (IBD). But no matter how large the dataset, the EHRs lacked the most rudimentary descriptions: What was the patient’s IBD phenotype? Where exactly was the disease located?

I could see machine learning had the potential to learn and reproduce expert decision-making. Unfortunately, we were fueling this machine-learning rocket ship with crude data unlikely to take us very far. Gastroenterologists rely on data in progress notes, emails, interpretations of colonoscopies, and radiologists’ and pathologists’ reviews of imaging to make treatment decisions, but that information is not well organized in any dataset.

I wanted to use AI to retrieve that key information in text, images, and video that we use every day for IBD care, automatically interpreting the data like a seasoned gastroenterologist. Generating higher-quality data describing patients could take our AI models from interesting research to useful and reliable tools in clinical care.

How did your early research go about trying to solve that problem?

My GI career began as the IBD field was shifting from relying on symptoms alone to objective biomarkers for IBD assessment, particularly standardized scoring of endoscopic mucosal inflammation. However, these scores were hampered by interobserver variability, prompting the need for centralized reading. More importantly, these scores are qualitative and do not capture all the visual findings an experienced physician appreciates when assessing severity, phenotype, and therapeutic effect. As a result, even experts could disagree on the degree of endoscopic severity, and patients with obvious differences in the appearance of mucosa could have the same endoscopic score.
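As an editorial illustration of what "interobserver variability" means in numbers, the sketch below computes Cohen's kappa, one common chance-corrected agreement statistic for two readers. The interview does not say which statistic was used in practice, and the function name, reader names, and the eight scores on a 0 to 3 scale are all hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Fraction of items on which the two raters gave the same score.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's score frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1.0 - expected)


# Hypothetical endoscopic severity scores (0-3) from two readers
# for the same eight procedures; not data from the interview.
reader_1 = [0, 1, 1, 2, 2, 3, 3, 0]
reader_2 = [0, 1, 2, 2, 3, 3, 2, 0]
kappa = cohens_kappa(reader_1, reader_2)
```

In this made-up example the readers agree on five of eight procedures, yet kappa is only 0.5 once chance agreement is subtracted, which is the kind of gap that motivates centralized reading.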

I asked myself: Are we really using these measures to make treatment decisions and determine the effectiveness of investigational therapies? I thought we could do better and aimed to improve endoscopic IBD assessments using then-emerging digital image analysis techniques.

Convolutional neural network (CNN) modeling was just becoming feasible as computing performance increased. CNNs are well suited for complex medical image interpretation, using an associated “label,” such as the presence or grade of disease, to decipher the complex set of image feature patterns characterizing an expert’s determination of disease severity.
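For readers unfamiliar with the term, the following is a minimal sketch of the sliding-filter operation that gives a CNN its name, written in plain Python. It is not the model described in this interview: the tiny 4x4 "image" and the hand-set edge filter are hypothetical, whereas a real CNN learns many such filters from labeled images during training.

```python
def conv2d_valid(image, kernel):
    """Slide a kernel over an image (no padding) and sum elementwise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0.0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


def relu(feature_map):
    """Keep positive responses and zero out the rest."""
    return [[max(0.0, v) for v in row] for row in feature_map]


# Hypothetical 4x4 grayscale patch: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-set filter that responds to a dark-to-bright vertical edge.
kernel = [
    [-1, 1],
    [-1, 1],
]
feature_map = relu(conv2d_valid(image, kernel))
```

The resulting feature map is nonzero only in the middle column, where the edge sits. Stacking many learned filters of this kind, and training them against expert labels, is what lets a CNN associate image patterns with a severity grade.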

How did you convert the promise of CNN into tangible results?

The plan was simple: Collect endoscopic images from patients with IBD, find some experts to grade IBD severity on the images, and train a CNN model using the images and expert labels.

In 2016, developing a CNN wasn’t easy. There was no database of endoscopic images or simple methods for image labeling. The CNN needed tens of thousands of images. How were we to collect enough images with a broad range of IBD severity? I also reached some technical limits and needed help solving computational challenges.

Designing our first IBD endoscopic CNN took years of reading, coursework, additional training, and a new host of collaborators.

Failure was frequent, and my colleagues and I spent a lot of nights and weekends looking at thousands of individual endoscopic images. But we eventually had a working model for grading endoscopic severity, and its performance exceeded our expectations.

To our surprise, the CNN model grading of ulcerative colitis severity almost perfectly matched the opinion of IBD experts. We introduced the proof of concept that AI could automate complex disease measurement for IBD.

What took us 3 years in 2016 would take about 3 weeks today.

You have said that AI could help reduce the substantial administrative burdens in medicine today. What might an AI-assisted future look like for time-strapped gastroenterologists?

We will be spending more time on complex decision-making and developing treatment plans, with less time needed to hunt for information in the chart and handle administrative tasks.

The practical applications of AI will chip away at tedious mechanical tasks, soon to be done by machines, reclaiming time for gastroenterologists.

For example, automated documentation is almost usable, and audio recordings in the clinic could be leveraged to generate office notes.

Computer vision analysis of endoscopic video is generating draft procedural notes and letters to patients in a shared language, as well as recommending surveillance intervals based on the findings.

Text processing is already being used to automate billing and manage health maintenance like vaccinations, laboratory screening, and therapeutic drug monitoring.

Unfortunately, I don’t think that AI will immediately help with burnout. These near-term AI administrative assistant advantages, however, will help us manage the increasing patient load, address physician shortages, and potentially improve access to care in underserved areas.

Were there any surprises in your work?

I must admit, I was certain AI would put us gastroenterologists to shame. Over time, I have reversed that view.

AI really struggles to understand the holistic patient context when interpreting disease and predicting what to do for an individual patient. Humans anticipate gaps in data and customize the weighting of information when making decisions for individuals. An experienced gastroenterologist can incorporate risks, harms, and costs in ways AI is several generations from achieving.

With certainty, AI will outperform gastroenterologists for tedious and repetitive tasks, and we should gladly expect AI to assume those responsibilities. However, many unknowns remain in the daily management of GI conditions. We will continue to rely on the clinical experience, creativity, and improvisation of gastroenterologists for years to come.

Has there been a turning-point moment when it felt like this technology moved from being more theoretical to something with real-world clinical applications?

Last spring, I saw a lecture by Peter Lee, who is president of Microsoft Research and a leader in developing AI-powered applications in medicine and scientific research, demonstrating how a large language model (LLM) could “understand” medical text and generate responses to questions. My jaw dropped.

We watched an LLM answer American Board of Internal Medicine questions with perfect explanations and rationale. He demonstrated how an audio recording of a clinic visit could be used to automatically generate a SOAP (subjective, objective, assessment, and plan) note. It was better than anything I would have drafted. He also showed how the LLM could directly ingest EHR data, without any modification, and provide a great diagnosis and treatment plan. Finally, LLM chatbots could carry on an interactive conversation with a patient that would be difficult to distinguish from a conversation with a human physician.

The inevitability of AI-powered transformations in gastroenterology care became apparent.

Documentation, billing, and administrative work will be handled by AI. AI will collect and organize information for me. Chart reviews and even telephone/email checkups on patients will be a thing of the past. AI chatbots will be able to discuss an individual patient’s condition and test results. Our GI-AI assistants will proactively collect information from patients after hospitalization or react to a change in labs.

AI will soon be an amazing diagnostician and will know more than me. So do we need to polish our resumes for new careers? No, but we will need to adapt to changes, which I believe on the whole will be better for gastroenterologists and patients.

What does adaptation look like for gastroenterologists over the next handful of years?

As with any other tool, gastroenterologists will be figuring out how to use AI prediction models, chatbots, and imaging analytics. Value, ease of use, and information gain will drive which AI tools are ultimately adopted.

Memory, information recall, calculations, and repetitive tasks in which gastroenterologists occasionally err or which they find tiresome will become the job of machines. We will still be the magicians, now aided by machines, applying our human strengths of contextual awareness, judgment, and creativity to find customized solutions for more patients.

That, I think, is the future that we are reliably moving toward over the next decade — a human-computer cooperative throughout gastroenterology (including IBD) and, frankly, all of medicine.

A version of this article appeared on Medscape.com.

Publications

GI and Hepatology News

Internal Medicine News

MDedge Internal Medicine

Topics

IBD & Intestinal Disorders

Gastroenterology

Sections

Feature

Article Type

News
