On a recent visit, my doctor used an app on his phone that documented our conversation in real time in a way that mimicked the appearance of text messaging. At the close of our conversation, the app immediately generated a summary of the clinical encounter. It was an example of a type of artificial intelligence (AI) that generates new content, aptly called generative AI.
Generative AI isn’t new. Early iterations emerged in the 1960s with chatbots, but its capability took off in 2014 with the introduction of a machine learning architecture called the generative adversarial network (GAN), which allowed for the generation of content (text, sound, video, images) that was convincingly human. Large language models (LLMs) have since shown just how astonishing the capabilities of generative AI are, advancing the technology to its most recent iterations, ChatGPT, Dall-E, and others.
This article provides a brief foundational understanding of generative AI and describes examples of its current and potential uses in healthcare, as well as the experience of some otolaryngologists in their early adoption of generative AI programs designed to help ease the burdensome task of documentation. The next article in the series will look more closely at the latest advancements in generative AI—ChatGPT, Dall-E, and other LLMs—that are rapidly being deployed by healthcare professionals, patients, and consumers in a number of ways.
Generative AI in Brief
Generative AI is a type of machine learning that creates novel content, such as text, music, images, video, and computer code. What makes generative AI different from traditional AI is that it predicts the next item in a sequence of content based on the relationships among items it has observed in past data, explained Thomas Davenport, PhD, the President’s Distinguished Professor of Information Technology at Babson College in Babson Park, Mass., a recognized thought leader who has written extensively on data analytics, big data, and AI.
“What this means in practice is that it makes recommendations, for example, for the next word in a sentence,” he said. For instance, he said, if the input is “Jack jumped over the [blank]”, generative AI would likely recommend “candlestick” as the next word based on what it has learned from content on the internet.
Like other forms of machine learning, Dr. Davenport said, generative AI makes predictions. But instead of making predictions on “structured” data, or data that are typically made up of rows and columns of numbers, generative AI makes predictions on “unstructured” data based on learning what the next most likely item in a sequence will be. In other words, it tries to generate unstructured content, he explained. (See the sidebar “Structured and Unstructured Data in Healthcare” below.)
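For readers who want to see this prediction step concretely, the short sketch below uses the freely available GPT-2 model through the open-source Hugging Face transformers library to list the words the model considers most likely to follow a prompt. The model, library, and prompt are illustrative choices, not tools used by any of the clinicians interviewed here.

```python
# A minimal sketch of next-word prediction, the mechanism Dr. Davenport
# describes. Assumes the open-source `transformers` and `torch` packages;
# GPT-2 is an illustrative stand-in for larger commercial models.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "Jack jumped over the"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    # The model scores every vocabulary token at every position.
    logits = model(**inputs).logits

# The scores at the last position form a distribution over the next word.
next_token_scores = logits[0, -1]
top = torch.topk(next_token_scores, k=5)

for token_id, score in zip(top.indices, top.values):
    print(f"{tokenizer.decode(int(token_id))!r}: {float(score):.2f}")
```

Generation, in this sense, is nothing more than repeatedly sampling from that distribution and appending the chosen word to the sequence.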
Dr. Davenport called generative AI a game changer that comes closer to human intelligence than any AI before it. “That’s the reason for the excitement,” he said.
Generative AI in Healthcare and Otolaryngology
The application of generative AI in healthcare is largely in its infancy. “There are a lot of exciting things happening in the research laboratory, but not much at the clinical bedside yet,” said Dr. Davenport. He cited a variety of barriers to full implementation, ranging from concerns over accuracy to lack of transparency to the technology’s poor fit with clinical workflow.
One application in which it is making inroads is easing the daily drudge of documentation, as described in this article’s opening scenario. Gregory Ator, MD, an associate professor of otolaryngology–head and neck surgery at the University of Kansas Medical Center in Kansas City who focuses on improving the clinician experience with complex technology, is among 250 physicians at KU Medical Center who currently use a generative AI program called Abridge (www.abridge.com/ai) to document clinical visits and generate clinical notes.
Survey results from July 2023 (the program was implemented at KU Medical Center in April 2023) showed that 70% of users strongly agreed that the program decreased the stress of documentation, and 70% strongly agreed that it generated a faster clinical note. Asked about the accuracy of the clinical note, users gave an overall score of 3.84 on a scale of 1–5, with 5 indicating a near-perfect note requiring little or no editing.
Dr. Ator was quick to point out that use of the Abridge program requires clinicians to review each clinical note in real time (preferably after each patient) to ensure accuracy and completeness. “It is absolutely necessary to review summaries, and the editing experience is much better if done in real time,” he said, explaining that delaying such reviews, say until the end of the day, makes it more difficult to correct inaccuracies in each patient’s summary because of the volume of patients seen and the similarity of many clinical presentations. Clinicians are also encouraged to augment the program with additional tools as needed, such as templates, smart phrases, or speech-to-text dictation software like Dragon, he said.
The program, said Dr. Ator, is associated with a 21% reduction in the time it takes to create notes at KU Medical Center. “We now spend only about 10 minutes documenting per note, so that 20% reduction with this program is a big deal,” he said, emphasizing the benefit to doctors who can “die by a thousand clicks” from the amount of documentation required for clinical care. “We don’t want to click ourselves into oblivion,” he said. “This technology really reduces some low-level clicks and allows us to focus on the good stuff.”
Alfred-Marc Iloreta, Jr., MD, is an assistant professor in artificial intelligence and emerging technologies in the Graduate School of Biomedical Sciences at the Icahn School of Medicine at Mount Sinai Hospital in New York City, where he is also an assistant professor of otolaryngology–head and neck surgery and neurosurgery and co-directs the endoscopic skull base program. He has been using ambient dictation for the past six to eight months and said he hadn’t realized how powerful and useful it could become.
Ambient dictation, said Dr. Iloreta, is a great use of clinical intelligence—an emerging field in healthcare that uses data analysis to improve care delivery. Deployed for commercial use, ambient clinical intelligence is a tool that, with minimal AI model training, can reliably document a free-flowing conversation between physicians and patients and their families. “It’s a great example of how we merge and leverage two well-developed technologies of speech recognition and natural language processing [NLP],” he said.
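As a rough illustration of the merger Dr. Iloreta describes, the sketch below chains an open-source speech recognition model with an open-source summarization model to turn a recorded encounter into a draft note. The models and file name are stand-ins for illustration only, not the commercial ambient clinical intelligence product used at Mount Sinai.

```python
# A toy sketch of an ambient-documentation pipeline: speech recognition
# followed by natural language processing. The models and file name are
# open-source stand-ins, not the commercial product described above.
import whisper                     # openai-whisper package
from transformers import pipeline  # Hugging Face transformers package

# Stage 1: speech recognition turns the recorded encounter into text.
asr = whisper.load_model("base")
transcript = asr.transcribe("clinic_visit.wav")["text"]  # hypothetical file

# Stage 2: NLP condenses the transcript into a draft note. (A real
# transcript would need chunking to fit the model's input limit.)
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
draft = summarizer(transcript, max_length=200, min_length=60)[0]["summary_text"]

print(draft)  # a draft note, to be reviewed and edited by the clinician
```

The clinician review that Dr. Ator stresses still applies: the output of the second stage is a draft, not a finished note.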
“I have personally found it incredibly helpful in getting documentation completed in a timely manner and, even more importantly, I find that I’m much more present during the patient visit since I can divert my short-term or working memory to the actual patient interaction instead of memorizing details,” Dr. Iloreta added. “Outsourcing those cognitive processes to AI unloads a significant cognitive burden and can enhance efficiency and decrease burnout.”
Dr. Davenport also cited this application as one of the main uses of generative AI in healthcare that he hears people talking about. He cautioned, however, that generative AI makes mistakes and can simply make things up (“hallucinations,” in AI terminology).
Other applications of generative AI on the horizon, said Dr. Davenport, include telemedicine. “Generative AI could generate more intelligent telemedicine, where it could review a patient’s electronic health record, hear a patient describe his/her symptoms, and then recommend an office visit if appropriate,” he said. “It [generative AI] has the language capabilities; it just needs a bit more knowledge.”
Dr. Davenport noted that, because of a shortage of doctors, China is already using an intelligent telemedicine app called Ping An Good Doctor for over 400 million patients to make initial diagnoses and to provide triage and treatment recommendations.
Education is another area in which generative AI may play a beneficial role. Patrick Scheffler, MD, a pediatric otolaryngologist at Phoenix Children’s Hospital, cited a study (PLOS Digit Health. 2023. doi.org/10.1371/journal.pdig.0000202) describing the use of generative AI to create otoscopy images (both healthy and unhealthy) for use in educating medical students for board exams. A panel of otolaryngologists validated the program as doing a good job of generating images for training purposes; the potential benefit of this type of program, Dr. Scheffler said, is to help ensure that medical students are offered new images not seen on past exams.
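For the technically curious, the toy sketch below shows the core idea behind GAN-generated images like those in the cited study: a generator network turns random noise vectors into images. The architecture and image size here are arbitrary illustrations, not the study’s actual model.

```python
# A toy illustration of the GAN idea behind synthetic training images:
# a generator network maps random noise vectors to images. This is an
# arbitrary small architecture, not the model from the cited study.
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, latent_dim: int = 100):
        super().__init__()
        self.net = nn.Sequential(
            # Project a noise vector up to an 8x8 feature map...
            nn.ConvTranspose2d(latent_dim, 128, kernel_size=8),
            nn.BatchNorm2d(128),
            nn.ReLU(),
            # ...then upsample to 16x16...
            nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            # ...and finally to a 32x32 RGB image with values in [-1, 1].
            nn.ConvTranspose2d(64, 3, kernel_size=4, stride=2, padding=1),
            nn.Tanh(),
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.net(z)

# Sample 4 noise vectors and generate 4 images. Untrained, these are
# static; after adversarial training against a discriminator that has
# seen real otoscopy photos, they would resemble plausible eardrums.
g = Generator()
z = torch.randn(4, 100, 1, 1)
fake_images = g(z)
print(fake_images.shape)  # torch.Size([4, 3, 32, 32])
```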
Dr. Scheffler, who runs an active research program in machine learning that focuses on image and sound classification, believes that once language-processing AI gets really good, health systems may start employing it for tasks such as conducting triage, checking symptoms, or telling patients if they need to make an appointment to see a specialist. He cited an AI tool called TriageGO (https://www.beckmancoulter.com/solutions/triagego) already in use at Johns Hopkins in Baltimore to assist with triage in the emergency room.
How Original Is Generative AI?
Although generative AI may be closer to human intelligence than earlier forms of AI, it still relies on learning from previously created content to generate new content, and it can easily propagate any bias found in that original content. This is another reason for caution, particularly for future applications that may involve diagnosis, triage, and treatment recommendations.
“The term ‘original’ is somewhat up for debate, because in order to create new content, generative AI models require training and learning from prior datasets,” said Dr. Iloreta.
This is not a small point. Rayid Ghani, PhD, a distinguished career professor in the machine learning department of the Heinz College of Information Systems and Public Policy at Carnegie Mellon University in Pittsburgh, called AI a potentially wonderful tool in healthcare, but added a caveat. He underscored that the information that generative AI uses to learn about patients with type 2 diabetes or chronic kidney/liver conditions, among others, will come from existing information and data sources such as clinical guidelines, research literature, and data generated by existing clinical care practices.
“Each of these data sources comes with its own issues,” he said. Guidelines, for example, are typically overly simplistic: they aren’t uniformly accurate for all patient populations, particularly those who aren’t White, and as a result diseases like type 2 diabetes and chronic liver/kidney conditions are often underdiagnosed in minority populations. Using such existing data will continue to generate information that isn’t representative of all populations, he cautioned, and will lead to persistent underdiagnosis of certain conditions.
For Dr. Ghani, there is value in AI tools only if they’re developed correctly and deliberately to achieve outcomes that we care about. “If you think the old system is inequitable and biased and you want to use AI to improve equity and outcomes, then you have to design AI to deliberately do that,” he said.
Maya G. Sardesai, MD, MEd, an associate professor of otolaryngology–head and neck surgery and assistant dean for student development at the UW School of Medicine in Seattle, sees AI as playing an important role in education and training (as will be discussed in article four in this series, on extended/augmented reality), but also cautioned that generative AI may not be accurate, comprehensive, or representative of all patients. “There is a risk of perpetuating or propagating bias or having a disproportionate amount of incorrect information that feeds the analysis, and then what AI generates may not offer a fair or proportionate representation of what we’re trying to look at,” she said.
For otolaryngologists, this risk may come in the form of facial images generated from what AI has learned from publicly available data sources. Such images may be biased toward certain types of faces and may not represent a particular patient, a concern especially relevant for plastic surgeons. “We need to be careful and strategic when using this generative AI,” said Dr. Sardesai.
Mary Beth Nierengarten is a freelance medical writer based in Minnesota.