ITHACA, N.Y. – A new Cornell University-led study suggests that if artificial intelligence tools can counsel a doctor like a colleague – pointing out relevant research that supports the decision – then doctors can better weigh the merits of the recommendation.
The researchers will present the new study in April at the Association for Computing Machinery CHI Conference on Human Factors in Computing Systems.
Until now, most AI researchers have tried to help doctors evaluate suggestions from decision support tools by explaining how the underlying algorithm works or what data was used to train the AI. But an education in how AI makes its predictions wasn't sufficient, said Qian Yang, lead author and assistant professor of information science at Cornell University. Many doctors wanted to know whether the tool had been validated in clinical trials, something that typically does not happen with these tools.
"A doctor's primary job is not to learn how AI works," Yang said. "If we can build systems that help validate AI suggestions based on clinical trial results and journal articles, which are trustworthy information for doctors, then we can help them understand whether the AI is likely to be right or wrong for each specific case."
To develop this system, the researchers first interviewed nine doctors across a range of specialties, and three clinical librarians. They discovered that when doctors disagree on the right course of action, they track down results from relevant biomedical research and case studies, taking into account the quality of each study and how closely it applies to the case at hand.
Yang and her colleagues built a prototype of their clinical decision support tool that mimics this process by presenting biomedical evidence alongside the AI's recommendation. They used GPT-3 to find and summarize relevant research.
The interface for the decision support tool lists patient information, medical history and lab test results on one side, with the AI's personalized diagnosis or treatment suggestion on the other, followed by relevant biomedical studies. The researchers also added a short summary of each study, so doctors can quickly absorb the most important information.
In interviews, doctors said they appreciated the clinical evidence, finding it intuitive and easy to understand, and preferred it to an explanation of the AI's inner workings.
"It's a highly generalizable method," Yang said. This type of approach could work for all medical specialties and other applications where scientific evidence is needed, such as Q&A platforms to answer patient questions or even automated fact checking of health-related news stories. "I would hope to see it embedded in different kinds of AI systems that are being developed, so we can make them useful for clinical practice."