AI May Aid in Choosing Right Doctor

University of Texas at Austin

Years ago, as she sat in waiting rooms, Maytal Saar-Tsechansky began to wonder how people chose a good doctor when they had no way of knowing a doctor's track record on accurate diagnoses. Talking to other patients, she found they sometimes based choices on a physician's personality or even the quality of their office furniture.

"I realized all these signals people are using are just not the right ones," says Saar-Tsechansky, professor of information, risk, and operations management at Texas McCombs. "We were operating in complete darkness, like there's no transparency on these things."

In new research, she uses artificial intelligence to judge the judges: to evaluate the rates at which experts make successful decisions. Her machine learning algorithm can appraise both doctors and other kinds of experts — such as engineers who diagnose mechanical problems — when their success rates are not publicly available or not scrutinized beyond small groups of peers.

Prior research has studied how accurate doctors' diagnoses are, but not in ways that can be scaled up or monitored on an ongoing basis, Saar-Tsechansky says.

More effective methods are vital today, she adds, as medical systems deploy AI to help with diagnoses. It will be difficult to determine whether AI is helping or hurting diagnostic accuracy if observers can't tell how accurate a doctor was without the AI assist.

Evaluating the Experts

With McCombs doctoral student Wanxue Dong and Tomer Geva of Tel Aviv University in Israel, Saar-Tsechansky created an algorithm they call MDE-HYB. It integrates two forms of information: overall data about the quality of an expert's past decisions and more detailed evaluations of specific cases.
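The article does not spell out how MDE-HYB combines those two signals, but the general idea can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the ExpertRecord structure, the shrinkage-style weighting, and the prior_weight parameter are invented, not taken from the paper.

```python
from dataclasses import dataclass

@dataclass
class ExpertRecord:
    # Aggregate signal: how often this expert's past decisions
    # matched the eventual ground truth, over all recorded cases.
    n_cases: int
    n_correct: int
    # Case-level signal: detailed reviews of a small sample of
    # individual decisions (True = the decision was judged correct).
    reviewed: list

def estimate_quality(rec: ExpertRecord, prior_weight: float = 10.0) -> float:
    """Blend the coarse aggregate rate with the detailed sample.

    A simple shrinkage estimate (an assumption, not the paper's method):
    the per-case reviews are treated as high-quality evidence, pulled
    toward the aggregate rate, which is noisier but covers more cases.
    """
    aggregate_rate = rec.n_correct / max(rec.n_cases, 1)
    k = len(rec.reviewed)
    if k == 0:
        return aggregate_rate
    sample_rate = sum(rec.reviewed) / k
    # Weight the detailed sample by its size, relative to a pseudo-count
    # expressing how much we trust the aggregate signal.
    return (k * sample_rate + prior_weight * aggregate_rate) / (k + prior_weight)

# Rank hypothetical experts by estimated decision quality, best first.
experts = {
    "expert_a": ExpertRecord(n_cases=500, n_correct=430, reviewed=[True, True, False]),
    "expert_b": ExpertRecord(n_cases=200, n_correct=150, reviewed=[True, False, False]),
}
ranking = sorted(experts, key=lambda name: estimate_quality(experts[name]), reverse=True)
print(ranking)
```

The design choice in this sketch is simply that a handful of carefully reviewed cases should shift the estimate away from the noisier aggregate rate only in proportion to how many reviews there are.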

They then compared MDE-HYB's results with those of other kinds of evaluators: three alternative algorithms and 40 human reviewers. To test the flexibility of MDE-HYB's ratings, they analyzed three very different kinds of data: sales tax audits, spam, and online movie reviews on IMDb.

In each case, evaluators judged prior decisions the experts had made about the data, such as whether they had accurately classified movie reviews as positive or negative. For all three sets, MDE-HYB equaled or bested all challengers (a sketch of how such error comparisons work follows the list).

  • Against other algorithms, its error rates were up to 95% lower.
  • Against humans, they were up to 72% lower.
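The article reports these reductions without saying how evaluator error is measured. One plausible reading, sketched below with invented numbers, is the average gap between an evaluator's quality estimates and the experts' true accuracy rates; mean absolute error stands in here for whatever metric the paper actually uses.

```python
# Hypothetical illustration of an "X% lower error rate" comparison.
# The true accuracies and both evaluators' estimates are invented,
# and mean absolute error is an assumed metric, not the paper's.

def mean_abs_error(estimates, truths):
    return sum(abs(e - t) for e, t in zip(estimates, truths)) / len(truths)

true_accuracy   = [0.90, 0.75, 0.60, 0.85, 0.70]  # held-out ground truth
mde_hyb_scores  = [0.88, 0.77, 0.62, 0.84, 0.69]  # close to the truth
baseline_scores = [0.70, 0.90, 0.80, 0.60, 0.85]  # far from the truth

err_hyb = mean_abs_error(mde_hyb_scores, true_accuracy)
err_base = mean_abs_error(baseline_scores, true_accuracy)
print(f"MDE-HYB error is {1 - err_hyb / err_base:.0%} lower than the baseline's")
```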

The researchers also tested MDE-HYB on Saar-Tsechansky's original concern: selecting a doctor based on the doctor's history of correct diagnoses. Doctors chosen with MDE-HYB had an average misdiagnosis rate 41% lower than that of doctors chosen by a competing algorithm.

In real-world use, such a difference could translate to better patient outcomes and lower costs, she says.

She cautions that MDE-HYB needs more work before it can be put to such practical uses. "The main purpose of this paper was to get this idea out there, to get people to think about it, and hopefully people will improve this method," she says.

But she hopes it can one day help managers and regulators monitor expert workers' accuracy and decide when to intervene if improvement is needed. It might also help consumers choose service providers such as doctors.

"In every profession where people make these types of decisions, it would be valuable to assess the quality of decision-making," Saar-Tsechansky says. "I don't think that any of us should be off the hook, especially if we make consequential decisions."

" A Machine Learning Framework for Assessing Experts' Decision Quality " is published in Management Science.
