Investigators at Mass General Brigham have developed an AI-based tool that sifts through electronic health records to help clinicians identify cases of long COVID, an often mysterious condition that can encompass a litany of enduring symptoms, including fatigue, chronic cough, and brain fog, after infection with SARS-CoV-2. The results, published in the journal Med, suggest the tool could identify more people who should be receiving care for this potentially debilitating condition. The number of cases the investigators identified also suggests that the prevalence of long COVID could be greatly underrecognized.
"Our AI tool could turn a foggy diagnostic process into something sharp and focused, giving clinicians the power to make sense of a challenging condition," said senior author Hossein Estiri, PhD, head of AI Research at the Center for AI and Biomedical Informatics of the Learning Healthcare System (CAIBILS) at Mass General Brigham and an associate professor of Medicine at Harvard Medical School. "With this work, we may finally be able to see long COVID for what it truly is—and more importantly, how to treat it."
Long COVID, also known as Post-Acute Sequelae of SARS-CoV-2 infection (PASC), includes a wide range of symptoms. For the purposes of their study, Estiri and colleagues defined it as a diagnosis of exclusion that is also infection-associated. That means the diagnosis could not be explained by anything else in the patient's individual medical record and also had to be associated with a COVID infection. In addition, the diagnosis needed to have persisted for 2 months or longer in a 12-month follow-up window.
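The persistence criterion in this definition can be illustrated with a short sketch. This is not the study's code: the function name `persists`, the reading of "2 months" as 60 days and "12 months" as 365 days, and the choice to measure persistence as the span between the first and last recorded occurrence are all assumptions made for illustration.

```python
from datetime import date, timedelta

# Assumed readings of the article's thresholds; the study may define them differently.
FOLLOW_UP = timedelta(days=365)       # "12-month follow-up window"
MIN_PERSISTENCE = timedelta(days=60)  # "2 months or longer"


def persists(infection_date, symptom_dates):
    """Return True if a symptom is recorded over a span of at least
    MIN_PERSISTENCE inside the follow-up window after infection.

    Hypothetical helper: persistence is approximated as the time between
    the first and last record of the symptom within the window.
    """
    window_end = infection_date + FOLLOW_UP
    in_window = sorted(
        d for d in symptom_dates if infection_date <= d <= window_end
    )
    if len(in_window) < 2:
        # A single record cannot show that the symptom lasted over time.
        return False
    return in_window[-1] - in_window[0] >= MIN_PERSISTENCE


infection = date(2021, 1, 1)
long_lasting = persists(infection, [date(2021, 2, 1), date(2021, 4, 15)])
short_lived = persists(infection, [date(2021, 2, 1), date(2021, 2, 20)])
```

Here `long_lasting` is true because the two records are more than 60 days apart, while `short_lived` is false. Records dated after the window closes are ignored.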
The algorithm used in the AI tool was developed by drawing de-identified patient data from the clinical records of nearly 300,000 patients across 14 hospitals and 20 community health centers in the Mass General Brigham system. Rather than relying on a single diagnosis code, the AI uses a novel method developed by Estiri and colleagues called "precision phenotyping" that sifts through individual records to identify symptoms and conditions linked to COVID-19 and to track symptoms over time in order to differentiate them from other illnesses. For example, the algorithm can detect whether shortness of breath may be the result of pre-existing conditions like heart failure or asthma rather than long COVID. Only when every other possibility was exhausted would the tool flag the patient as having long COVID.
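The exclusion step described above can be pictured with a deliberately simplified sketch. It is not the authors' precision phenotyping method, which works on full longitudinal records. The `EXPLAINED_BY` table and the `unexplained_symptoms` function are hypothetical, and the only mapping included is the article's own example of shortness of breath being attributable to heart failure or asthma.

```python
# Hypothetical lookup: which pre-existing conditions could account for a symptom.
# Only the article's own example is included here.
EXPLAINED_BY = {
    "shortness of breath": {"heart failure", "asthma"},
}


def unexplained_symptoms(post_covid_symptoms, prior_conditions):
    """Return the symptoms that no pre-existing condition accounts for.

    A symptom is dropped when any condition already in the patient's
    history is listed as a possible explanation for it.
    """
    prior = set(prior_conditions)
    return [
        symptom
        for symptom in post_covid_symptoms
        if not (EXPLAINED_BY.get(symptom, set()) & prior)
    ]


symptoms = ["shortness of breath", "fatigue"]
with_asthma = unexplained_symptoms(symptoms, ["asthma"])
no_history = unexplained_symptoms(symptoms, [])
```

For a patient with asthma, only fatigue remains as a candidate long COVID symptom. For a patient with no relevant history, both symptoms remain. A real system would also need the timing of each record, which this sketch leaves out.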
"Physicians are often faced with having to wade through a tangled web of symptoms and medical histories, unsure of which threads to pull, while balancing busy caseloads. Having a tool powered by AI that can methodically do it for them could be a game-changer," said Alaleh Azhir, MD , the co-lead author who is an internal medicine resident at Brigham Women's Hospital, a founding member of the Mass General Brigham healthcare system.
The patient-centered diagnoses provided by this new method may also help alleviate biases built into current diagnostics for long COVID, according to the researchers, who note that patients diagnosed with the official ICD-10 diagnostic code for long COVID trend towards those with easier access to healthcare. While other diagnostic studies have suggested that approximately 7% of the population suffers from long COVID, this new approach reveals a much higher estimate—22.8%. The authors stated that this figure aligns more closely with national trends and paints a more realistic picture of the pandemic's long-term toll.
The researchers determined their tool was about 3 percent more accurate than what ICD-10 codes capture, while being less biased. Specifically, their study demonstrated that the individuals they identified as having long COVID mirror the broader demographic makeup of Massachusetts. By contrast, long COVID algorithms that rely on a single diagnostic code or individual clinical encounters skew results toward certain populations, such as those with more access to care. "This broader scope ensures that marginalized communities, often sidelined in clinical studies, are no longer invisible," said Estiri.
Limitations of the study and AI tool include that the health record data used in the algorithm to account for long COVID symptoms may be less complete than what physicians capture in post-visit clinical notes. Another limitation is that the algorithm did not capture possible worsening of a prior condition, which may itself have been a long COVID symptom. For example, if a patient had COPD with episodes of worsening before they developed COVID-19, the algorithm might have removed them even if their persisting symptoms were a long COVID indicator. Declines in COVID-19 testing in recent years also make it difficult to identify when a patient may have first gotten COVID-19. The study was also limited to patients in Massachusetts.
Future studies may explore the algorithm in cohorts of patients with specific conditions, like COPD or diabetes. The researchers also plan to release the algorithm publicly under open access so that physicians and healthcare systems globally can use it in their patient populations.
In addition to opening the door to better clinical care, this work may lay the foundation for future research into the genetic and biochemical factors behind long COVID's various subtypes. "Questions about the true burden of long COVID—questions that have thus far remained elusive—now seem more within reach," said Estiri.
Authorship: In addition to Estiri, Mass General Brigham authors include Alaleh Azhir, Jonas Hügel, Jiazi Tian, Jingya Cheng, Ingrid V. Bassett, Emily S. Lau, Yevgeniy R. Semenov, Virginia A. Triant, Zachary H. Strasser, Jeffrey G. Klann, and Shawn N. Murphy. Additional authors include Douglas S. Bell, Elmer V. Bernstam, Maha R. Farhat, Darren W. Henderson, Michele Morris, and Shyam Visweswaran.
Disclosures: None.
Funding: Support from the National Institutes of Health, National Institute of Allergy and Infectious Diseases (NIAID) R01AI165535, National Heart, Lung, and Blood Institute (NHLBI) OT2HL161847, and National Center for Advancing Translational Sciences (NCATS) UL1 TR003167, UL1 TR001881, and U24TR004111. J. Hügel's work was partially funded by a fellowship within the IFI programme of the German Academic Exchange Service (DAAD) and by the Federal Ministry of Education and Research (BMBF), as well as by the German Research Foundation (426671079).
Paper cited: Azhir A et al. "Precision Phenotyping for Curating Research Cohorts of Patients with Unexplained Post-Acute Sequelae of COVID-19" Med DOI: 10.1016/j.medj.2024.10.009