Tech & Health

September 5, 2025

AI tools could shorten ‘diagnostic odyssey’ for patients with rare diseases

Large language models achieved diagnostic rates of 13.3% and 10.0%, compared to the historical clinical review rate of 5.6%, and they suggested next steps to evaluate the suggested diagnoses.

Artificial intelligence tools successfully identified diagnoses for patients in the Undiagnosed Diseases Network (UDN), according to a study published Aug. 22 in the journal JAMA Network Open.

“Large language models have shown strong diagnostic performance on expert-curated case challenges, but their ability to assist in rare disease diagnosis is underexplored,” said Cathy Shyr, PhD, assistant professor of Biomedical Informatics, Pediatrics and Biostatistics at Vanderbilt University Medical Center and the first author of the study. 

Large language models (LLMs) are generative AI tools that are trained on large amounts of data to understand and generate natural language. In the current study, the researchers assessed whether LLMs can identify the final diagnosis for UDN patients based on available clinical information and compared the LLM results to historical clinical review. 

VUMC is an original member of the National Institutes of Health-supported UDN, which was established in 2014 to improve diagnosis and care of patients with undiagnosed diseases. In 2023, a transformative gift from the Potocsnak family established the Potocsnak Center for Undiagnosed and Rare Disorders at VUMC. 

“Patients referred to the UDN and the Potocsnak Center are among the most challenging to diagnose,” said the study’s corresponding author, Rizwan Hamid, MD, PhD, the Dorothy Overall Wells Professor of Pediatrics and director of the Potocsnak Center. “In some cases, the ‘diagnostic odyssey’ for patients with rare disorders — the time from their symptom onset until they get a diagnosis — lasts for more than 10 years.” 

The current study included 90 VUMC UDN cases that were diagnosed between November 2016 and April 2024. The median age at symptom onset was 7.6 months, and the median length of diagnostic odyssey was 7.6 years. 

For each patient, the researchers prompted two LLMs (ChatGPT version 4o and Llama 3.1 8B) to generate a differential diagnosis — a list of possible conditions consistent with the patient’s clinical summary. The clinical summary is a standardized UDN intake document for diagnostic evaluation; it summarizes the patient’s presentation, including clinical history, family history and prior evaluations. 

Thomas Cassini, MD, assistant professor of Pediatrics and associate director of the Potocsnak Center, and Rory Tinker, MD, pediatrics resident, scored the LLM-generated differentials for inclusion of the exact, final diagnosis and closely related diagnoses. Kevin Byram, MD, associate professor of Medicine, resolved any scoring disagreements. 

The LLMs achieved diagnostic rates of 13.3% (ChatGPT) and 10.0% (Llama), compared to the historical clinical review rate of 5.6%. They provided helpful diagnoses for 23.3% (ChatGPT) and 16.7% (Llama) of cases. The LLMs also suggested next steps to evaluate the suggested diagnoses. Cost and processing time per case were $0.03 and five seconds for ChatGPT and $0 and 120 seconds for Llama. 
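To put those percentages in absolute terms, the short Python sketch below back-calculates case counts from the reported rates. It assumes every percentage is taken over all 90 cases in the study. The counts and batch totals are inferred for illustration and are not figures reported by the researchers, and the totals assume cases are processed one at a time.

```python
# Figures taken from the article. Counts and totals are back-calculated
# here (an assumption), not reported by the study authors.
N_CASES = 90

reported_rates = {
    "ChatGPT, final diagnosis": 13.3,
    "Llama, final diagnosis": 10.0,
    "Historical clinical review": 5.6,
    "ChatGPT, helpful diagnosis": 23.3,
    "Llama, helpful diagnosis": 16.7,
}

# Reported per-case cost (USD) and processing time (seconds).
per_case = {
    "ChatGPT": {"cost_usd": 0.03, "seconds": 5},
    "Llama": {"cost_usd": 0.0, "seconds": 120},
}


def implied_count(rate_percent: float, n: int = N_CASES) -> int:
    """Nearest whole number of cases consistent with a reported percentage."""
    return round(rate_percent / 100 * n)


implied_counts = {label: implied_count(rate) for label, rate in reported_rates.items()}

# Totals for all cases, assuming sequential processing (hypothetical batch run).
batch_totals = {
    model: {
        "cost_usd": round(v["cost_usd"] * N_CASES, 2),
        "minutes": v["seconds"] * N_CASES / 60,
    }
    for model, v in per_case.items()
}

if __name__ == "__main__":
    for label, count in implied_counts.items():
        print(f"{label}: about {count} of {N_CASES} cases")
    for model, totals in batch_totals.items():
        print(f"{model}: ${totals['cost_usd']:.2f}, {totals['minutes']:.1f} minutes for all cases")
```

Under these assumptions, the gap between the LLMs and historical clinical review amounts to a handful of patients in a 90-case cohort, which is one reason the researchers call for prospective studies.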

The findings suggest that LLMs can assist clinicians by generating an initial differential and facilitating the downstream workup. 

“Our study is the first to assess LLM diagnostic performance in the UDN,” Shyr said. “It adds to a growing body of insights about the clinical applicability of LLMs.” 

The researchers note that prospective studies are needed to further assess the clinical impact of LLMs. 

“These AI tools have the potential to shorten the diagnostic odyssey for patients with undiagnosed and rare disorders,” Hamid said. 

Other authors of the study are Peter Embí, MD, Lisa Bastarache, MS, and Josh Peterson, MD, from VUMC, and Hua Xu, PhD, from Yale School of Medicine. The research was supported by the National Institutes of Health (clinical research protocol 15-HG-0130; grants U01NS134349 and K99LM014429) and the Potocsnak Center for Undiagnosed and Rare Disorders.