The system, which requires no human intervention or prompting after deployment, achieved 98% specificity in real-world validation testing. Results are published in npj Digital Medicine.
Alongside the publication, the team is releasing Pythia, an open-source tool that enables any health care system or research institution to deploy autonomous prompt optimization for their own AI screening applications.
“We didn’t build a single AI model—we built a digital clinical team,” said corresponding author Hossein Estiri, Ph.D., director of the Clinical Augmented Intelligence (CLAI) research group and associate professor of medicine at Massachusetts General Hospital. “This AI system includes five specialized agents that critique each other and refine their reasoning, just like clinicians would in a case conference.”
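For readers unfamiliar with the metric, specificity is the share of truly negative cases that a screening tool correctly labels as negative: true negatives divided by the sum of true negatives and false positives. The short Python sketch below shows the arithmetic. The function name and the counts are hypothetical and chosen only for illustration; they are not figures from the study and are not part of Pythia.

```python
def specificity(true_negatives: int, false_positives: int) -> float:
    """Return TN / (TN + FP): the fraction of actual negatives correctly ruled out."""
    total_negatives = true_negatives + false_positives
    if total_negatives == 0:
        # With no negative cases in the sample, the ratio has no meaning.
        raise ValueError("specificity is undefined when there are no negative cases")
    return true_negatives / total_negatives


# Hypothetical counts for illustration only; NOT data from the study.
example = specificity(true_negatives=980, false_positives=20)
print(f"{example:.0%}")
```

With these made-up counts, 20 of every 1,000 people without the condition would be flagged incorrectly. Specificity says nothing about how many true cases are caught; that is measured separately by sensitivity.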