National Standard Unveiled for Scalable, Safe Healthcare AI

EMR Industry

Researchers at Duke University School of Medicine have developed two innovative frameworks to assess the performance, safety, and reliability of large language models in healthcare.

Two new studies, published in npj Digital Medicine and the Journal of the American Medical Informatics Association (JAMIA), present an approach to ensuring that AI systems used in clinical environments meet high standards of quality, safety, and accountability.

As large language models become more integrated into healthcare—supporting tasks such as clinical note generation, conversation summarization, and patient communication—health systems face increasing challenges in evaluating these technologies in a rigorous yet scalable way. The Duke University-led research, headed by Chuan Hong, Ph.D., assistant professor in Biostatistics and Bioinformatics, aims to address this critical need.

The study published in npj Digital Medicine introduces SCRIBE, a structured evaluation framework for Ambient Digital Scribing tools. These AI-driven systems are designed to generate clinical documentation by capturing real-time conversations between patients and providers. SCRIBE combines expert clinical review, automated performance scoring, and simulated edge-case testing to assess tools across key metrics such as accuracy, fairness, coherence, and resilience.

“Ambient AI has significant potential to ease documentation burdens for clinicians,” Hong noted. “But careful evaluation is crucial. Without it, there’s a risk of deploying systems that introduce bias, omit vital details, or compromise care quality. SCRIBE is built to safeguard against those risks.”

A second, related study published in JAMIA introduces a complementary framework for evaluating large language models integrated into the Epic electronic medical record system, specifically those used to generate draft responses to patient messages. The study assesses these AI-generated replies by comparing clinician feedback with automated evaluation metrics, focusing on attributes such as clarity, completeness, and safety.

While the models demonstrated strong performance in tone and readability, the study identified notable gaps in response completeness—highlighting the critical need for ongoing evaluation in real-world settings.
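For readers curious what comparing clinician feedback with automated metrics can look like in practice, here is a minimal sketch. It is not the Duke team's method: the choice of Spearman rank correlation, the function names, and the sample scores are all assumptions made for illustration.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the two rank lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical example data, invented for this sketch (not from the studies):
# clinician completeness ratings on a 1-5 scale, and an automated metric
# scored between 0 and 1 for the same five draft replies.
clinician_ratings = [4, 2, 5, 3, 1]
automated_scores = [0.81, 0.64, 0.90, 0.77, 0.70]

agreement = spearman(clinician_ratings, automated_scores)
print(f"Rank agreement between clinicians and metric: {agreement:.2f}")
```

A value near 1 would suggest the automated metric orders replies much as clinicians do, while a low value would suggest the metric misses what reviewers care about, which is the kind of gap the article describes for completeness.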

“This research helps bridge the gap between cutting-edge algorithms and meaningful clinical application,” said Michael Pencina, Ph.D., Chief Data Scientist at Duke Health and co-author of both studies. “It underscores that responsible AI implementation requires rigorous, ongoing evaluation as part of the technology’s entire life cycle—not just as a final step.”

Together, these two frameworks provide a robust foundation for the responsible integration of AI in healthcare. They equip clinical leaders, developers, and regulators with the tools necessary to evaluate AI models prior to deployment and to continuously monitor their performance—ensuring that these technologies enhance care delivery without compromising patient safety or trust.