
Large models identify social determinants in records

Social determinants of health (SDoH) strongly influence patient outcomes, yet they are often incompletely documented or missing from the structured data of electronic health records (EHRs). Large language models (LLMs) show promise for efficiently extracting SDoH from EHRs, which would support both research and clinical care. However, because this vital information is sparsely documented, extraction has to contend with class imbalance and limited data.

We examined how best to use LLMs to extract six SDoH categories from narrative EHR text. The best performers were the fine-tuned Flan-T5 XL, with a macro-F1 of 0.71 for any SDoH mention, and Flan-T5 XXL, with a macro-F1 of 0.70 for adverse SDoH mentions. Adding LLM-generated synthetic data during training had varying effects across models and architectures, but it notably improved the performance of the smaller Flan-T5 models (ΔF1 +0.12 to +0.23).
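Macro-F1 is a natural metric for this task because it averages the per-category F1 scores with equal weight, so a rarely documented category counts as much as a common one. The following is a minimal sketch of that computation for multi-label sentence annotations, not the study's evaluation code. The category names and toy annotations are hypothetical placeholders.

```python
from typing import List, Sequence, Set


def per_label_f1(gold: List[Set[str]], pred: List[Set[str]], label: str) -> float:
    """F1 for one category, treating each sentence as a binary decision."""
    tp = sum(1 for g, p in zip(gold, pred) if label in g and label in p)
    fp = sum(1 for g, p in zip(gold, pred) if label not in g and label in p)
    fn = sum(1 for g, p in zip(gold, pred) if label in g and label not in p)
    denom = 2 * tp + fp + fn
    # Convention assumed here: a category with no gold and no predicted
    # mentions scores 0.0 rather than being skipped.
    return 0.0 if denom == 0 else 2 * tp / denom


def macro_f1(gold: List[Set[str]], pred: List[Set[str]], labels: Sequence[str]) -> float:
    """Unweighted mean of per-category F1, so rare categories count equally."""
    return sum(per_label_f1(gold, pred, label) for label in labels) / len(labels)


# Hypothetical category names and toy annotations, for illustration only.
LABELS = ["housing", "employment", "transportation"]
gold = [{"housing"}, {"employment"}, set(), {"housing", "transportation"}]
pred = [{"housing"}, set(), {"employment"}, {"housing"}]

score = macro_f1(gold, pred, LABELS)
print(f"macro-F1 = {score:.3f}")
```

In this toy example the common category is predicted perfectly while the two rarer ones are missed entirely, and the macro average reflects that: a single well-handled frequent class cannot hide poor performance on sparse ones.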

Our best fine-tuned models outperformed the zero- and few-shot performance of ChatGPT-family models in their respective settings, except for GPT-4 with 10-shot prompting for adverse SDoH. The fine-tuned models were also less likely to change their predictions when race/ethnicity and gender descriptors were added to the text, indicating less algorithmic bias (p < 0.05). Notably, our models identified 93.8% of patients with adverse SDoH, whereas ICD-10 codes captured only 2.0%. These results highlight the potential of LLMs to strengthen real-world evidence on SDoH and to identify patients who could benefit from additional resource support.
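The bias check described above can be thought of as a perturbation test: classify each sentence, add a demographic descriptor, classify again, and count how often the prediction changes. The sketch below illustrates that idea under stated assumptions. The insertion template, the descriptors, and both toy classifiers are hypothetical stand-ins rather than the models or procedure used in the study, and the sketch omits the significance testing behind the reported p-value.

```python
from typing import Callable, FrozenSet, List

Classifier = Callable[[str], FrozenSet[str]]


def inject_descriptor(sentence: str, descriptor: str) -> str:
    """Prepend a demographic descriptor (hypothetical template)."""
    return f"{descriptor} patient. {sentence}"


def flip_rate(classify: Classifier, sentences: List[str], descriptors: List[str]) -> float:
    """Fraction of (sentence, descriptor) pairs whose prediction changes."""
    flips = 0
    total = 0
    for sentence in sentences:
        baseline = classify(sentence)
        for descriptor in descriptors:
            total += 1
            if classify(inject_descriptor(sentence, descriptor)) != baseline:
                flips += 1
    return flips / total if total else 0.0


def keyword_classifier(text: str) -> FrozenSet[str]:
    """Toy stand-in for a model: labels depend only on SDoH keywords."""
    lowered = text.lower()
    labels = set()
    if "homeless" in lowered:
        labels.add("housing")
    if "unemployed" in lowered:
        labels.add("employment")
    return frozenset(labels)


def biased_classifier(text: str) -> FrozenSet[str]:
    """Toy stand-in that wrongly keys on a demographic descriptor."""
    labels = set(keyword_classifier(text))
    if "hispanic" in text.lower():
        labels.add("housing")
    return frozenset(labels)


sentences = ["Patient is currently homeless.", "Works full time as a teacher."]
descriptors = ["Hispanic female", "White male"]

stable_rate = flip_rate(keyword_classifier, sentences, descriptors)
biased_rate = flip_rate(biased_classifier, sentences, descriptors)
print(f"stable: {stable_rate:.2f}, biased: {biased_rate:.2f}")
```

A lower flip rate means predictions depend less on the injected descriptors. Comparing rates between two models, as the article does for fine-tuned and ChatGPT-family models, would additionally require a statistical test on the paired outcomes.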