With the rapid adoption of telemedicine and patient messaging systems, healthcare providers are facing unprecedented workloads. The study OPTIC: Optimizing Patient-Provider Triaging & Improving Communications in Clinical Operations proposes an AI-driven approach to reduce the burden of handling patient messages by employing GPT-4 for data labeling and BERT for classification model distillation.
Key Takeaways
Problem Addressed: The sharp rise in patient messages to healthcare providers increases administrative workload, causing delays in clinical decision-making and contributing to physician burnout.
Solution Proposed: The OPTIC system leverages GPT-4 for high-quality message labeling and BERT for efficient classification of messages into administrative or clinical categories.
Dataset Used: The study utilized 405,487 patient messaging encounters from Johns Hopkins Medicine spanning January to June 2020.
Model Performance: The BERT-based classification model achieved 88.85% accuracy, 88.29% sensitivity, and 89.38% specificity in distinguishing between administrative and clinical messages.
Deployment: The model was successfully integrated into Epic’s Nebula Cloud Platform, demonstrating its feasibility for real-world healthcare applications.
Overview
The Growing Burden of Patient Messaging
The COVID-19 pandemic accelerated the adoption of digital communication in healthcare. Patient portals like Epic MyChart became essential for scheduling, consultations, and prescription refills. However, a 157% increase in patient messages (Holmgren et al., 2022) has overwhelmed providers, resulting in delays in responding to clinical messages and worsening physician burnout.
Most existing electronic messaging systems lack triage mechanisms, treating all messages equally regardless of urgency. This inefficiency results in urgent clinical issues being buried under routine administrative messages. Studies indicate that over 50% of patient messages are administrative and could be redirected to non-clinical staff, freeing up physicians’ time (Schuetz, 2021).
The lack of automated tools to manage the influx of messages has created inefficiencies in patient care management. Many messages are administrative and could be handled by other healthcare team members, but the absence of a structured system for sorting these messages leads to delays in addressing urgent issues (Schuetz, 2021)[https://scholarworks.uvm.edu/fmclerk/693].
Why NLP-Based Solutions Are Needed
Manually sorting messages is impractical due to sheer volume, requiring automated classification. Traditional rule-based approaches fail to adapt to medical language complexities, necessitating machine learning-based solutions.
Challenges in automated classification include:
- Variability in Patient Messages: Patients have different communication styles and medical knowledge.
- Complexity of Clinical Language: Messages often contain medical jargon, acronyms, and mixed-content (e.g., a request for test results along with an appointment reschedule).
- Need for High Interpretability: Providers require transparent AI models that classify messages accurately and explain their reasoning.
To address these issues, OPTIC employs GPT-4 for advanced data labeling and BERT for real-time classification, offering a scalable and compute efficient approach.
Why It’s Important
The development of OPTIC addresses a critical need in healthcare settings by providing a scalable and efficient tool for triaging PMARs (Patient Medical Advice Requests). This tool can significantly reduce the administrative burden on healthcare providers, improving patient care coordination and streamlining workflows. The integration of OPTIC into Epic's Nebula Cloud Platform demonstrates its practical applicability and potential for widespread adoption in healthcare systems.
Reducing Physician Burnout and Improving Workflow: Administrative overload is a major factor in physician burnout (West et al., 2018). By automating message classification, OPTIC allows providers to focus on critical patient care rather than administrative tasks.
Enhancing Patient Response Time: Faster classification enables quicker response times for urgent medical inquiries, improving patient outcomes.
Scalability Across Healthcare Systems: Since OPTIC is deployed through Epic’s Nebula Cloud Platform, it can be easily scaled across multiple hospitals and healthcare systems, making it a practical solution beyond an academic setting.
Summary of Results
OPTIC's BERT model achieved an impressive 88.85% accuracy in classifying messages as "Admin" or "Clinical," with a sensitivity of 88.29% and specificity of 89.38%. The model's performance was further validated through BERTopic analysis, which identified 81 distinct topics with over 80% accuracy in 58 topics. This innovative tool has been successfully deployed through Epic's Nebula Cloud Platform, demonstrating its practical applicability in real-world healthcare settings.
Data Collection and Processing
- Source: Johns Hopkins Medicine’s Epic MyChart portal
- Period: January – June 2020
- Total Messages: 405,487
- Labeling Method: GPT-4 few-shot learning approach to create a high-quality labeled dataset (~35K messages)
Model Development and Testing
1. GPT-4 Prompt Engineering for Data Labeling
- GPT-4 was used in few-shot and zero-shot settings to label messages as Administrative or Clinical.
- Four different prompt engineering strategies were tested.
- The best performing approach (Few-Shot, 200 sample messages) achieved 99% accuracy in a validation set of 2,000 messages
2. Model Distillation Using BERT
- BERT was trained on the GPT-labeled dataset, reducing cost and computational resources compared to running GPT-4 for real-time classification.
- Training Set: 33,861 samples
- Validation Set: 3,387 samples
- Test Set: 3,454 samples
- Final Performance Metrics:
- Accuracy: 88.85%
- Sensitivity: 88.29%
- Specificity: 89.38%
- F1 Score: 0.8842
3. Topic Analysis Using BERTopic
- 81 distinct topics were identified in patient messages.
- The BERT model correctly classified messages with >80% accuracy across 58 topics, confirming its generalizability across different medical scenarios.
4. Deployment in Epic Nebula Cloud
- The model was packaged as part of Epic’s SaaS infrastructure.
- Integrated into Epic Hyperspace (used for MyChart messaging).
- Allows real-time triaging of patient messages across large healthcare networks.
Conclusion
OPTIC provides a scalable, AI-powered triage system to help healthcare providers manage patient messages (PMARs) efficiently. By combining GPT-4 for high-quality labeling with BERT for real-time classification, the model offers high accuracy, reduces physician workload, and improves patient communication.
Future Work
- Improving Interpretability: Enhancing explainability of BERT’s decisions.
- Handling Mixed-Content Messages: Developing models that can triage messages containing both administrative and clinical content.
- Expansion to Other Specialties: Adapting the system for cardiology, dermatology, and emergency medicine.
By leveraging state-of-the-art NLP techniques, this study demonstrates the transformative potential of AI in healthcare operations, ensuring smarter, faster, and more efficient patient-provider communication.
References
(Selected from paper citations)
- Holmgren A et al., 2022. Assessing the impact of the COVID-19 pandemic on clinician ambulatory electronic health record use. J Am Med Inform Assoc.
- West C et al., 2018. Physician burnout: contributors, consequences and solutions. J Intern Med.
- Schuetz S., 2021. Impact of MyChart Communication on Provider Burden.
- Devlin J et al., 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv:1810.04805.
OPTIC: A Novel Machine Learning Approach to Reducing Physician Burden in Digital Communication