3725 - Leveraging Temporal Model for Early Detection of ED Visits in Locally Advanced Head and Neck Cancer Patients Using Unstructured EHR Data
Presenter(s)
B. Srinivas1, A. Gulati2, S. Anil1, B. Neo3, Y. Interian3, A. Park2, J. W. Chan4, and H. Lin4; 1University of California San Francisco, Department of Radiation Oncology, San Francisco, CA, 2University of California San Francisco, Department of Otolaryngology Head and Neck Surgery, San Francisco, CA, 3University of San Francisco, Department of Data Science, San Francisco, CA, 4Department of Radiation Oncology, University of California San Francisco, San Francisco, CA
Purpose/Objective(s): Microvascular free flap surgeries (MVFFS) restore function and aesthetics after tumor resection in head and neck (H&N) cancer care. Unplanned emergency department (ED) visits and hospital admissions may occur due to surgical complexity, postoperative challenges, radiation side effects, and comorbidities. Timely prediction is crucial for improving outcomes, reducing costs, and optimizing resources. This study develops a Long Short-Term Memory (LSTM) based model using unstructured EHR data to predict ED visits, enabling proactive clinician intervention and tailored patient care.
Materials/Methods: This retrospective cohort study included 203,899 physician notes from patients treated at a single institution between 2011 and 2023 who underwent MVFFS for advanced H&N cancers. Patient records were curated based on the primary treatment completion date, extracted using HIPAA-compliant GPT-4o applied to oncology notes. These dates anchored three sequential time windows: Time Step 0 (all notes up to treatment completion), Time Step 1 (notes recorded during the 3 months post-treatment), and Time Step 2 (notes recorded during the subsequent 3 months). A binary ED visit label was assigned based on the presence of Emergency Medicine notes. The training dataset comprised 316 patients and the testing dataset 79, with each patient’s case containing multiple physician notes across the defined time windows. The predictive model is an LSTM network that captures temporal dependencies in sequential EHR data. The architecture comprises two LSTM layers with 50 and 25 hidden units. Batch normalization follows the first LSTM layer, and dropout (rate 0.1) reduces overfitting. The final output layer is a fully connected linear layer for binary classification. Model performance was assessed using accuracy, precision, recall, F1-score, and AUROC.
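The three time windows described above can be illustrated with a minimal sketch. This is not the study's code: the function names (`add_months`, `assign_time_step`, `group_notes`), the representation of a note as a `(date, text)` tuple, the treatment of "3 months" as calendar months, and the inclusive window boundaries are all assumptions made for illustration.

```python
import calendar
from datetime import date
from typing import Dict, List, Optional, Tuple


def add_months(d: date, months: int) -> date:
    """Shift a date forward by calendar months, clamping to the month's last day."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def assign_time_step(note_date: date, completion_date: date) -> Optional[int]:
    """Map a note date to Time Step 0, 1, or 2, or None if it falls after the windows.

    Assumption: each window includes its upper boundary date.
    """
    if note_date <= completion_date:
        return 0  # all notes up to treatment completion
    if note_date <= add_months(completion_date, 3):
        return 1  # first 3 months post-treatment
    if note_date <= add_months(completion_date, 6):
        return 2  # subsequent 3 months
    return None


def group_notes(
    notes: List[Tuple[date, str]], completion_date: date
) -> Dict[int, List[str]]:
    """Group one patient's notes by time step, in chronological order."""
    windows: Dict[int, List[str]] = {0: [], 1: [], 2: []}
    for note_date, text in sorted(notes, key=lambda n: n[0]):
        step = assign_time_step(note_date, completion_date)
        if step is not None:
            windows[step].append(text)
    return windows


# Hypothetical example: the dates and note texts are invented placeholders.
example_notes = [
    (date(2021, 1, 15), "follow-up visit"),
    (date(2020, 10, 2), "pre-operative consult"),
    (date(2021, 4, 20), "surveillance visit"),
]
example_windows = group_notes(example_notes, completion_date=date(2020, 11, 30))
```

Under these assumptions, each patient yields a short sequence of three note groups, which is the kind of ordered input a sequence model such as an LSTM consumes.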
Results: The median follow-up time was 4.5 years [IQR: 1.9–6.6], with 24.5% of patients experiencing an ED visit. The LSTM model’s performance improved with more temporal data, increasing accuracy from 77.6% at Time Step 0 to 84.5% at Time Step 2, indicating its ability to capture evolving patterns critical to predicting ED visits. SHAP interpretation for Time Step 2 identified key factors: metastatic disease progression, falls, injuries, and medication complications were linked to higher ED visit risk, while stable chronic conditions and consistent home care reduced risk.
Conclusion: The LSTM-based model effectively predicts ED visits on a rolling basis in H&N cancer patients using unstructured EHR data, identifying risk and protective factors to support proactive management, reduce ED visits, and improve care.
Abstract 3725 - Table 1
Time Step | Notes Description | Accuracy | Precision | Recall | F1 | AUC |
ts0, ts1, ts2 | All time steps | 0.81 | 0.84 | 0.75 | 0.83 | 0.85 |
ts0 | Up to primary course completion | 0.76 | 0.69 | 0.75 | 0.72 | 0.71 |
ts1 | 3 months post ts0 | 0.81 | 0.77 | 0.77 | 0.72 | 0.81 |
ts2 | 3 months post ts1 | 0.84 | 0.80 | 0.80 | 0.80 | 0.82 |
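
For reference, the threshold-based metrics in the table (accuracy, precision, recall, F1) are conventionally derived from a confusion matrix, as in the sketch below. The abstract does not state whether its values are computed for the positive (ED visit) class or averaged across classes, so this shows the positive-class definitions only as an assumption. The function name `classification_metrics` and the counts in the example are hypothetical and are not taken from the study.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Compute positive-class metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for a 79-patient test set; NOT the study's confusion matrix.
example_metrics = classification_metrics(tp=16, fp=4, fn=4, tn=55)
```

AUROC is not included here because it is computed from the ranking of predicted probabilities rather than from a single thresholded confusion matrix.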