Main Session
Sep 30
QP 10 - DHI 2: Quick Pitch: The Digital Revolution in Radiation Oncology: AI Models for Enhanced Patient Care

1059 - Multitask Double Machine Learning for Counterfactual Estimation of Survival and Toxicity in Locally Advanced Non-Small Cell Lung Cancer Chemoradiotherapy

01:15pm - 01:20pm PT
Room 20/21

Presenter(s)

Sang Ho Lee, PhD - University of Pennsylvania, Philadelphia, PA

S. H. Lee1, N. Yegya-Raman1, R. Caruana2, J. D. Bradley3, G. D. Kao1, S. J. Feigenberg1, and Y. Xiao1; 1Department of Radiation Oncology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 2Microsoft Research, Redmond, WA, 3University of Pennsylvania/Abramson Cancer Center, Philadelphia, PA

Purpose/Objective(s): This study develops multitask (MT) double machine learning (DML) to estimate counterfactual treatment effects on survival and toxicity in locally advanced non-small cell lung cancer (LANSCLC) chemoradiotherapy (CRT). We hypothesize that MT DML improves precision and robustness by leveraging shared information across endpoints.

Materials/Methods: A total of 663 LANSCLC patients received proton (n=260) or photon (n=403) RT with concurrent chemotherapy (CTX; carboplatin/paclitaxel (CP), n=426; others, n=237), including 179 with consolidation immunotherapy (CIO). The cohort was split into training (n=464) and testing (n=199) sets. We extracted 1,200 clinical, geometric, DVH, immuno-hematological DVH, and absolute lymphocyte/neutrophil count (ALC/ANC) kinetic features. MT LASSO with five-fold cross-validation (5FCV) selected features for jointly classifying five survival outcomes, dichotomized at two years, and three toxicities. Survival outcomes included overall survival (OS), death without progression (DWP), progression-free survival (PFS), and locoregional/distant failure-free survival (LFFS/DFFS), while toxicities included lymphopenia (LP), unplanned hospitalization (UH), and radiation pneumonitis (RP). MT gradient boosting machine (GBM) models predicted all outcomes and were validated via 5FCV ROC AUC and ensemble test predictions. For counterfactual estimation, explainable boosting machines provided propensity scores, while MT GBM predicted outcomes for each treatment group. The average treatment effect was derived using doubly robust pseudo-outcomes, which combine the predicted outcome difference with an inverse-probability-weighted residual correction.
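
The doubly robust step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact estimator, so the standard augmented inverse-probability-weighted (AIPW) form is assumed here. The function names, the propensity clipping threshold, and the normal-approximation p-value are hypothetical choices made for illustration, and cross-fitting is omitted. Outcomes are assumed to be coded 1 = event, so a negative effect means lower risk in the treated group.

```python
import math
from statistics import NormalDist, mean, stdev


def dr_pseudo_outcomes(y, t, e, mu1, mu0, clip=0.01):
    """Doubly robust (AIPW) pseudo-outcome for each patient.

    y   : observed binary outcome (1 = event occurred)
    t   : treatment indicator (1 = treated, 0 = comparator)
    e   : estimated propensity score P(T=1 | X)
    mu1 : predicted outcome probability if treated
    mu0 : predicted outcome probability if not treated
    clip: hypothetical bound that keeps propensities away from 0 and 1
    """
    psi = []
    for yi, ti, ei, m1, m0 in zip(y, t, e, mu1, mu0):
        ei = min(max(ei, clip), 1.0 - clip)  # avoid extreme weights
        psi.append(
            (m1 - m0)                             # outcome-model difference
            + ti * (yi - m1) / ei                 # weighted residual, treated
            - (1 - ti) * (yi - m0) / (1.0 - ei)   # weighted residual, comparator
        )
    return psi


def average_treatment_effect(psi):
    """Mean pseudo-outcome, its standard error, and a two-sided p-value.

    The p-value uses a normal approximation, which is an assumption of
    this sketch rather than something stated in the abstract.
    """
    ate = mean(psi)
    se = stdev(psi) / math.sqrt(len(psi))
    z = ate / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return ate, se, p


if __name__ == "__main__":
    # Toy inputs only; in practice e, mu1, and mu0 come from fitted models.
    y = [1, 0, 1, 1]
    t = [1, 1, 0, 0]
    e = [0.5, 0.5, 0.5, 0.5]
    mu1 = [0.4, 0.4, 0.6, 0.6]
    mu0 = [0.5, 0.5, 0.7, 0.7]
    ate, se, p = average_treatment_effect(dr_pseudo_outcomes(y, t, e, mu1, mu0))
    print(f"ATE={ate:.3f}, SE={se:.3f}, p={p:.3f}")
```

The estimate remains consistent if either the propensity model or the outcome model is correctly specified, which is the motivation for combining the two terms.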

Results: MT LASSO identified 11 key features, with MT GBM ranking max ANC:ALC ratio, GTV, and αT (tumor α from the LQ model) as the top three. Risk increased in all outcomes when max ANC:ALC ratio >18 or αT <0.15, and in all outcomes except RP when GTV >79 cc. On the test set, MT GBM achieved AUCs of 0.73 (OS), 0.76 (DWP), 0.63 (PFS), 0.64 (LFFS), 0.59 (DFFS), 0.72 (LP), 0.72 (UH), and 0.67 (RP). Proton RT was associated with longer OS (5FCV: -12.7%, p=0.475; test: -13.3%, p=0.502), fewer UH (-28.4%, p=0.466; -26.7%, p=0.482), more LP (10.4%, p=0.468; 11.4%, p=0.493), shorter PFS (5.8%, p=0.497; 5.5%, p=0.485), and lower RP (-6.6%, p=0.475; -5.6%, p=0.515). CP was linked to longer OS (-4.4%, p=0.501; -4.6%, p=0.576), fewer UH (-15.1%, p=0.476; -16.5%, p=0.493), shorter PFS (0.8%, p=0.785; 3.5%, p=0.532), and lower RP (-0.7%, p=0.851; -7.1%, p=0.485). CIO showed inconsistent OS effects (-2.6%, p=0.875; 12.7%, p=0.646), fewer UH (-26.1%, p=0.521; -23.6%, p=0.515), a lower RP trend (-5.6%, p=0.650; -9.6%, p=0.624), and longer PFS (-14.4%, p=0.516; -5.7%, p=0.579).

Conclusion: MT DML enables joint counterfactual survival and toxicity estimation in LANSCLC, revealing distinct treatment effects across endpoints. Its ability to handle complex datasets makes it a valuable tool for clinical decision-making, with potential applications in other disease sites.