2762 - Head and Neck Cancer Radiotherapy Toxicity Prediction across Intra-Fractional Cone-Beam CTs Using Vision Transformers Networks
Presenter(s)
G. Hénique1, C. Bang III2, W. Le3, F. Nguyen4, E. J. Filion5, D. Soulieres6, B. O'Sullivan7, A. Christopoulos8, E. Bissada9, T. Ayad8, L. Guertin9, A. Lalonde10, D. Markel9, S. Kadoury11, and H. Bahig12; 1Centre Hospitalier de l'Université de Montréal (CHUM) Research Center, Montreal, QC, Canada, 2University of Montreal, Montreal, QC, Canada, 3CRCHUM (The University of Montreal Hospital Research Centre), Montreal, QC, Canada, 4Department of Radiation Oncology, Centre Hospitalier de l'Université de Montréal (CHUM), Montréal, QC, Canada, 5Department of Radiation Oncology, Centre Hospitalier de l'Université de Montréal, Montreal, QC, Canada, 6Centre Hospitalier de l'Université de Montréal (CHUM), Montreal, QC, Canada, 7Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada, 8Division of Otolaryngology-Head & Neck Surgery, Centre Hospitalier de l'Université de Montréal (CHUM), Montreal, QC, Canada, 9CHUM (The University of Montreal Hospital Centre), Montreal, QC, Canada, 10Universite de Montreal, Montreal, QC, Canada, 11Polytechnique Montreal, Montreal, QC, Canada, 12Department of Radiation Oncology, Centre Hospitalier de l'Université de Montréal (CHUM), Montreal, QC, Canada
Purpose/Objective(s): Radiotherapy (RT) is essential in head and neck cancer (HNC) treatment but often causes significant toxicity. Previous studies integrating clinical imaging with machine learning have shown promising results for predicting toxicity; however, none has investigated the spatial and temporal dynamics of anatomical changes across a sequence of cone-beam CTs (CBCTs). By analyzing the contribution of each scan and the cross-fraction interactions using Vision Transformers, we aim to improve the prediction of three major HNC RT toxicities: reactive feeding tube placement (RFT), hospitalization, and radionecrosis (RN).
Materials/Methods: In this study, radiological and clinical data from 1012 HNC patients treated with radical-intent RT between 2016 and 2022 at CHUM were retrospectively analyzed. Daily CBCTs were rigidly registered to the planning imaging, yielding a 3D+t sequence that follows the anatomy of the patient. A Swin4D vision transformer network was trained to encode the spatio-temporal information, which was integrated with pre-treatment radiological and clinical data in a multimodal fusion scheme. Each toxicity outcome was modeled as a binary classification task, trained with a cross-entropy loss to account for class imbalance and evaluated with 5-fold stratified cross-validation. The predictive performance of each component and of their combination was evaluated.
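The stratified splitting and imbalance handling described above can be illustrated with a minimal, standard-library sketch. It is not the study's implementation: the function names `stratified_folds` and `inverse_frequency_weights` are hypothetical, the inverse-frequency class weighting is only one common way to make a cross-entropy loss imbalance-aware (the abstract does not specify the weighting used), and the labels are synthetic, chosen to mirror the reported RFT incidence of 16.6%.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds=5, seed=0):
    """Assign sample indices to folds so each fold keeps roughly the same class ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin across the folds.
        for pos, idx in enumerate(indices):
            folds[pos % n_folds].append(idx)
    return folds


def inverse_frequency_weights(labels):
    """Per-class weights n / (n_classes * count): an assumed weighting, not the study's."""
    counts = defaultdict(int)
    for y in labels:
        counts[y] += 1
    n = len(labels)
    return {y: n / (len(counts) * c) for y, c in counts.items()}


# Synthetic labels only: 168 positives out of 1012 (about 16.6%).
labels = [1] * 168 + [0] * 844
folds = stratified_folds(labels)
weights = inverse_frequency_weights(labels)

for k, fold in enumerate(folds):
    positives = sum(labels[i] for i in fold)
    print(f"fold {k}: {len(fold)} samples, {positives} positives")
print("class weights:", weights)
```

Each fold serves once as the validation set while the remaining four are used for training; the class weights would then scale the per-sample loss so that the rarer toxicity class is not dominated by the majority class.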
Results: The cohort comprised 76% men and 24% women, with a median age of 64 years. Past and ongoing tobacco exposure were identified in 47% and 29% of participants, respectively. The primary tumor sites were distributed as follows: 54% oropharynx, 17% larynx, 11% oral cavity, 8% nasopharynx, 4% hypopharynx, 4% unknown primary, and 4% other sites; tumors were staged Tx-4b N0-3b M0-1. Of all patients, 46% had p16 tumors. Overall, 55% received concurrent chemotherapy, 8.75% underwent induction chemotherapy, and 21% received post-operative radiotherapy. The incidence of RFT, hospitalization, and RN was 16.6%, 4.1%, and 4.6%, respectively.
Combining the 3D+t information with the clinical data improved the AUC for toxicity prediction over the baseline: RFT (74.5% vs. 71.3%), hospitalization (69.7% vs. 65.9%), and RN (82.2% vs. 78.1%). Ablation analysis showed that sampling fractions in increments of 2 between the initial CBCT and the 16th (8 scans in total) yielded the best predictive performance early in treatment. Activation maps made it possible to identify the anatomical regions and fractions that most strongly affected the toxicity prediction for a given patient.
Conclusion: We propose, to our knowledge, the first model exploiting 3D+t imaging data obtained from intra-fractional CBCTs, combined with clinical data, for toxicity prediction in HNC patients. The results demonstrate the benefit of multimodal integration in anticipating patient toxicity during radiotherapy. Further studies will explore the interactions of selected organs at risk throughout treatment, as their characteristics could be key to improving toxicity assessment.
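The fraction sampling reported in the ablation analysis (increments of 2 up to the 16th fraction, 8 scans) can be sketched as follows. This is one plausible reading under stated assumptions: fractions are indexed from 1, the sequence starts at the first CBCT, and the helper names `select_fractions` and `subsample_sequence` are hypothetical rather than taken from the study.

```python
def select_fractions(first=1, last=16, step=2):
    """Return 1-based fraction indices from `first` up to `last` in steps of `step`."""
    return list(range(first, last + 1, step))


def subsample_sequence(cbct_sequence, fractions):
    """Pick the scans at the chosen 1-based fractions, skipping fractions not acquired."""
    return [cbct_sequence[f - 1] for f in fractions if f - 1 < len(cbct_sequence)]


fractions = select_fractions()

# Placeholder scan identifiers standing in for registered CBCT volumes.
sequence = [f"cbct_fx{n:02d}" for n in range(1, 34)]
selected = subsample_sequence(sequence, fractions)
print(fractions)
print(selected)
```

Under these assumptions the selection covers fractions 1 through 15, which gives the 8 scans mentioned in the Results; an alternative reading starting at fraction 2 would also give 8 scans.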