3573 - Integration of Dose, Imaging Biomarkers, and 3D Topological Information via Transformer-GNN for Multimodal Learning in Predicting Radiation-Induced Lung Injury in Elderly Esophageal Cancer Patients

02:30pm - 03:45pm PT

Hall F

Screen: 11

POSTER

Presenter(s)

Xin Yang, MS - Chongqing University Cancer Hospital, Shapingba, Chongqing

X. Yang¹, H. Zhang², B. Feng¹, B. Liang³, and F. Jin¹; ¹Radiation Physics Center, Chongqing University Cancer Hospital, Chongqing, China, ²Chongqing University Three Gorges Hospital, Chongqing, China, ³People's Hospital of Yilong County, Yilong, Sichuan, China

Purpose/Objective(s): This study aims to integrate three-dimensional dose distribution, clinical features, pretreatment CT imaging, and radiomics features to construct a Transformer-GNN (Transformer combined with Graph Neural Network) deep learning model. The goal is to evaluate its performance in predicting radiation-induced lung injury (RILI) in elderly esophageal cancer patients and to compare it with traditional Transformer models, thereby validating the advantages of multimodal data fusion.

Materials/Methods: A retrospective analysis was conducted on 190 elderly esophageal cancer patients (≥65 years) treated with radiotherapy between 2016 and 2023, including 93 RILI cases (RP ≥ Grade 2). Dosimetric parameters (e.g., MLD, lung V5, prescription dose) were extracted from the DVH, and clinical features (e.g., age, gender, smoking history, comorbidities, TNM stage) were recorded. From pretreatment CT images, 1526 radiomics features were extracted, and 94 key features were selected via cluster analysis. A graph adjacency matrix was constructed by fusing CT images and the 3D dose distribution to capture spatial relationships. A dual-branch Transformer-GNN model was proposed: the Transformer branch processes one-dimensional features (dosimetric, clinical, radiomics), while the GNN branch explores local anatomical correlations in the 3D data, and a feature fusion layer integrates the multimodal information. This model was compared with a traditional Transformer model that uses only the one-dimensional features. Performance was evaluated via 10-fold cross-validation, focusing on AUC, sensitivity, and specificity, to validate the multimodal fusion advantages of Transformer-GNN.
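
The abstract does not say how the graph adjacency matrix was built. The following is a minimal sketch of one possible construction, not the authors' method. It assumes the lung is divided into sub-regions (nodes), each with a centroid position, a normalized mean CT intensity, and a normalized mean dose. Two nodes are linked when they are spatially close, and the edge weight decays as their CT and dose values diverge. The name `build_adjacency`, the `radius` and `sigma` parameters, the node fields, and the toy values are all hypothetical.

```python
import math


def build_adjacency(nodes, radius=30.0, sigma=1.0):
    """Build a symmetric weighted adjacency matrix (illustrative only).

    nodes: list of dicts with hypothetical fields
        'pos'  -> (x, y, z) centroid in mm
        'hu'   -> normalized mean CT intensity of the sub-region
        'dose' -> normalized mean dose of the sub-region
    radius: maximum centroid distance (mm) for two nodes to be connected
    sigma: width of the Gaussian kernel applied to the CT/dose difference
    """
    n = len(nodes)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist = math.dist(nodes[i]["pos"], nodes[j]["pos"])
            if dist > radius:
                continue  # only spatial neighbours are connected
            # Combined difference in CT intensity and dose between the nodes.
            gap = math.hypot(
                nodes[i]["hu"] - nodes[j]["hu"],
                nodes[i]["dose"] - nodes[j]["dose"],
            )
            weight = math.exp(-(gap ** 2) / (2 * sigma ** 2))
            adj[i][j] = adj[j][i] = weight
    return adj


# Toy nodes invented for illustration; they are not study data.
toy_nodes = [
    {"pos": (0.0, 0.0, 0.0), "hu": 0.2, "dose": 0.3},
    {"pos": (10.0, 0.0, 0.0), "hu": 0.2, "dose": 0.3},   # close, same features
    {"pos": (100.0, 0.0, 0.0), "hu": 0.9, "dose": 0.1},  # too far to connect
    {"pos": (0.0, 10.0, 0.0), "hu": 0.2, "dose": 0.8},   # close, different dose
]
adj = build_adjacency(toy_nodes)
```

In this toy case, nodes 0 and 1 receive the maximum weight because they are adjacent and identical in CT intensity and dose. Node 2 stays disconnected, and the edge between nodes 0 and 3 is down-weighted by the dose difference. A matrix of this kind could then serve as the edge input to a GNN branch.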

Results: Cluster analysis identified 94 radiomics features significantly associated with RILI. Using only the one-dimensional dosimetric, clinical, and radiomics features, the traditional Transformer model achieved an AUC of 0.802 (95% CI: 0.735–0.851). In contrast, the dual-branch Transformer-GNN model demonstrated superior performance, achieving an overall prediction AUC of 0.869 (95% CI: 0.824–0.923), with a sensitivity of 0.851 (95% CI: 0.731–0.941) and a specificity of 0.832 (95% CI: 0.752–0.913). Furthermore, in the generalization assessment, the model achieved an average AUC of 0.857 across 10-fold cross-validation, indicating more stable performance than the traditional model (average AUC = 0.799).
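
The reported metrics can be illustrated on toy data. The sketch below is not the authors' evaluation code. It assumes binary labels (1 = RILI, 0 = no RILI), predicted risk scores, and at least one case in each class. AUC is computed as the probability that a randomly chosen RILI case scores higher than a non-RILI case (the Mann-Whitney formulation), and sensitivity and specificity are taken at a hypothetical decision threshold of 0.5. The function names, the threshold, and the labels and scores are invented for illustration.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold=0.5):
    """Sensitivity and specificity when scores >= threshold predict RILI."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Toy predictions invented for illustration; they are not study data.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]

auc = roc_auc(toy_labels, toy_scores)
sens, spec = sensitivity_specificity(toy_labels, toy_scores)
```

In a 10-fold setting, the same functions would be applied to each held-out fold and the per-fold AUC values averaged. The spread of those values across folds, rather than the mean alone, is what indicates how stable a model is.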

Conclusion: The proposed Transformer-GNN model significantly improves the accuracy and robustness of RILI prediction in elderly esophageal cancer patients by integrating one-dimensional features and three-dimensional graph structural data. The GNN module effectively captures spatial correlations in lung anatomy, overcoming the limitations of traditional Transformer models in modeling local structures. Future research could enhance performance by incorporating delta radiomics data, offering decision support for personalized radiotherapy in clinical practice.