2523 - A Radiomics Method Based on Data Augmentation for Preoperative Prediction of Immature and Mature Teratoma
Presenter(s)
J. Yang1,2, H. Zhou3, Z. Sun1,2, Y. Yan1,2, F. Zhao1,2, L. Wu1,2, Z. Lu1,2, and S. Yan1,2; 1Department of Radiation Oncology, the First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China, 2Cancer Center, Zhejiang University, Hangzhou, Zhejiang, China, 3Department of Radiology, Children's Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
Purpose/Objective(s): To develop and validate a radiomics-based method with data augmentation for distinguishing immature from mature teratoma using limited preoperative computed tomography (CT) data.
Materials/Methods: Preoperative CT data of 448 teratoma patients from May 2008 to September 2020 were collected. The data were divided chronologically, by examination time, into a training cohort (318 patients) and a validation cohort (130 patients). Tumor regions were manually delineated as regions of interest (ROIs). For data augmentation, the inner-cutout technique embedded within a deep learning model (Models Genesis) was applied to the original ROIs in the training cohort, expanding their number tenfold and generating derivative ROIs. All ROIs were resampled to a uniform spatial resolution, and 1652 radiomics features were then extracted from these ROIs. Features were selected using Pearson correlation analysis (threshold 0.85) and the minimum redundancy maximum relevance (mRMR) algorithm, and solid features (the count of solid-component voxels and their proportion within the ROI) were then integrated with the selected radiomics features. Finally, a prediction model was constructed using a logistic regression classifier, and its performance was evaluated in the validation cohort.
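As an illustration of the augmentation step, the following is a minimal sketch of an inner-cutout style transform: a few random blocks inside the ROI are overwritten with noise while the rest of the volume is left unchanged. It is not the Models Genesis implementation. The function names `inner_cutout` and `augment`, the number of blocks, the block-size limit, and the uniform noise range are all assumptions made for this example.

```python
import numpy as np

def inner_cutout(roi, n_windows=3, max_frac=0.25, rng=None):
    """Return a copy of `roi` with a few random inner blocks replaced by noise.

    n_windows and max_frac are hypothetical settings, not values from the abstract.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = roi.astype(np.float32, copy=True)
    lo, hi = float(roi.min()), float(roi.max())
    for _ in range(n_windows):
        # Block size per axis: from 1 voxel up to max_frac of that axis length.
        size = [int(rng.integers(1, max(2, int(s * max_frac) + 1)))
                for s in roi.shape]
        # Random start index chosen so the block stays inside the volume.
        start = [int(rng.integers(0, s - b + 1))
                 for s, b in zip(roi.shape, size)]
        block = tuple(slice(a, a + b) for a, b in zip(start, size))
        # Overwrite the block with uniform noise in the ROI's intensity range.
        out[block] = rng.uniform(lo, hi, size=size)
    return out

def augment(roi, n_copies=10, seed=0):
    """Generate n_copies derivative ROIs from one original ROI."""
    rng = np.random.default_rng(seed)
    return [inner_cutout(roi, rng=rng) for _ in range(n_copies)]

# Synthetic 16x16x16 volume standing in for a cropped tumor ROI.
roi = np.random.default_rng(1).normal(size=(16, 16, 16))
derivatives = augment(roi)
```

With ten derivatives per original, 318 original ROIs would yield 318 + 3180 = 3498 ROIs in total.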
Results: The training cohort contained 53 immature teratoma patients and 265 mature teratoma patients, while the validation cohort included 26 immature teratoma patients and 104 mature teratoma patients. A total of 3498 ROIs were obtained in the training cohort. After feature extraction and feature selection, 5 radiomics features were retained, and 2 solid features were integrated. The proposed model achieved area under the receiver operating characteristic curve (AUC) values of 0.835 and 0.797 in the training cohort and the validation cohort, respectively. In contrast, the model constructed using only the original ROIs (without data augmentation) achieved AUC values of 0.817 and 0.774 in the training cohort and the validation cohort, respectively. Moreover, the proposed model with logistic regression outperformed models built with other machine learning classifiers.
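The selection and evaluation steps can be sketched on synthetic data as follows. This is a hedged illustration only: it applies a greedy Pearson correlation filter at the stated 0.85 threshold, fits a plain gradient-descent logistic regression as a stand-in for whatever solver the authors used, and computes AUC by pairwise comparison. The mRMR step and the solid features are omitted. The helper names (`drop_correlated`, `fit_logistic`, `auc`), the learning rate, the iteration count, and the synthetic features and labels are all assumptions.

```python
import numpy as np

def drop_correlated(X, threshold=0.85):
    """Greedily keep columns whose |Pearson r| with every kept column <= threshold."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    kept = []
    for j in range(X.shape[1]):
        if all(corr[j, k] <= threshold for k in kept):
            kept.append(j)
    return kept

def fit_logistic(X, y, lr=0.1, n_iter=2000):
    """Plain gradient-descent logistic regression; returns (weights, bias)."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * (X.T @ (p - y)) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

def auc(y_true, score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos, neg = score[y_true == 1], score[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    return float(np.mean(diff > 0) + 0.5 * np.mean(diff == 0))

# Synthetic stand-in data; cohort sizes follow the abstract, all values are random.
rng = np.random.default_rng(0)
n_train, n_val, n_feat = 318, 130, 20
n = n_train + n_val
X = rng.normal(size=(n, n_feat))
X[:, 1] = X[:, 0] + 0.01 * rng.normal(size=n)        # near-duplicate of feature 0
y = (X[:, 0] + rng.normal(size=n) > 1.0).astype(int)  # labels driven by feature 0

X_tr, y_tr = X[:n_train], y[:n_train]
X_va, y_va = X[n_train:], y[n_train:]

kept = drop_correlated(X_tr)             # selection uses the training cohort only
w, b = fit_logistic(X_tr[:, kept], y_tr)
auc_train = auc(y_tr, X_tr[:, kept] @ w + b)
auc_val = auc(y_va, X_va[:, kept] @ w + b)
```

Fitting the filter and the classifier on the training cohort only, then scoring the validation cohort with the same retained columns, mirrors the train/validation split described in the abstract.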
Conclusion: The proposed method demonstrated satisfactory performance in distinguishing immature from mature teratoma, outperforming the method without data augmentation. Our method is therefore expected to help overcome the problem of limited teratoma data in clinical practice. It has the potential to guide the non-invasive preoperative prediction of the histological type of teratoma.