Sep 29

SS 13 - DHI 1: The Digital Revolution in Radiation Oncology: AI Models for Enhanced Patient Care

179 - Automated Clinical Target Volume Contour Quality Assurance for the TROG 08.08 TOPGEAR Trial

08:20am - 08:30am PT

Room 20/21

Presenter(s)

Phillip Chlap, MS - Radformation, New York, NY

P. Chlap¹,², M. T. Lee²,³, T. Leong⁴, M. Field¹,², J. Dowling⁵, H. Min³,⁵, J. Chu⁴,⁶, J. Tan⁴, P. K. Tran⁴, T. Kron⁶,⁷, A. Haworth⁸, L. E. Court⁹, M. A. Ebert¹⁰,¹¹, S. Vinod¹,², and L. Holloway¹,²; ¹UNSW and Ingham Institute for Applied Medical Research, Sydney, Australia, ²Liverpool and Macarthur Cancer Therapy Centre, Sydney, Australia, ³Faculty of Medicine, South Western Sydney Clinical School, UNSW, Sydney, Australia, ⁴Peter MacCallum Cancer Centre, Melbourne, Australia, ⁵Australian e-Health Research Centre, CSIRO, Brisbane, Australia, ⁶Sir Peter MacCallum Department of Oncology, the University of Melbourne, Melbourne, Australia, ⁷Peter MacCallum Cancer Centre, Melbourne, VIC, Australia, ⁸School of Physics, University of Sydney, Sydney, Australia, ⁹Department of Radiation Physics, The University of Texas MD Anderson Cancer Center, Houston, TX, ¹⁰School of Physics and Astrophysics, University of Western Australia, Perth, Australia, ¹¹Radiation Oncology, Sir Charles Gairdner Hospital, Perth, Australia

Purpose/Objective(s): Quality Assurance (QA) is crucial in radiotherapy (RT) clinical trials to ensure protocol adherence, especially for target volumes, as violations can impact patient outcomes. However, manual QA is resource-intensive, which limits reviews to a subset of patients. The objective was to test an automated contour QA approach that flags potential violations for manual review. The approach was evaluated on the dataset of the TROG 08.08 TOPGEAR trial, which evaluated preoperative chemoradiotherapy alongside perioperative chemotherapy for resectable gastric cancer. TOPGEAR's Clinical Target Volume (CTV) is anatomically defined and complex to contour. Because the approach is designed for prospective trials, where little data is available at the outset, it used a small training set.

Materials/Methods: To evaluate our approach, 93 cases from the TOPGEAR dataset were selected: 10 for training, 33 for validation and parameter tuning, and 50 for holdout testing. Five radiation oncologists each contoured the CTV on the training cases, and a consensus workshop was held to ensure protocol adherence. The clinical CTV definitions, including both passing and violating cases, were used for validation and testing (Table 1).

A 3D nnUNet was trained on the STAPLE-combined volume of the 5 observers' contours. To improve CTV segmentation accuracy, an anatomical label map generated using TotalSegmentator, including the duodenum, pancreas, and stomach, was added as an input channel. A probabilistic UNet model was then trained to capture inter-observer variability, using the image and the nnUNet segmentation as inputs to predict the acceptable CTV range as an uncertainty band.

The models were applied to the validation and test sets. Two metrics, contour fit and distance-to-band, were evaluated on the validation set to compare the clinical CTV to the uncertainty band. Each metric was assessed for detecting under-contouring, over-contouring, and both combined. The metric with the highest AUC-ROC was selected, and its threshold was set to detect at least 90% of violations. The selected metric and threshold were then applied to the test set for final evaluation.
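The threshold-selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy scores, and the convention that a larger metric value indicates a more likely violation are all assumptions made for illustration.

```python
import math

def auc_roc(scores, labels):
    """AUC as the probability that a randomly chosen violation (label 1)
    scores higher than a randomly chosen pass (label 0); ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def threshold_for_tpr(scores, labels, target_tpr=0.9):
    """Largest threshold t such that flagging every case with score >= t
    detects at least target_tpr of the violations."""
    pos = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    # Small epsilon guards against floating-point error in the product.
    needed = math.ceil(target_tpr * len(pos) - 1e-9)
    return pos[needed - 1]

def rates(scores, labels, threshold):
    """Return (TPR, FPR) when cases with score >= threshold are flagged."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return tp / n_pos, fp / n_neg

# Hypothetical validation data (NOT from the trial): metric value per CTV,
# with label 1 = violation and label 0 = pass according to manual QA.
scores = [4.0, 3.5, 2.0, 5.0, 1.0, 0.5, 1.5, 0.2, 3.7, 2.2]
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

auc = auc_roc(scores, labels)
threshold = threshold_for_tpr(scores, labels, target_tpr=0.8)
tpr, fpr = rates(scores, labels, threshold)
```

In this sketch the threshold is fixed on the validation data only; the same value would then be passed unchanged to `rates` for the test set, mirroring the holdout evaluation described in the abstract.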

Results: The best-performing metric for detecting CTV violations in the validation set was the distance-to-band for under-contouring, with an AUC of 0.84. A threshold set to achieve a true positive rate (TPR) of at least 0.9 gave a TPR of 0.92 (22/24) and a false positive rate (FPR) of 0.44 (14/32). When applied to the test set, the metric achieved an AUC of 0.88, with a TPR of 0.91 (31/34) and an FPR of 0.39 (19/49).

Conclusion: Our automated contour QA approach for TOPGEAR showed potential to identify over 90% of violating CTVs and to reduce the need for manual QA by correctly identifying over 50% of passing CTVs.

Abstract 179 - Table 1: Dataset breakdown with results of manual and automated QA

	Training	Validation	Testing
Cases	10	33	50
Total CTVs	50	56	83
Manual Trial QA
- Pass	50	32	49
- Violation	-	24	34
Automated QA
- Correct Pass (TN)	-	16	30
- Correct Violation (TP)	-	22	31
- Missed Pass (FP)	-	14	19
- Missed Violation (FN)	-	2	3
- Accuracy	-	0.70	0.73
- Sensitivity	-	0.92	0.91
- Specificity	-	0.53	0.61
- F-Score	-	0.73	0.74
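As a worked check of the summary metrics in Table 1, the sketch below recomputes them from the test-set confusion counts (TP 31, FP 19, TN 30, FN 3). The function name is hypothetical, and reading "F-Score" as the F1 score (harmonic mean of precision and sensitivity) is an assumption, since the table does not state which F-score was used.

```python
def qa_metrics(tp, fp, tn, fn):
    """Summary metrics for a binary pass/violation QA classifier,
    where a 'positive' is a CTV flagged as a violation."""
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)     # fraction of flagged CTVs that truly violate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Assumption: F-Score in the table means F1.
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "fpr": 1 - specificity,
        "f_score": f_score,
    }

# Test-set counts taken from Table 1.
test_metrics = qa_metrics(tp=31, fp=19, tn=30, fn=3)
for name, value in test_metrics.items():
    print(f"{name}: {value:.2f}")
```

Rounded to two decimals, these counts give a sensitivity of 0.91, specificity of 0.61, FPR of 0.39, accuracy of 0.73, and F1 of 0.74, in line with the Testing column.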