Main Session
Sep 30
PQA 07 - Genitourinary Cancer, Patient Safety, Nursing/Supportive Care
3200 - The Impact of Neck Tilt on Deep Learning Autosegmentation Accuracy for Head and Neck Radiotherapy
Presenter(s)
Jamison Brooks, PhD - Mayo Clinic Rochester, Rochester, MN
J. L. Brooks1, W. S. Harmsen2, D. M. Routman1, E. J. Tryggestad1, and D. J. Moseley1; 1Department of Radiation Oncology, Mayo Clinic, Rochester, MN, 2Department of Biostatistics and Health Sciences Research, Mayo Clinic, Rochester, MN
Purpose/Objective(s):
Deep learning autosegmentation (DLAS) models are widely used for CT analysis, but their accuracy in cases of altered head and neck (HN) anatomy or abnormal patient positioning, such as kyphosis-related neck tilt, remains unclear. This study quantifies anterior-posterior neck tilt using cervical spine autocontouring and evaluates its impact on DLAS accuracy for common organ at risk (OAR) contours for HN radiotherapy. We hypothesize that abnormal neck tilt negatively affects DLAS segmentation performance for OAR contours.
Materials/Methods:
CT scans from 35 head and neck cancer patients treated at two institutions were retrospectively analyzed. Images were acquired using scanners from a technology company with voxel dimensions of 1.27 mm × 1.27 mm × 2 mm at 120 kVp. Manually delineated, highly curated organ-at-risk (OAR) contours served as the gold standard for comparison against contours generated by seven FDA-approved deep learning auto-segmentation (DLAS) models. Neck tilt was quantified by segmenting cervical spine vertebrae using a publicly available autosegmentation tool (TotalSegmentator). A linear best-fit line was calculated along the C1–C4 spinal cord contour in the sagittal plane, and the angle between this line and the longitudinal (superior-inferior) axis defined the neck tilt. Tilt values between -5° and 20° were classified as normal; a histogram of tilt values was used to categorize patients into normal (n=27) and abnormal (n=8) tilt groups. Contour accuracy for the brainstem, parotid glands, submandibular glands, optic nerves, and brachial plexuses was evaluated using Mean Distance to Agreement (MDA) relative to the gold standard. A two-tailed Wilcoxon rank-sum test was used to assess statistical significance.
Results:
Among the evaluated structures, only the parotid glands showed consistent differences in MDA across all DLAS models. Patients with abnormal neck tilt had significantly lower parotid segmentation accuracy than those with normal positioning (p < 0.05 for at least one parotid gland across all vendors). The degree of performance degradation varied among DLAS models, with some more affected by tilt-related inaccuracies than others.
Conclusion:
These findings highlight the importance of considering patient positioning when evaluating DLAS performance and suggest a need for improved robustness in segmenting structures susceptible to deformation.
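The tilt measurement and accuracy metric described in Materials/Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the (anterior-posterior, superior-inferior) point layout, the sign convention of the angle, the inclusive treatment of the -5° and 20° bounds, and the symmetric form of MDA are all assumptions made for illustration.

```python
import math


def neck_tilt_degrees(points):
    """Angle (degrees) between a least-squares line and the superior-inferior axis.

    points: list of (ap_mm, si_mm) positions sampled along the C1-C4 contour
    in the sagittal plane (hypothetical input layout). The line is fitted as
    AP position versus SI position, so a perfectly vertical spine gives 0.
    The sign of the result depends on the axis orientation (an assumption).
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    mean_ap = sum(ap for ap, _ in points) / n
    mean_si = sum(si for _, si in points) / n
    var_si = sum((si - mean_si) ** 2 for _, si in points)
    if var_si == 0:
        raise ValueError("points must span more than one SI position")
    cov = sum((si - mean_si) * (ap - mean_ap) for ap, si in points)
    slope = cov / var_si  # mm of AP shift per mm along the SI axis
    return math.degrees(math.atan(slope))


def is_normal_tilt(angle_deg, low=-5.0, high=20.0):
    """Classify a tilt angle using the abstract's -5 to 20 degree range.

    Treating both bounds as inclusive is an assumption.
    """
    return low <= angle_deg <= high


def mean_distance_to_agreement(surface_a, surface_b):
    """Symmetric mean nearest-neighbour distance between two surface point sets.

    Definitions of MDA vary between tools; this brute-force symmetric average
    is one common form and is assumed here.
    """
    def mean_min(src, dst):
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return 0.5 * (mean_min(surface_a, surface_b) + mean_min(surface_b, surface_a))
```

For example, points sampled along a line inclined 10° from the superior-inferior axis yield a tilt of about 10°, which falls in the normal group, while a 25° tilt would be classified as abnormal.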