Main Session
Sep 30
PQA 09 - Hematologic Malignancies, Health Services Research, Digital Health Innovation and Informatics

3748 - Integrating Histopathology and Spatial Transcriptomics for Tumor Microenvironment Analysis and Personalized Radiotherapy

04:00pm - 05:00pm PT
Hall F
Screen: 13
POSTER

Presenter(s)

Lei Xing, PhD, FASTRO - Stanford University, Stanford, CA

X. Xing1, and L. Xing2,3; 1Stanford University, Palo Alto, CA, 2Stanford University, Stanford, CA, 3Department of Radiation Oncology, Stanford University School of Medicine, Stanford, CA

Purpose/Objective(s): The tumor microenvironment (TME) plays a crucial role in radiotherapy response and treatment planning by influencing tumor progression and therapeutic resistance. We aim to develop a robust AI model that integrates morphological features from histopathology slides (HE) and molecular insights from spatial transcriptomics (ST) to identify spatially resolved biomarkers that could inform personalized radiotherapy strategies and improve patient outcomes.

Materials/Methods: We introduce a multi-modal learning framework that aligns two large-scale foundation models: CONCH (trained on 1.17 million pathology slides) and scGPT (trained on 33 million single-cell sequencing profiles). Instead of training a multi-modal model from scratch, we leverage these powerful single-modality models and align their feature spaces to enable cross-modal learning and mutual knowledge transfer. To achieve this, we propose a relation-informed contrastive alignment technique. We extract pathology patch features using CONCH and ST spot features using scGPT. To establish meaningful cross-modal correspondences, we measure similarity based on spatial proximity, pathological features, and genomic features. Each patch is paired with its top-k most similar spots as positive pairs, which are pulled closer in the shared feature space, while negative pairs are pushed apart. This contrastive fine-tuning process aligns the two foundation models, improving multi-modal representation learning, enhancing robustness to missing modalities, and enabling downstream tasks like cancer diagnosis and biomarker discovery.
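The relation-informed contrastive alignment above can be sketched as follows. This is a minimal illustration, not the authors' implementation: random vectors stand in for CONCH patch embeddings and scGPT spot embeddings, and the feature dimension, similarity weights, top-k value, and temperature are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def l2norm(x):
    """Row-normalize so dot products act as cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

# Toy stand-ins for CONCH patch features and scGPT spot features
# (dimensions are assumptions; real features come from the encoders).
n_patches, n_spots, d = 8, 12, 32
patch_feats = l2norm(rng.standard_normal((n_patches, d)))
spot_feats = l2norm(rng.standard_normal((n_spots, d)))

# Relation score combining spatial proximity and feature similarity;
# the 0.5/0.5 weighting is an assumption, not from the abstract.
patch_xy = rng.random((n_patches, 2))
spot_xy = rng.random((n_spots, 2))
spatial_sim = -np.linalg.norm(patch_xy[:, None] - spot_xy[None], axis=2)
feature_sim = patch_feats @ spot_feats.T
relation = 0.5 * spatial_sim + 0.5 * feature_sim

# Each patch's top-k most related spots become its positive pairs.
k, tau = 3, 0.07
pos_idx = np.argsort(-relation, axis=1)[:, :k]          # (n_patches, k)

# InfoNCE-style loss: pull positives together, push other spots apart.
logits = (patch_feats @ spot_feats.T) / tau             # (n_patches, n_spots)
logits -= logits.max(axis=1, keepdims=True)             # numerical stability
log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
loss = -np.take_along_axis(log_prob, pos_idx, axis=1).mean()
print(float(loss))
```

Minimizing this loss over both encoders (or adapter layers on top of them) is what pulls the two feature spaces into alignment.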

Results: We trained our multi-modal network on five Xenium invasive breast cancer cases with paired HE and ST data and evaluated it on predicting ER, PR, and HER2 expression from 1,058 HE images in the BCNB dataset. Using histopathology images alone at inference, we compared three variants: the pre-trained CONCH, CONCH-FT (fine-tuned by aligning with ST features extracted by an MLP model), and CONCH-scGPT-FT (mutually fine-tuned with the scGPT model). As shown in Table 1, CONCH-scGPT-FT achieved the best overall performance in terms of AUC and balanced accuracy (BACC).
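For reference, the two reported metrics can be computed from their standard definitions as below; the labels and scores here are toy stand-ins, not BCNB predictions.

```python
import numpy as np

def auc_score(y_true, y_score):
    """AUC via the Mann-Whitney rank-sum formulation (assumes no tied scores)."""
    order = np.argsort(y_score)
    ranks = np.empty(len(y_score))
    ranks[order] = np.arange(1, len(y_score) + 1)
    pos = y_true == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity (recall on positives) and specificity."""
    sens = (y_pred[y_true == 1] == 1).mean()
    spec = (y_pred[y_true == 0] == 0).mean()
    return (sens + spec) / 2

# Toy example: a perfect ranking yields AUC = 1.0 and BACC = 1.0.
y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.2, 0.8, 0.9])
print(auc_score(y_true, y_score))                                # -> 1.0
print(balanced_accuracy(y_true, (y_score >= 0.5).astype(int)))   # -> 1.0
```

BACC is preferred over plain accuracy here because receptor-status labels (e.g. HER2+) are typically imbalanced.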

Conclusion: Integrating HE and ST enables more accurate prediction of molecular biomarkers such as ER, PR, and HER2, providing valuable insights for personalized radiotherapy. Our proposed alignment strategy effectively combines two single-modality foundation models, facilitating cross-modal knowledge transfer and mutual enhancement between HE and ST representations. The superior performance of CONCH-scGPT-FT demonstrates that leveraging multi-modal learning can improve the predictive power of each model, ultimately supporting more precise and individualized radiotherapy strategies.

Abstract 3748 - Table 1

Method            ER AUC   ER BACC   PR AUC   PR BACC   HER2 AUC   HER2 BACC
CONCH             0.881    0.745     0.810    0.698     0.715      0.624
CONCH-FT          0.884    0.752     0.818    0.714     0.724      0.615
CONCH-scGPT-FT    0.890    0.762     0.824    0.725     0.716      0.628