Small cell lung cancer (SCLC) is one of the most aggressive subtypes of lung cancer, accounting for roughly 15 to 20% of all lung cancer cases. It is characterized by rapid tumor growth, early metastasis, frequent recurrence, and strong drug resistance. Despite advances in therapy, the five-year survival rate for SCLC remains below 10%, making improved prognostic tools and personalized treatment strategies an urgent clinical priority.
Current limitations: Existing clinical and pathological features used for prognosis and treatment planning in SCLC fall short, particularly in predicting individual patient responses and survival outcomes. Molecular approaches such as transcriptionally defined subtypes and tumor microenvironment profiling have improved our understanding of SCLC heterogeneity, but their clinical application is hampered by sample quantity and quality requirements, cross-platform reproducibility issues, and high cost.
The H&E opportunity: Hematoxylin and Eosin (H&E) staining is a universally available technique in pathology labs that produces high-resolution images capturing essential morphological features of tumor tissue. However, manual microscopic examination of H&E slides is labor-intensive and heavily dependent on pathologist expertise. Recent deep learning advances in computational pathology have enabled automated cancer detection, morphologic phenotype quantification, and patient survival stratification from H&E slides across several cancer types, but this potential has remained largely unexplored in SCLC.
This study proposes an unsupervised deep learning framework with contrastive clustering (DL-CC) to extract and analyze histomorphological features from H&E-stained images and develop a pathomics signature called PathoSig. The authors validated PathoSig across multicenter retrospective datasets to assess its robustness in predicting prognosis and evaluating chemoradiotherapy benefit in SCLC patients.
The study enrolled 380 surgically resected and pathologically confirmed SCLC specimens from two independent Chinese medical centers. The CHCAMS cohort (Cancer Hospital, Chinese Academy of Medical Sciences) included 286 patients collected from January 2005 to December 2016, comprising 240 pure SCLC (P-SCLC) cases and 46 combined SCLC (C-SCLC) cases. C-SCLC included combinations with squamous cell carcinoma (41.3%), adenocarcinoma (39.1%), large cell carcinoma (8.7%), and other subtypes. The PUCH cohort (Peking University Cancer Hospital) contributed 94 P-SCLC patients from January 2010 to April 2023.
Demographics and staging: Male predominance was consistent across cohorts: 70.0% in CHCAMS P-SCLC, 76.1% in CHCAMS C-SCLC, and 71.3% in PUCH. Median ages were 56.5, 60, and 59.5 years respectively. In the CHCAMS P-SCLC cohort, 58.7% were stage I-II and 41.3% were stage III-IV. In PUCH, 76.6% were stage I-II. Lymphatic metastasis was observed in 57.1%, 65.2%, and 39.4% of patients across the three groups. Recurrence rates were 49.2%, 50.0%, and 69.2%, and death event rates were 38.3%, 39.1%, and 62.8%.
Cohort design: The 286 CHCAMS cases were divided into a discovery cohort (n = 196), an internal validation cohort-1 for P-SCLC (n = 44), and an internal validation cohort-2 for C-SCLC (n = 46). All 94 PUCH patients served as the external independent validation cohort-3. The discovery cohort was further split 4:1 into training (n = 157) and testing (n = 39) datasets. Median follow-up durations were 4.00, 4.69, and 3.33 years for the three groups.
Image preprocessing: Pathologists extracted central tumor regions from whole slide images (WSIs) at 20x magnification, creating tissue microarrays (TMAs) of 1.5 mm diameter and 6 mm depth (1 to 4 cores per case). Each TMA was segmented into non-overlapping 224 x 224 pixel tiles using a watershed algorithm. Otsu thresholding removed tiles with less than 60% tissue coverage. A total of 73,199 tiles from the training dataset and 223,002 tiles across both cohorts were generated. Six random augmentation techniques (flipping, rotation, contrast adjustment, scaling, HSV adjustment, and noise addition) were applied to diversify the training data.
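The tiling and tissue-filtering step can be sketched as follows. This is a minimal illustration and not the authors' pipeline: it assumes a grayscale image stored as nested lists, treats pixels at or below the Otsu threshold as tissue (dark stain on a bright background), and omits the watershed step entirely. The names `otsu_threshold` and `tissue_tiles` are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale, 0 (dark) to 255 (bright background)


def otsu_threshold(image: Image) -> int:
    """Return the grey level that maximizes the between-class variance."""
    hist = [0] * 256
    for row in image:
        for px in row:
            hist[px] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # pixels at or below the candidate threshold
    sum0 = 0  # sum of grey levels of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def tissue_tiles(image: Image, tile_size: int = 224,
                 min_tissue: float = 0.60) -> List[Tuple[int, int]]:
    """Return (top, left) corners of non-overlapping tiles with enough tissue."""
    threshold = otsu_threshold(image)
    height, width = len(image), len(image[0])
    kept = []
    for top in range(0, height - tile_size + 1, tile_size):
        for left in range(0, width - tile_size + 1, tile_size):
            tissue = sum(
                1
                for r in range(top, top + tile_size)
                for c in range(left, left + tile_size)
                if image[r][c] <= threshold  # assumption: dark pixels are tissue
            )
            if tissue / (tile_size * tile_size) >= min_tissue:
                kept.append((top, left))
    return kept


# Toy 4 x 8 image: the left 4 x 4 tile is 75% dark, the right tile 25% dark.
demo = [[50, 50, 50, 240, 240, 240, 240, 50] for _ in range(4)]
print(tissue_tiles(demo, tile_size=4))
```

In practice a library routine for Otsu thresholding on real image arrays would replace the hand-written histogram loop; the pure-Python version is used here only to keep the example self-contained.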
DL-CC architecture: The Deep Learning with Contrastive Clustering (DL-CC) framework consists of two modules. The Non-Redundant Vector Extractor module passes two differently augmented views of each 224 x 224 tile through a pair of weight-sharing ResNet50 networks, mapping each view to a 2048-dimensional feature vector. It minimizes a combined loss with a diagonal term (controlling scaling and rotation boundaries) and an off-diagonal term (controlling orthogonality). The Clustered Instance-Level Contrastive Feature Mapping module adds two kinds of heads: instance-level contrastive heads, which maximize the similarity of positive pairs (views of the same tile) and minimize the similarity of negative pairs (views of different tiles), and cluster-level contrastive heads, which project features into a 50-dimensional latent space for clustering.
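The diagonal and off-diagonal terms resemble a cross-correlation objective of the redundancy-reduction (Barlow Twins) type. The sketch below assumes that form; the summary does not give the authors' exact loss. It assumes the diagonal term pulls the two views of a tile together, while the off-diagonal term penalizes correlated (redundant) feature dimensions. The function names and the weight `off_diag_weight=0.005` are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]  # rows = tiles in the batch, columns = feature dims


def standardize(batch: Matrix) -> Matrix:
    """Scale each feature column to zero mean and unit variance over the batch."""
    n, d = len(batch), len(batch[0])
    out = [[0.0] * d for _ in range(n)]
    for j in range(d):
        col = [row[j] for row in batch]
        mean = sum(col) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / n) or 1.0
        for i in range(n):
            out[i][j] = (col[i] - mean) / std
    return out


def redundancy_reduction_loss(view_a: Matrix, view_b: Matrix,
                              off_diag_weight: float = 0.005) -> float:
    """Assumed form: sum_i (1 - C_ii)^2 + weight * sum_{i != j} C_ij^2."""
    za, zb = standardize(view_a), standardize(view_b)
    n, d = len(za), len(za[0])
    diagonal, off_diagonal = 0.0, 0.0
    for i in range(d):
        for j in range(d):
            # Cross-correlation between feature i of view A and feature j of view B.
            c_ij = sum(za[k][i] * zb[k][j] for k in range(n)) / n
            if i == j:
                diagonal += (1.0 - c_ij) ** 2
            else:
                off_diagonal += c_ij ** 2
    return diagonal + off_diag_weight * off_diagonal


# Two toy batches of 4 tiles with 2 features each.
decorrelated = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
redundant = [[1.0, 1.0], [1.0, 1.0], [-1.0, -1.0], [-1.0, -1.0]]
print(redundancy_reduction_loss(decorrelated, decorrelated))
print(redundancy_reduction_loss(redundant, redundant))
```

Identical views with uncorrelated features incur no penalty, whereas duplicated feature dimensions are penalized through the off-diagonal term. This behavior is what a "non-redundant" extractor would be expected to encourage.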
Total loss optimization: The framework simultaneously optimizes three loss components: representation loss, instance-level contrastive loss, and cluster-level contrastive loss, balanced by hyperparameter alpha. Because training is self-supervised, it requires no manual labeling or delineation of target regions. This eliminates potential human bias and avoids the retraining that supervised or weakly supervised solutions would require.
Feature quantification: The framework identified 50 tile-level histomorphological phenotype clusters (HPCs). These were visualized using UMAP (Uniform Manifold Approximation and Projection) dimensionality reduction, which confirmed that greater distance between clusters in UMAP space corresponded to more significant morphological differences. For each slide, the proportion of tiles belonging to each HPC relative to the total number of tiles was calculated as the quantified histomorphological feature vector.
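The slide-level feature vector described above amounts to a normalized histogram of cluster assignments. A minimal sketch, assuming each tile has already been assigned an integer cluster ID and that IDs are zero-based (0 to 49), a convention the summary does not specify; `hpc_proportions` is a hypothetical name:

```python
from collections import Counter
from typing import List, Sequence


def hpc_proportions(tile_clusters: Sequence[int],
                    n_clusters: int = 50) -> List[float]:
    """Fraction of a slide's tiles that fall into each HPC."""
    if not tile_clusters:
        raise ValueError("slide has no tiles")
    counts = Counter(tile_clusters)
    total = len(tile_clusters)
    return [counts.get(k, 0) / total for k in range(n_clusters)]


# Toy slide with four tiles: two in HPC19, one in HPC39, one in HPC5.
features = hpc_proportions([19, 19, 39, 5])
print(features[19], features[39], features[5])
```

Each slide therefore yields a 50-element vector whose entries sum to 1, regardless of how many tiles the slide contributed.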
Univariate Cox regression: Among the 50 HPCs analyzed for association with overall survival (OS) in the training dataset, four showed statistically significant associations. HPC19 was associated with improved OS (HR = 0.720, 95% CI 0.562-0.921, p = 0.009), while HPC20 (HR = 1.169, 95% CI 1.012-1.349, p = 0.033), HPC21 (HR = 1.141, 95% CI 1.020-1.275, p = 0.021), and HPC39 (HR = 1.268, 95% CI 1.090-1.474, p = 0.002) were associated with poor OS.
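As a consistency check on these figures, the reported p values can be approximately recovered from each hazard ratio and its 95% CI. The sketch assumes the intervals are Wald-type, that is, symmetric on the log-hazard scale, which is the usual form of Cox regression output but is not stated in the summary. The function name `wald_p_from_ci` is hypothetical.

```python
import math


def wald_p_from_ci(hr: float, ci_low: float, ci_high: float,
                   z_crit: float = 1.959964) -> float:
    """Two-sided Wald p value implied by a hazard ratio and its 95% CI."""
    log_hr = math.log(hr)
    # Standard error of log(HR), recovered from the CI width on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_hr / se
    # Two-sided normal tail probability: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


# (HR, CI low, CI high, reported p) as given in the text.
reported = {
    "HPC19": (0.720, 0.562, 0.921, 0.009),
    "HPC20": (1.169, 1.012, 1.349, 0.033),
    "HPC21": (1.141, 1.020, 1.275, 0.021),
    "HPC39": (1.268, 1.090, 1.474, 0.002),
}
for name, (hr, lo, hi, p_reported) in reported.items():
    print(name, round(wald_p_from_ci(hr, lo, hi), 4), p_reported)
```

Under that assumption the recomputed values agree with the reported ones to within rounding, which suggests the four univariate results are internally consistent.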
Multivariate analysis: When all four prognostic HPCs were tested together in multivariate regression, only HPC19 (HR = 0.743, 95% CI 0.576-0.958, p = 0.022) and HPC39 (HR = 1.250, 95% CI 1.070-1.450, p = 0.005) retained independent predictive power for OS. HPC20 and HPC21 lost significance once the mutual effects were accounted for. PathoSig was then constructed as a linear combination: PathoSig = (0.2398 x HPC39) + (-0.3393 x HPC19).
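The signature itself is a two-term weighted sum, which a short function makes concrete. The coefficients are those quoted above. The scale of the HPC inputs (raw proportion, percentage, or a standardized value) is not specified in this summary, so inputs should be on whatever scale the authors used when fitting the model. The function name `pathosig` is hypothetical.

```python
# Coefficients as reported in the text.
COEF_HPC39 = 0.2398
COEF_HPC19 = -0.3393


def pathosig(hpc39: float, hpc19: float) -> float:
    """Slide-level PathoSig score: higher values indicate higher predicted risk."""
    return COEF_HPC39 * hpc39 + COEF_HPC19 * hpc19


# A slide richer in HPC39 scores higher than one richer in HPC19.
print(pathosig(hpc39=2.0, hpc19=1.0))
print(pathosig(hpc39=1.0, hpc19=2.0))
```

The opposite signs mirror the hazard ratios: more HPC39 raises the score, while more HPC19 lowers it.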
Risk stratification in the discovery cohort: Using an optimal risk score threshold determined by five-year ROC analysis in the testing dataset and a voting algorithm across multiple TMA slides per patient, the discovery cohort was stratified into high-, intermediate-, and low-risk groups with significantly different OS (three-way log-rank p = 0.030). The high-risk group had significantly poorer OS than the low-risk group (HR = 2.055, 95% CI 1.165-3.624, log-rank p = 0.011). Visualization confirmed that high-risk patient slides contained more HPC39 tiles and fewer HPC19 tiles compared to low-risk patient slides.
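The summary does not spell out the voting rule, so the following is only one plausible reading, labeled as an assumption: each slide casts a binary high/low vote against the threshold, a majority of high votes yields "high", a majority of low votes yields "low", and a tie yields "intermediate". The threshold value used in the demo and the name `patient_risk_group` are hypothetical.

```python
from typing import Sequence


def patient_risk_group(slide_scores: Sequence[float], threshold: float) -> str:
    """Aggregate slide-level scores into a patient-level risk group.

    Assumed rule (not confirmed by the source): majority vote over slides,
    with a tie mapped to the intermediate group.
    """
    if not slide_scores:
        raise ValueError("patient has no slides")
    high_votes = sum(1 for score in slide_scores if score > threshold)
    low_votes = len(slide_scores) - high_votes
    if high_votes > low_votes:
        return "high"
    if low_votes > high_votes:
        return "low"
    return "intermediate"


# Hypothetical threshold of 0.0 and made-up slide scores.
print(patient_risk_group([0.3, 0.2, -0.1], threshold=0.0))
print(patient_risk_group([0.3, -0.3], threshold=0.0))
```

Other rules, such as requiring unanimity for the high- and low-risk groups, would fit the description equally well. The point is only that binary slide votes can produce three patient-level groups.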
Internal validation: PathoSig was first tested on two internal independent cohorts that were not used during discovery or model training. In validation cohort-1 (P-SCLC, n = 44), the three-way log-rank p was 0.05, with high-risk patients showing significantly worse OS than low-risk (HR = 3.62, 95% CI 1.164-11.26, p = 0.026). In validation cohort-2 (C-SCLC, n = 46), the separation was even more pronounced (three-way log-rank p < 0.001), with high-risk versus low-risk HR = 9.478 (95% CI 2.531-35.492, p = 0.001).
External validation: In the external PUCH cohort (validation cohort-3, n = 94), PathoSig successfully distinguished patients into three risk groups with significantly different OS (three-way log-rank p = 0.038). High-risk patients had worse OS compared to low-risk patients (HR = 2.122, 95% CI 1.184-3.804, p = 0.012).
Multivariate independence: Critically, PathoSig remained an independent prognostic factor after adjusting for sex, age, smoking history, and AJCC/UICC stage across all three validation cohorts. The adjusted hazard ratios for the high-risk group were: validation-1 HR = 5.030 (95% CI 1.326-19.08, p = 0.018), validation-2 HR = 9.960 (95% CI 2.493-39.80, p = 0.001), and validation-3 HR = 2.484 (95% CI 1.336-4.615, p = 0.004). These results indicate that PathoSig provides prognostic information beyond what standard clinical variables already capture.
Postoperative chemoradiotherapy: Among patients who received chemoradiotherapy after surgery, those classified as high-risk by PathoSig had significantly shorter disease-free survival (DFS) than those in the low- and intermediate-risk groups in all four cohorts. Log-rank p values were 0.015 for the discovery cohort, 0.013 for validation-1, 0.043 for validation-2, and less than 0.001 for validation-3. Multivariate Cox analysis confirmed independent prognostic value of PathoSig for DFS (discovery: HR = 1.989, p = 0.019; validation-1: HR = 3.755, p = 0.022; validation-2: HR = 3.464, p = 0.041; validation-3: HR = 2.626, p = 0.001).
Recurrence rates: The high-risk group consistently displayed higher recurrence rates across all four cohorts: 73.1%, 75.0%, 90.0%, and 100.0%. By contrast, low-risk recurrence rates were 47.1%, 43.8%, 42.1%, and 50.0%, and intermediate-risk rates were 47.7%, 47.1%, 35.7%, and 45.5%. This pattern suggests that patients identified as high-risk by PathoSig are more likely to experience treatment resistance after postoperative chemoradiotherapy.
Preoperative chemoradiotherapy: For patients who received neoadjuvant chemoradiotherapy, the four cohorts were pooled because the individual sample sizes were small. The high-risk group had a lower five-year DFS rate than the low-risk group (37.5% versus 61.9%), though the difference was not statistically significant (log-rank p = 0.26), possibly because of the limited sample size. The high-risk group also had a higher recurrence rate (62.5%) than the intermediate-risk (35.7%) and low-risk (40.0%) groups.
Staging system refinement: PathoSig stratified patients within the same clinical stage into distinct risk groups with different outcomes. For early-stage (I/II) P-SCLC patients, high-risk versus low-risk groups showed significantly different OS and DFS (log-rank p < 0.001 for both). In late-stage (III/IV) P-SCLC patients, PathoSig also demonstrated significant prognostic value (OS: log-rank p = 0.025, DFS: log-rank p = 0.007). For C-SCLC, similar patterns held across both early- and late-stage subgroups. In the non-metastatic P-SCLC subgroup, high-risk PathoSig was associated with significantly shorter OS and DFS (log-rank p < 0.001 for both). In the metastatic P-SCLC subgroup, high-risk patients had significantly shorter DFS (p = 0.0065), while the difference in OS fell just short of conventional significance (p = 0.051).
Molecular subtype stratification: The authors examined PathoSig against the four consensus transcription factor-based molecular subtypes: SCLC-A (ASCL1), SCLC-N (NEUROD1), SCLC-P (POU2F3), and SCLC-Y (YAP1), measured by immunohistochemistry in 286 CHCAMS patients. Notably, 50.9% of SCLC-A subtype patients fell into the low-risk PathoSig group, while the high-risk group had the highest proportion of SCLC-N subtype patients (40.0%).
Beyond molecular subtypes: Survival analysis integrating molecular subtypes with PathoSig showed that patients within the same molecular subtype could be assigned to different risk groups with different outcomes. The separation was significant for OS in SCLC-A (log-rank p = 0.038), SCLC-N (p < 0.001), and SCLC-Y (p = 0.033), and for DFS in SCLC-N (p = 0.001). It did not reach the 0.05 level for DFS in SCLC-A (p = 0.095) or for OS in SCLC-P (p = 0.057). These findings suggest that PathoSig captures prognostic information that transcription factor-based molecular subtyping alone does not provide.
Retrospective design and sample source: The study relies entirely on retrospective data from surgically resected specimens. Extensive-stage SCLC, which constitutes the majority of clinical SCLC cases, is typically diagnosed through biopsies rather than surgical resection. This means PathoSig has not been validated on biopsy tissue, and its generalizability to the broader population of non-surgical SCLC patients remains uncertain. Further validation using biopsy samples is critical before clinical deployment.
Rigid risk stratification rules: The slide-level risk classification approach uses a binary threshold from the five-year ROC analysis, which is then aggregated via a simple voting algorithm at the patient level. The authors acknowledge this approach is too rigid and lacks nuance, particularly given the intratumoral and intertumoral heterogeneity inherent in SCLC. More sophisticated models or continuous risk scoring strategies could better capture these complexities and improve predictive granularity.
Statistical power for preoperative therapy: The analysis of preoperative chemoradiotherapy response did not reach statistical significance (p = 0.26), likely due to the small number of patients who received neoadjuvant treatment across all cohorts. Pooling the cohorts was necessary but introduced additional heterogeneity. Larger prospective studies with adequate sample sizes for the neoadjuvant setting are needed.
Regulatory and implementation barriers: Translating the DL-CC framework and PathoSig into routine clinical practice requires medical licensing and regulatory approvals, which present significant hurdles. The source code is publicly available on GitHub, and the approach uses standard H&E-stained slides that are already part of routine pathology workflows, which lowers the technical barrier. However, prospective validation studies and integration into digital pathology platforms will be necessary before PathoSig can be used for clinical decision-making.