Hepatocellular carcinoma (HCC) represents over 80% of primary liver malignancies and ranks among the top three causes of cancer mortality in 46 countries. Global estimates for 2022 reported 19.96 million new cancer cases and 9.74 million cancer-related deaths worldwide, with HCC carrying a disproportionate burden across diverse populations. Despite decades of advances in screening, imaging, and multimodal treatment, HCC prognosis remains poor due to persistent gaps in early risk prediction and marked variability in treatment responses.
The problem with traditional pathology: Histopathological evaluation remains the gold standard for liver cancer diagnosis, but it suffers from three critical limitations: reliance on subjective interpretation, time-intensive analytical processes, and substantial interobserver variability. These bottlenecks directly hinder the advancement of precision medicine in HCC management and create an urgent need for computational solutions that can standardize and accelerate diagnostic workflows.
Enter pathomics: Pathomics is an emerging discipline that integrates artificial intelligence (AI) with quantitative pathology image analysis. It works by extracting high-dimensional features from digitized histopathological specimens, specifically whole-slide images (WSIs), using convolutional neural network (CNN)-based architectures. The result is a transition from subjective morphological descriptions to mathematically reproducible assessments. WSI-based AI models can now achieve diagnostic concordance with senior pathologist assessments, predict molecular marker status, identify driver mutations, and assess treatment response.
This review by Peng et al. from the Department of Hepatobiliary Surgery at Wenzhou Medical University systematically describes the progress of AI-driven pathomics across liver cancer diagnosis, individualized treatment, and prognosis. The authors analyze current technological bottlenecks and explore the potential for clinical translation. The paper was published in the World Journal of Clinical Oncology (2025) and includes a comprehensive table of studies spanning diagnosis, recurrence prediction, survival prognostication, liver metastases, image segmentation, and transcriptomic integration.
The pathomics analysis workflow consists of three main steps: region of interest (ROI) selection, colour normalization, and extraction and analysis of pathomics features. After pathology images are collected and scanned, ROIs are labelled either manually by pathologists using tools such as QuPath and ASAP, or automatically via tissue classifier algorithms. Manual outlining is accurate and flexible but time-consuming, subjective, and difficult to reproduce. Automated methods save human resources and improve consistency but may struggle with image quality differences and complex backgrounds.
Colour normalization: Even when the same staining protocol is used, colour differences in WSIs across different laboratories are inevitable due to variations in staining time, concentration and pH of the staining solution, and differences in staining platforms and scanner models. Researchers have proposed various normalization techniques to reduce the effect of these colour variations on trained models, which is essential for algorithm generalizability across institutions.
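The idea behind statistics-based colour normalization can be sketched in a few lines. The example below is a deliberately simplified illustration, not a method named in the review: it shifts and scales each RGB channel of a source tile so that its mean and standard deviation match those of a reference tile. Published techniques usually work in a perceptual or stain-specific colour space rather than raw RGB; the function name `match_channel_stats` and the pixel values are hypothetical.

```python
from statistics import mean, pstdev


def match_channel_stats(source, target):
    """Match each RGB channel of `source` to the mean and std of `target`.

    Both arguments are lists of (r, g, b) tuples with values in 0..255.
    Simplifying assumption: statistics are matched directly in RGB space.
    """
    normalized_channels = []
    for c in range(3):
        src = [px[c] for px in source]
        tgt = [px[c] for px in target]
        src_mean, src_std = mean(src), pstdev(src)
        tgt_mean, tgt_std = mean(tgt), pstdev(tgt)
        # A flat source channel has no spread to rescale; only shift it.
        scale = tgt_std / src_std if src_std > 0 else 1.0
        channel = [
            min(255.0, max(0.0, (v - src_mean) * scale + tgt_mean))
            for v in src
        ]
        normalized_channels.append(channel)
    # Re-assemble per-channel lists into (r, g, b) pixels.
    return list(zip(*normalized_channels))


# Hypothetical pixels from a tile scanned at lab A and a reference tile from lab B.
lab_a_tile = [(100, 50, 120), (140, 70, 160), (120, 60, 140)]
lab_b_reference = [(200, 100, 180), (220, 120, 200), (210, 110, 190)]
normalized_tile = match_channel_stats(lab_a_tile, lab_b_reference)
```

After normalization the source tile has the same per-channel mean and spread as the reference, which is the property a downstream model trained on the reference laboratory's slides would rely on.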
Feature extraction approaches: Traditional methods rely on hand-crafted feature descriptors including first-order features (shape, size, texture, colour distributions) and second-order features (colour histograms, gray-level co-occurrence matrices). These are commonly fed into machine learning models such as support vector machines (SVMs) and random forests. Deep learning (DL) methods using CNNs have surpassed traditional approaches by automatically learning abstract, high-level feature representations. Recent hybrid methodologies that integrate handcrafted features with DL-derived representations have demonstrated superior detection accuracy compared with either approach alone.
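As an illustration of a hand-crafted second-order descriptor, the sketch below builds a gray-level co-occurrence matrix for one pixel offset and derives the standard contrast statistic from it. It assumes the image has already been quantized to a small number of integer gray levels; the function names and the toy 4x4 image are hypothetical and not taken from the review.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count how often gray level i has gray level j at offset (dx, dy).

    `image` is a list of rows of integers in range(levels).
    """
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def glcm_contrast(matrix):
    """Contrast: sum over (i, j) of (i - j)^2 times the pair probability."""
    total = sum(sum(row) for row in matrix)
    if total == 0:
        return 0.0
    return sum(
        (i - j) ** 2 * count / total
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )


# Toy 4x4 patch quantized to four gray levels.
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
horizontal = glcm(patch, levels=4)      # neighbour one pixel to the right
contrast = glcm_contrast(horizontal)    # 7/12 for this patch
```

Features such as this contrast value, computed per tile and per offset, are the kind of fixed-length vector that could then be passed to an SVM or random forest.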
Dimensionality reduction and model selection: To address the high dimensionality and redundancy of extracted features, techniques such as principal component analysis (PCA) and linear discriminant analysis (LDA) are applied. Algorithm selection depends on the task: decision trees are preferred when interpretability is paramount, while deep neural networks excel in high-accuracy prediction tasks. The paper also details interpretability tools including Attention Maps, Gradient-weighted Class Activation Mapping (Grad-CAM), and SHapley Additive exPlanations (SHAP), which are critical for opening the "black box" of DL models and enabling clinical trust.
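To make the PCA step concrete, the following minimal sketch finds the first principal component of a small feature table by power iteration on the covariance matrix and projects each sample onto it. It is a teaching example under simplifying assumptions (dense lists, a single component, a fixed iteration count); in practice a library implementation would be used, and the function name and data here are hypothetical.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, scores) for the leading principal component.

    `rows` is a list of equal-length feature vectors (at least two samples).
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(x[i] * x[j] for x in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration: repeatedly multiply by cov and renormalize.
    # Assumption: the all-ones start vector is not orthogonal to the answer.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    # One-dimensional representation of each sample.
    scores = [sum(x[j] * v[j] for j in range(d)) for x in centered]
    return v, scores


# Two perfectly correlated hypothetical features collapse onto one axis.
features = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
direction, scores = first_principal_component(features)
```

Because the two features in the example are redundant, a single component captures all of their variance, which is the situation dimensionality reduction is meant to exploit.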
The review covers multiple AI systems designed for the diagnosis and classification of primary liver cancers, including HCC, intrahepatic cholangiocarcinoma (iCCA), and combined hepatocellular-cholangiocarcinoma (cHCC-CCA).
HnAIM system: Cheng et al. (2022) developed a deep learning system integrating ResNet50, InceptionV3, and Xception architectures into an ensemble framework. Trained on 738 patients providing 1,115 WSIs from 6 hospitals, HnAIM achieved an AUC of 0.935 for hepatocellular lesion classification in external validation, surpassing pathologists with varying experience levels. The system also visualizes lesion distribution proportions and shortens diagnostic time.
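One common way to combine several backbones into an ensemble is soft voting, in which the class probabilities from each network are averaged before the final label is chosen. The summary above does not state how HnAIM fuses its three networks, so the sketch below is an assumption for illustration only; the probabilities are invented and `soft_vote` is a hypothetical helper.

```python
def soft_vote(per_model_probs):
    """Average class probabilities across models.

    `per_model_probs` holds one probability vector per model, all of the
    same length. Returns (mean_probs, index_of_predicted_class).
    """
    n_models = len(per_model_probs)
    n_classes = len(per_model_probs[0])
    mean_probs = [
        sum(p[k] for p in per_model_probs) / n_models
        for k in range(n_classes)
    ]
    predicted = max(range(n_classes), key=mean_probs.__getitem__)
    return mean_probs, predicted


# Invented outputs of three backbones for one tile over three lesion classes.
tile_probs = [
    [0.6, 0.3, 0.1],   # model A favours class 0
    [0.2, 0.5, 0.3],   # model B favours class 1
    [0.3, 0.6, 0.1],   # model C favours class 1
]
mean_probs, label = soft_vote(tile_probs)
```

Averaging lets two moderately confident models outvote one confident but isolated model, which tends to reduce the variance of any single architecture.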
LiverNet: Aatresh et al. (2021) developed a lightweight architecture using an atrous spatial pyramid pooling (ASPP) module for HCC grading, achieving 90.93% accuracy with only 0.5739 million parameters. This compact design demonstrates strong translational potential for clinical deployment. Liao et al. (2020) built a CNN platform using TCGA and West China Hospital data that achieved an AUC of 1.000 for WSI analysis and pioneered the linking of histomorphometric features to somatic mutations.
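Since AUC is the headline metric for most of the studies summarized here, it helps to recall what it measures: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it directly from that definition. It is quadratic in the number of cases and intended only for illustration; the labels and scores are invented.

```python
def roc_auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count half.

    `labels` contains 1 for positive and 0 for negative cases.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Invented scores: one malignant slide is ranked below one benign slide.
labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
auc = roc_auc(labels, scores)   # 3 of 4 positive-negative pairs are ordered correctly
```

An AUC of 1.000 therefore means that every positive case in the evaluated set outranked every negative case, which says nothing by itself about how the model behaves on slides from other institutions.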
Subtyping and rare variants: Beaufrere et al. (2024) developed a weakly supervised model to differentiate HCC from iCCA using 166 HE-stained WSIs (90 training, 29 internal validation, 47 external validation) and quantified compositional ratios within cHCC-CCA. Dong et al. (2022) created the FuNet fusion strategy with a channel-spatial attention mechanism for enhanced multimodal feature characterization. Liu et al. (2023) built a Faster RCNN model for diagnosing primary clear cell carcinoma of the liver (PCCCL), a rare HCC subtype, achieving 96.2% diagnostic accuracy with each case processed in just 4 seconds.
AI-assisted decision support: Kiani et al. (2020) demonstrated a critical duality in AI-assisted diagnostics: while accurate model predictions significantly improved diagnostic concordance among pathologists (P < 0.001), erroneous predictions systematically introduced diagnostic bias. This finding underscores the need for robust quality assurance frameworks when AI tools are deployed alongside human pathologists rather than as standalone replacements.
Postoperative recurrence is a critical threat: approximately 50%-70% of HCC patients develop recurrence or metastasis within 5 years after radical resection. Early recurrence (within 2 years) is mostly linked to microvascular invasion (MVI) and the immune microenvironment, while late recurrence is associated with neoplastic or cirrhotic progression. MVI, defined as microscopically observed nests of cancer cells in portal vein, hepatic vein, or tumour-enveloped blood vessels, is among the most important predictors of postoperative recurrence.
MVI-AIDM: Zhang et al. (2024) developed the MVI AI Diagnostic Model, which mimics the three-step diagnostic process of a pathologist: localization of the tumour region, segmentation of the microvasculature, and classification of cells. Tested on 753 internal cases and 358 external cases, MVI-AIDM achieved 94.25% accuracy in independent external validation, with a significantly higher positive detection rate for MVI than conventional microscopy (70.85% vs 64.13%). Laurent-Bellue et al. (2024) used a ResNet34 algorithm trained on 107 internal cases (680 WSIs) and validated on 29 external cases to automatically quantify invasive structures including MVI, poorly differentiated tumour regions, and peritumoral vascular encasement.
MVI-DL: Chen et al. (2022) developed a weakly supervised multiple instance learning (MIL) framework to assess MVI status using only tumour tissue slides. The model achieved AUC values of 0.904 (internal, 350 cases with 2,917 WSIs) and 0.871 (external, 120 cases with 504 WSIs) and maintained good performance even with a single slide or biopsy sample. Visual analysis linked positive MVI to vascular sinus-rich giant trabecular structures and intratumour heterogeneity, while negative MVI correlated with severe immune infiltration and highly differentiated tumour cells. Feng et al. (2021) introduced an annotation noise-optimized DL framework achieving 87.81% pixel-level segmentation accuracy and 98.77% slide-level diagnostic accuracy on an internal test set, with 87.90% accuracy on 157 external TCGA patients.
Immune cell dynamics: Qu et al. (2023) developed the Deep Pathomics Score (DPS) using ResNet-50 and a modified DeepSurv network to predict recurrence after liver transplantation in 380 HCC patients. The DPS achieved a C-index of 0.827 in training and 0.794 in validation. Immune cells were the most significant histologic category for predicting recurrence. Cellular-level analysis revealed that greater natural killer (NK) cell infiltration within the tumour correlated with lower recurrence risk, establishing a direct link between the tumour microenvironment and clinical outcomes.
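The C-index reported for survival and recurrence models measures how often the model ranks two patients in the right order. A minimal sketch of Harrell's concordance index follows, under the usual convention that a pair is comparable only when the patient with the shorter follow-up actually had the event. Pairs with tied times are ignored for simplicity, and the data are invented.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored data.

    times  : follow-up time per patient
    events : 1 if recurrence/death was observed, 0 if censored
    risks  : model risk score (higher means earlier event expected)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Comparable pair: patient i had an observed event before time j.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Invented cohort: the third patient is censored at month 15.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.6, 0.1]
c_index = concordance_index(times, events, risks)   # 4 of 5 comparable pairs agree
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported 0.794 in validation indicates that roughly four in five comparable patient pairs were ordered correctly.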
Accurate survival prediction in HCC and iCCA is urgently needed given the high heterogeneity and substantial prognostic differences across patients. Several studies have explored pathomics for prognostication. Zhou et al. (2024) developed a pathomics model using HE-stained images from 267 TCGA HCC cases to predict EZH2 (Enhancer of Zeste Homolog 2) expression levels. They demonstrated that increased pathomics scores were independently correlated with poorer overall survival, suggesting that morphological patterns in routine HE sections can serve as histomorphometric proxies for critical oncogenic alterations.
TIL quantification: Jia et al. (2023) built a ResNet101V2-based model to quantify tumour-infiltrating lymphocytes (TILs) with high predictive accuracy (AUC > 0.95) using 100 WSIs from Xijing Hospital, cross-validated against the TCGA database. TIL infiltration levels were identified as an independent prognostic factor, leading to a prognostic nomogram integrating immune-specific features.
CHOWDER model: Saillard et al. (2020) proposed a DL model that requires no manual annotation and predicts HCC survival with a C-index of 0.75-0.78, significantly outperforming traditional clinical metrics. CHOWDER was developed on a discovery set of 194 cases and validated on a separate set of 328 cases, identifying vascular luminal space and immune infiltration defects as morphological markers of poor prognosis.
Interpretable frameworks for iCCA: Ding et al. (2024) constructed an interpretable DL framework for iCCA using 373 development cases and 381 validation cases (213 internal plus 168 external). They identified the distribution of tertiary lymphoid structures, tumour-mesenchymal ratios, and nuclear morphology (distorted nuclear membranes, low texture contrast) as key prognostic markers, validated through multiomics against glycolysis and immune infiltration pathways. Xie et al. (2022) established an immunohistochemistry (IHC)-free prognostic framework for iCCA (AUC = 0.68) based on tumour architectural complexity and lymphocyte spatial topology in 127 cases.
Tumour Risk Score (TRS): Shi et al. (2021) pioneered a weakly supervised DL framework deriving a TRS from 1,125 Zhongshan Hospital cases (2,451 WSIs) with external validation on 320 TCGA cases. The TRS demonstrated robust correlations with hepatic sinusoid capillarization and nucleolar atypia. Multiomics analyses linked increased TRS to FAT3 mutations and immune evasion pathways, providing a paradigm for "image-gene-clinical" triple analysis that connects histomorphology to molecular mechanisms.
The liver serves as a predominant metastatic niche for colorectal, pancreatic, breast, and gastric cancers due to its dual blood supply, fenestrated sinusoidal endothelial architecture, and immunologically tolerogenic microenvironment. Liver metastases are more common than primary liver tumours and carry a particularly poor prognosis, with approximately 30%-70% of cancer patients dying from liver metastases. Existing treatments including systemic chemotherapy, radiotherapy, immunotherapy, and targeted therapy have limited efficacy for metastatic disease.
Primary site identification: Chen et al. (2024) pioneered a hybrid approach integrating handcrafted pathomics features (nuclear morphology, cytoplasmic texture) with DL-derived spatial patterns across 114 patients (175 WSIs), achieving moderate discriminative performance (AUC: 0.64-0.83) for classifying the primary origins of metastases.
HEPNET: Albrecht et al. (2023) achieved an exceptional AUC of 0.997 in differentiating iCCA from colorectal liver metastases using 456 training cases, 115 internal test cases, and 159 external validation cases. This surpasses even experienced pathologists and significantly reduces reliance on costly immunohistochemistry testing. Jang et al. (2023) developed a triple-classification framework differentiating HCC, CCA, and colorectal cancer metastases with AUC > 0.995.
Growth pattern classification: Hoppener et al. (2024) developed a neural image compression (NIC) algorithm for automated differentiation between desmoplastic and non-desmoplastic colorectal liver metastases, achieving AUC values of 0.93-0.95 across 932 development cases (3,641 images) and 870 external validation images. The model demonstrated prognostic predictive power beyond mere morphological classification, suggesting deep biological correlations between stromal remodelling patterns and tumour aggressiveness.
Spatial and predictive analysis: Qi et al. (2023) developed the CRLM-SPA framework that quantifies 17 microenvironmental characteristics including tumour necrosis ratio and spatial distribution of lymphocyte infiltration. Their spatial organization feature (SOF) model amplified the 5-year survival stratification difference to 22.5%, with significant complementary prognostic value alongside the clinical risk score (AUROC improvement, P = 0.004). Xiao et al. (2022) constructed a DL nomogram combining ResNet-50 with pT/pN staging to predict liver metastasis risk from primary colorectal cancer (C-index = 0.81), with dynamic predictive ability for 1-3 year metastasis risk (AUC = 0.84) across 611 cases.
Image segmentation is a foundational component of computational pathology, enabling precise delineation of anatomical structures and pathological features for surgical planning, three-dimensional tumour reconstruction, and volumetric assessment. WSI technology generates massive data volumes, typically 1-3 GB per slide, creating a dual challenge of computational scale and the need to automate traditionally labour-intensive manual analysis protocols.
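Because a gigabyte-scale slide cannot be fed to a network in one piece, WSI pipelines generally split it into fixed-size tiles first. The sketch below enumerates tile origins for a slide of given pixel dimensions. The tile size, stride, and slide dimensions are illustrative assumptions rather than values from the review, and partial tiles at the right and bottom edges are simply dropped.

```python
def tile_origins(width, height, tile=512, stride=512):
    """Yield top-left (x, y) coordinates of tiles lying fully inside the slide.

    A stride smaller than `tile` produces overlapping tiles.
    """
    for y in range(0, height - tile + 1, stride):
        for x in range(0, width - tile + 1, stride):
            yield (x, y)


# Hypothetical small region: 1024 x 1536 pixels yields a 2 x 3 grid of tiles.
origins = list(tile_origins(1024, 1536))

# Hypothetical full-resolution slide; the count shows why automation is needed.
n_tiles = sum(1 for _ in tile_origins(100_000, 80_000))
```

Using a generator keeps memory use flat even when a slide produces tens of thousands of tiles, and each origin can be passed to whatever slide-reading library is in use to fetch the corresponding pixel region.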
Nuclear segmentation: Lal et al. (2021) developed NucleiSegNet, incorporating residual blocks and attention decoders for efficient segmentation of morphologically complex and adherent nuclei in HE-stained liver cancer images, validated on 80 KMC liver dataset images. Rong et al. (2023) proposed the HD-Yolo algorithm, which optimizes detection workflows and substantially accelerates nuclear segmentation while enhancing tumour microenvironment (TME) characterization. Gu et al. (2025) developed the CSGO framework for whole-cell segmentation using nuclear membrane segmentation plus post-processing algorithms. Trained on 18 cases and validated across 5 external datasets, CSGO outperformed CellPose and supports comprehensive TME whole-cell morphology analysis.
Addressing data scarcity: Jehanzaib et al. (2025) proposed the PathoSeg model coupled with PathopixGAN, which mitigates class imbalance through synthetic data generation. Their architecture combines a modified HRNet with CBAM attention mechanisms and was validated on 82 whole slides from 62 patients across liver, prostate, and breast cancer. Hagele et al. (2024) pioneered a complementary label strategy to reduce reliance on pixel-level annotations, achieving a balanced accuracy of 0.91 in HCC and iCCA segmentation using 165 examples. Chen et al. (2022) demonstrated the efficacy of SENet models, achieving 95.27% accuracy in liver cancer differentiation grading from 444 pathology images.
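Balanced accuracy is the mean of the per-class recalls, which makes it more informative than plain accuracy when one class dominates, as is typical when HCC regions outnumber iCCA regions. A minimal sketch with invented labels:

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in `y_true`."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        indices = [i for i, y in enumerate(y_true) if y == c]
        correct = sum(1 for i in indices if y_pred[i] == c)
        recalls.append(correct / len(indices))
    return sum(recalls) / len(recalls)


# Invented example: 8 HCC regions and 2 iCCA regions, one iCCA region missed.
y_true = ["HCC"] * 8 + ["iCCA"] * 2
y_pred = ["HCC"] * 8 + ["HCC", "iCCA"]

plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)   # 0.9
balanced = balanced_accuracy(y_true, y_pred)                        # 0.75
```

The gap between the two numbers shows how a model can look strong on plain accuracy while missing half of the minority class.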
Clinical integration frameworks: Roy et al. (2021) developed HistoCAE with autoencoder reconstruction strategies for liver tumour segmentation, and their multiresolution MR-HistoCAE extension outperformed conventional segmentation networks. Khened et al. (2021) created DigiPathAI, an open-source framework featuring multimodel ensemble and uncertainty estimation that achieves state-of-the-art performance across breast, colon, and liver cancer WSI analysis. The DigiPathAI pipeline enables complete workflow support from segmentation to tumour burden calculation, representing significant clinical translation potential.
The molecular heterogeneity and complex microenvironmental features of HCC pose a serious challenge for precision diagnosis and treatment. The integration of pathomics with transcriptomic data has opened a new dimension for analysing tumour biology, establishing interpretable associations between histomorphology and molecular phenotypes.
Calderaro et al. (2023): This landmark study developed a DL framework achieving precise reclassification of combined HCC-cholangiocarcinoma (cHCC-CCA) in 405 patients. A self-supervised pretrained ResNet50 feature extractor was synergistically integrated with an attention-based multiple instance learning (MIL) framework. The system achieved AUROC values of 0.99 (internal) and 0.94 (external). Spatial transcriptomics validated the molecular underpinnings: iCCA-dominant subtypes showed increased expression of biliary differentiation markers (KRT19, EPCAM), while HCC-dominant regions were enriched with hepatocytic markers (ALB, APOA2), spatially resolving the biological essence of tumour heterogeneity at the molecular level.
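The pooling step at the heart of attention-based MIL can be sketched as follows: each tile embedding receives a scalar attention logit, the logits are passed through a softmax, and the slide-level representation is the attention-weighted mean of the tile embeddings. In a real model the logits come from a small learned network and the embeddings from the pretrained feature extractor; here both are supplied by hand as an assumption, and `attention_pool` is a hypothetical helper rather than the authors' implementation.

```python
import math


def attention_pool(tile_embeddings, attention_logits):
    """Return (weights, slide_embedding) for one bag of tiles.

    tile_embeddings  : list of equal-length feature vectors, one per tile
    attention_logits : one unnormalized attention score per tile
    """
    # Numerically stable softmax over the tiles.
    top = max(attention_logits)
    exps = [math.exp(a - top) for a in attention_logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(tile_embeddings[0])
    slide_embedding = [
        sum(w * emb[k] for w, emb in zip(weights, tile_embeddings))
        for k in range(dim)
    ]
    return weights, slide_embedding


# Hand-made bag: the second tile gets a much larger logit and dominates.
embeddings = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
logits = [0.0, 4.0, 0.0]
weights, slide_vec = attention_pool(embeddings, logits)
```

The weights double as an interpretability signal: tiles with high attention are the regions the model relied on for its slide-level call, which is what allows such predictions to be compared against spatial transcriptomic markers.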
Immune gene signature prediction: Zeng et al. (2022) used a Clustering-constrained Attention Multiple Instance Learning (CLAM) architecture for multiscale analysis of HCC histopathological images, predicting immune gene activation status with validation AUCs of 0.81-0.92. Integrative digital pathology-transcriptomics analysis revealed that immune "hotspots" identified by the model correlated significantly with lymphocyte and plasma cell infiltration, establishing a visual decision-support framework for developing biomarkers targeting PD-1/PD-L1 inhibitors and other immunotherapies.
Together, these studies outline a technical blueprint for pathomics-transcriptomics integration. Calderaro et al. established a gold standard for morphology-molecular correlation through spatial multiomics validation, while Zeng et al. pioneered therapy sensitivity prediction. Future research should incorporate single-cell sequencing and dynamic radiomics to develop interpretable cross-scale predictive models that close the loop from histomorphological diagnosis to molecularly guided therapeutic intervention.
Regulatory barriers: Despite landmark advancements such as the FDA approval of Paige Prostate as the first AI-assisted diagnostic tool, the approval process for AI medical devices in pathology remains complex and time-consuming, lagging far behind other fields like radiology. Laboratory-developed tests (LDTs) and in-house devices face increasingly stringent regulatory requirements under both FDA proposed regulations and the European Union's In Vitro Diagnostic Medical Devices Regulation framework. Disparities in regulatory frameworks across countries pose additional challenges to standardization and transnational deployment.
Data interoperability and infrastructure: Seamlessly embedding AI tools into clinical workflows, particularly achieving interoperability with laboratory information systems (LIS) and picture archiving and communication systems (PACS), remains a critical bottleneck. The lack of unified standards for pathological data formats, annotations, and model outputs impedes data sharing and model generalizability. The high infrastructure costs for digital pathology and AI integration, including scanners, storage, and computational resources, severely limit adoption in resource-constrained regions, particularly parts of Asia.
Prospective validation gap: Although many models demonstrate excellent performance in retrospective studies, their clinical value and safety must be confirmed through large-scale, multicenter prospective trials. Most AI tools currently lack the highest levels of clinical evidence. The review highlights three persistent translational challenges across all application areas: intrinsic sample heterogeneity that requires massive annotated datasets (e.g., one seven-class classification model demanded 204,159 precisely labelled tissue regions), staining protocol-dependent performance degradation that limits algorithm generalization, and the limited interoperability of most systems with hospital infrastructure.
Future directions: The authors advocate three priorities. The first is to discover HCC-specific biomarkers and enhance model interpretability through multi-omics data integration. The second is to develop AI-enhanced multimodal fusion that combines radiogenomics and liquid biopsy to capture multidimensional tumour dynamics. The third is to establish interdisciplinary collaboration frameworks that produce standardized HCC pathomics data protocols, cost-effective regionalized solutions, and clinical practice guidelines based on real-world evidence. Emerging technologies such as slide-free imaging systems and light sheet microscopy point to possible future pathways for the field.
The interpretability imperative: A fundamental tension exists between the "black-box" nature of deep learning and clinical interpretability requirements. While tools like Attention Maps, Grad-CAM, and SHAP have opened the black box to some degree, the responsible deployment of DL technologies must incorporate interpretability methods as a core component. The review emphasizes that clinical trust requires not only accuracy but transparency in the decision-making process, ensuring that AI-assisted pathology serves the goal of improving patient care rather than introducing opaque automation.