The scale of the problem: Colorectal carcinoma is one of the most prevalent cancers worldwide, carrying significant morbidity and mortality rates. The disease typically develops from adenomatous polyps that progress through multiple stages, creating a substantial global health burden. Histopathology, the microscopic examination of tissue samples from biopsies or surgical resections, remains the cornerstone for diagnosing and staging the disease. Pathologists evaluate morphological features such as tumor differentiation, invasion depth, lymphovascular invasion, and tumor budding to guide treatment decisions and predict patient outcomes.
Limitations of the traditional approach: Despite the central role of colonoscopy and histopathology, traditional methods face meaningful challenges. Colonoscopy can be inadequate in the proximal colon, impairing its ability to detect neoplasia. Costs are high, both for screening programs themselves and for biological therapy in advanced cases, and population-based initiatives suffer from persistently low compliance rates. Uncertainties around the specificity and reproducibility of fecal tests and endoscopic procedures further compromise the reliability of conventional colorectal cancer screening.
Accurate diagnosis as the foundation: Precise diagnosis directly determines treatment selection, assessment of treatment responsiveness, evaluation of toxicity, survival estimation, and overall patient well-being. Imaging modalities such as MRI and bone scans contribute to the diagnostic workup, but histopathological assessment remains the gold standard for staging and planning therapy. Errors or delays in diagnosis can lead to suboptimal treatment, which reinforces the need for tools that augment human capability.
AI as an augmentation tool: Artificial intelligence encompasses technologies that enable computers to perform tasks requiring human-level reasoning, such as learning and problem-solving. In healthcare, AI algorithms can analyze large volumes of medical data, extract meaningful insights, and assist clinicians in making more accurate and timely decisions. The authors position this review as an exploration of how AI techniques integrate with traditional histopathological methods for diagnosing, prognosticating, and managing colorectal cancer.
How histopathology works: The process begins with preparing tissue sections on glass slides, followed by staining with dyes such as hematoxylin and eosin (H&E) to enhance visibility. Skilled pathologists then meticulously scrutinize cell structures and groupings under a microscope, identifying cellular changes that signify different conditions. This examination is pivotal in diagnosing a spectrum of ailments, from cancer and infectious diseases to inflammatory conditions and autoimmune disorders. In the context of cancer, histopathology differentiates benign from malignant tumors and assesses treatment efficacy.
Molecular advances in traditional pathology: Techniques such as fluorescence in-situ hybridization (FISH) and polymerase chain reaction (PCR) have expanded the pathologist's toolkit by enabling the mapping of genetic material in tissues. These molecular methods provide insights into the molecular mechanisms underlying diseases, complementing the morphological assessment. However, they add complexity, cost, and turnaround time, which can strain already overburdened pathology laboratories.
Early AI in histopathology: The use of AI in medical diagnosis dates back to the 1970s, when the MYCIN expert system was developed to assist in diagnosing infectious diseases. This rule-based system was among the earliest AI-driven approaches to disease diagnosis. Building on those foundations, contemporary efforts apply machine learning and deep learning techniques to histological images, with the aim of improving diagnostic accuracy in histopathology.
Evolution of image analysis: Early image processing relied on conventional methods such as image enhancement, segmentation, and manual feature extraction. With the advent of machine learning, capabilities expanded to include automated sorting, scene comprehension, semantic segmentation, and applications such as facial recognition and autonomous driving. In medical imaging specifically, AI now powers anomaly detection, super-resolution, noise reduction, and color correction, all of which are directly applicable to the analysis of histopathological slides.
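The conventional pipeline named above (enhancement, segmentation, manual feature extraction) can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not a method taken from the review: the patch values, the 0.5 cutoff, and the function names are all assumptions made for the example.

```python
from typing import List

Image = List[List[float]]  # grayscale intensities, row-major


def contrast_stretch(img: Image) -> Image:
    """Linearly rescale intensities to [0, 1] (a simple enhancement step)."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:  # flat image: nothing to stretch
        return [[0.0 for _ in row] for row in img]
    return [[(p - lo) / (hi - lo) for p in row] for row in img]


def threshold_segment(img: Image, cutoff: float) -> List[List[int]]:
    """Label pixels darker than the cutoff as foreground (1), else background (0)."""
    return [[1 if p < cutoff else 0 for p in row] for row in img]


def foreground_fraction(mask: List[List[int]]) -> float:
    """A hand-crafted feature: the share of pixels labelled as foreground."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total


# Hypothetical 3x4 grayscale patch; dark values stand in for stained nuclei.
patch = [
    [200, 190, 60, 55],
    [210, 70, 50, 195],
    [205, 200, 65, 198],
]
enhanced = contrast_stretch(patch)
mask = threshold_segment(enhanced, cutoff=0.5)
feature = foreground_fraction(mask)  # 5 of 12 pixels are foreground
```

Every step here depends on a hand-chosen rule (the rescaling, the cutoff, the feature definition), which is the manual effort that learned models aim to remove.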
Supervised learning classifiers: Machine learning algorithms play a pivotal role in image classification for colorectal carcinoma. Support vector machines (SVMs) separate classes by finding a maximum-margin decision boundary and handle high-dimensional data well, making them suitable for histopathological image classification. K-nearest neighbors (KNN) assigns an unlabeled data point the majority class of its closest labeled neighbors; because every prediction compares the query against the stored training samples, it is most practical for small to medium-sized datasets. Both methods rely on labeled training data to classify images into predefined categories.
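As a minimal sketch of the KNN idea, the following standard-library Python classifies a patch from two hand-crafted features. The feature names, values, and labels are hypothetical and chosen only for illustration. An SVM is not shown because it would normally come from a library (for example scikit-learn's `SVC`) rather than being written by hand.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]  # (feature vector, label)


def knn_predict(train: List[Sample], query: Sequence[float], k: int = 3) -> str:
    """Return the majority label among the k training samples closest to query."""
    by_distance = sorted(train, key=lambda s: math.dist(s[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical per-patch features: (nuclear density, gland irregularity).
train = [
    ((0.20, 0.10), "benign"),
    ((0.25, 0.15), "benign"),
    ((0.30, 0.20), "benign"),
    ((0.80, 0.70), "malignant"),
    ((0.85, 0.75), "malignant"),
    ((0.90, 0.65), "malignant"),
]

prediction = knn_predict(train, (0.78, 0.72), k=3)
```

The sorted scan over all training samples makes each prediction cost grow with the size of the training set, which is why KNN suits smaller datasets.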
Convolutional neural networks (CNNs): CNNs are specialized deep learning architectures inspired by the hierarchical organization of the visual cortex. They automatically extract features through convolutional layers, eliminating the need for handcrafted feature engineering. Their architecture, composed of convolution layers, pooling layers, and fully connected layers, enables progressive hierarchical feature extraction. CNNs require substantial data due to their millions of learnable parameters, typically demanding GPU-based training. However, their ability to discern intricate patterns makes them the dominant architecture for histopathological image analysis.
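The layer sequence described above can be traced by hand on a tiny input. The sketch below is a forward pass only, written in plain Python with a fixed, hand-picked kernel and weights. In a real CNN these values are learned during training, and the image, kernel, and weights here are assumptions made for the example.

```python
from typing import List

Matrix = List[List[float]]


def conv2d(img: Matrix, kernel: Matrix) -> Matrix:
    """Valid (no padding), stride-1 cross-correlation, as used in CNN layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(img) - kh + 1, len(img[0]) - kw + 1
    return [
        [
            sum(img[i + a][j + b] * kernel[a][b] for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(m: Matrix) -> Matrix:
    """Element-wise rectified linear activation."""
    return [[max(0.0, v) for v in row] for row in m]


def max_pool2x2(m: Matrix) -> Matrix:
    """Non-overlapping 2x2 max pooling; an odd trailing row or column is dropped."""
    return [
        [max(m[i][j], m[i][j + 1], m[i + 1][j], m[i + 1][j + 1])
         for j in range(0, len(m[0]) - 1, 2)]
        for i in range(0, len(m) - 1, 2)
    ]


def dense(features: List[float], weights: List[float], bias: float) -> float:
    """Fully connected layer with a single output unit (one logit)."""
    return sum(f * w for f, w in zip(features, weights)) + bias


# Hypothetical 5x5 input with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
kernel = [[-1, 1], [-1, 1]]  # hand-picked vertical-edge detector

feature_map = relu(conv2d(image, kernel))   # 4x4, responds at the edge
pooled = max_pool2x2(feature_map)           # 2x2 summary of the response
flat = [v for row in pooled for v in row]   # flatten for the dense layer
logit = dense(flat, weights=[0.5, 0.5, 0.5, 0.5], bias=-1.0)
```

The convolution responds only where the edge sits, pooling keeps that response while halving the resolution, and the dense layer combines the pooled values into one score.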
Deep belief networks and ensemble approaches: Deep belief networks (DBNs) leverage unsupervised learning to pre-train multiple layers before fine-tuning with labeled data, producing strong results in image classification. Ensembles of CNNs have been proposed to enhance feature extraction capabilities beyond what individual networks or traditional ensemble techniques can achieve, particularly for complex datasets where capturing subtle morphological nuances is essential for accurate analysis.
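One simple way to combine several CNNs is soft voting, which averages the class probabilities that each network outputs. The sketch below shows only this basic form of ensembling; it is not necessarily the scheme the review has in mind, and the class names and probability values are invented for illustration.

```python
from typing import List, Sequence


def soft_vote(prob_sets: Sequence[Sequence[float]]) -> List[float]:
    """Average class-probability vectors produced by several models."""
    n_models = len(prob_sets)
    n_classes = len(prob_sets[0])
    return [sum(p[c] for p in prob_sets) / n_models for c in range(n_classes)]


def argmax(values: Sequence[float]) -> int:
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


CLASSES = ["normal", "adenoma", "carcinoma"]  # hypothetical label set

# Hypothetical softmax outputs from three separately trained CNNs for one patch.
model_outputs = [
    [0.10, 0.50, 0.40],  # this model alone would predict "adenoma"
    [0.05, 0.30, 0.65],
    [0.10, 0.35, 0.55],
]
avg = soft_vote(model_outputs)
label = CLASSES[argmax(avg)]
```

In this example the first model disagrees with the other two, and averaging lets the ensemble settle on the class that has the most overall support.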
Integration with traditional techniques: Rather than replacing pathologists, AI is designed to augment their expertise. By combining AI technologies with established methods like whole slide imaging, pathologists can leverage computational capabilities to analyze vast datasets and extract insights that surpass human visual perception. This integration enables detection of subtle patterns, disease classification, and biomarker assessment with enhanced reproducibility and scalability. The synergy streamlines diagnostic workflows, reducing the time and effort required for analysis while opening new avenues for research and drug development.
Improved diagnostic accuracy: AI-based methods such as Bag of Words and PAHLI have played a pivotal role in augmenting diagnostic accuracy for colorectal cancer detection. These AI systems have demonstrated practicality in initial testing, autonomously extracting intricate features from images or videos with improved speed and specificity compared to conventional methods. Notably, AI has proven effective in predicting microsatellite instability (MSI), a crucial factor for stratifying patients for targeted immunotherapies. Despite these advances, the authors note that more diverse patient data is needed to further refine the sensitivity and specificity of neural models.
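Sensitivity and specificity, the two metrics named above, follow directly from the counts of correct and incorrect predictions. The sketch below computes both for a binary task. The labels and predictions are made-up values for illustration, with 1 standing for a positive case (for example MSI-high) and 0 for a negative one.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels, where 1 is positive."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical ground truth and model predictions for ten cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

Here the model finds 3 of 4 positives (sensitivity 0.75) and correctly clears 5 of 6 negatives (specificity of about 0.83).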
Prognosis prediction from histopathological features: AI algorithms have shown remarkable capabilities in forecasting patient prognosis and response to specific therapies by analyzing morphological characteristics from histological images. These tools excel at stratifying patients with varying disease grades, sometimes surpassing the abilities of human pathologists. By incorporating tumor grade, subtype, microenvironment patterns, and genetic profiles, AI algorithms establish connections between pathology images, survival outcomes, and treatment responses, enabling a more nuanced approach to precision medicine.
Biomarker and therapeutic target identification: AI-driven platforms like PandaOmics harness artificial intelligence and bioinformatics to scrutinize omics and biomedical data, generating novel hypotheses about therapeutic targets and biomarkers. Targets identified by these platforms undergo validation via both in vitro and in vivo studies. As part of Insilico Medicine's Pharma.ai suite, PandaOmics exemplifies AI's role in swiftly identifying molecular targets and biomarkers for various diseases, accelerating the development of precision medicine solutions.
Predicting immunohistochemical markers: AI's capacity to predict immunohistochemical marker expression from H&E findings is particularly valuable for diagnostic purposes, especially in cases with borderline morphology where traditional methods may falter. The use of AI in biomarker discovery is anticipated to refine patient stratification, enhance treatment efficacy, and optimize clinical trial outcomes by facilitating more personalized and targeted therapies grounded in specific biomarkers identified through advanced computational technologies.
Risk assessment and early intervention: By analyzing a wide array of patient data, AI algorithms can uncover disease predispositions, forecast disease progression, and suggest preventive actions. This insight enables healthcare professionals to undertake earlier interventions and conduct targeted screenings, significantly improving outcomes. AI's ability to identify subsets of patients likely to exhibit positive or adverse reactions to specific treatments aids clinicians in making more informed decisions about treatment pathways.
Customized treatment plans: AI facilitates the customization of treatment by analyzing patient-specific data, including genetics, biomarkers, comorbidities, and responses to past treatments. This allows healthcare providers to design therapies closely aligned with individual patient characteristics, optimizing therapeutic results. AI considers factors such as patient preferences, resource distribution, cost-effectiveness, and adherence to clinical guidelines to evaluate treatment options and recommend the most suitable course of action.
Real-time monitoring and predictive analytics: AI-powered monitoring systems track patient responses to treatments in real time by analyzing data from wearable devices, electronic health records, and patient-reported outcomes. These insights enable timely interventions or adjustments to the treatment plan when necessary. AI can also identify specific biomarkers, genetic variations, or molecular signatures linked to diseases, enabling the development of personalized treatments that are more effective and cause fewer side effects.
Watson for Oncology in practice: In colorectal cancer treatment specifically, AI applications like Watson for Oncology (evaluated in a 250-case cohort) offer personalized, evidence-based clinical treatment strategies. Additionally, AI-assisted clinical decision support tools have been developed to recommend endoscopic surveillance intervals, exemplifying AI's capacity to enhance treatment planning and long-term patient monitoring throughout the CRC care continuum.
Data quality versus quantity: High-quality data lays the groundwork for AI systems to function accurately, while sufficient quantity provides the volume needed for comprehensive training and validation. Achieving equilibrium between the two is paramount. Training data must mirror the diversity of real-world scenarios through meticulous evaluation, curation, and ongoing quality assessments. Professional practices in data acquisition, cleaning, labeling, and annotation are essential for securing high-quality training datasets, yet only a limited number of publicly available labeled datasets exist for CRC AI development.
The interpretability problem: Interpretability focuses on comprehending how an AI algorithm works internally, exploring its parameters, functions, and interconnections. Transparency pertains to the openness of design, development, and operational phases of AI systems, ensuring the decision-making frameworks are accessible and comprehensible to stakeholders. Various methodologies aim to demystify models, including inherently interpretable architectures and empirical validations. However, "black-box" deep learning models remain difficult to interpret, creating fundamental challenges for clinical trust and regulatory approval.
Integration with healthcare systems: For AI to effectively enhance healthcare services, systems must navigate regulatory approvals, integrate harmoniously with Electronic Health Record (EHR) platforms, and adhere to rigorous standards. This process demands significant investment in resources, infrastructure, and expertise, along with support from clinical stakeholders at all levels. Collaboration between healthcare entities, AI technologists, and regulatory authorities is vital for laying down the foundational guidelines governing AI deployment in clinical environments.
Ethical and regulatory considerations: Concerns over privacy, potential biases, discrimination, and patient safety remain paramount. The literature emphasizes the need for comprehensive regulatory frameworks specifically designed to oversee AI in clinical settings, maintaining ethical integrity and safeguarding patient rights. Ensuring that AI applications contribute positively to care without compromising ethical standards requires ongoing attention to data privacy, algorithmic bias, and the maintenance of essential human oversight throughout the diagnostic process.
Algorithm refinement priorities: The authors stress the need for more controlled datasets during early deployment phases of AI/ML algorithms, with iterative refinements as tools evolve. Shifting focus toward gland segmentation models rather than relying solely on machine learning classifiers can significantly enhance accuracy and efficiency. Addressing false positives through continuous training and validation of composite AI models is a vital strategy for improving diagnostic precision in histopathological screening of colorectal cancer. Expanding datasets to include larger sample sizes and executing multi-site clinical validations are crucial for ensuring algorithmic robustness and generalizability.
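During development, multi-site generalizability can be approximated by holding out all slides from one site at a time, so that a model is always tested on a site it never saw in training. The sketch below shows only that splitting step. It is an assumption about how such a check could be organized, not a procedure taken from the review, and the slide identifiers and site names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Tuple

Record = Tuple[str, str]  # (slide_id, site)


def leave_one_site_out(
    records: List[Record],
) -> Iterator[Tuple[str, List[str], List[str]]]:
    """Yield (held_out_site, train_ids, test_ids), holding out one site per fold."""
    by_site: Dict[str, List[str]] = defaultdict(list)
    for slide_id, site in records:
        by_site[site].append(slide_id)
    for held_out in sorted(by_site):
        train = [s for site, ids in by_site.items() if site != held_out for s in ids]
        yield held_out, train, by_site[held_out]


# Hypothetical slide inventory from three contributing sites.
records = [
    ("slide_01", "site_A"), ("slide_02", "site_A"),
    ("slide_03", "site_B"), ("slide_04", "site_B"),
    ("slide_05", "site_C"),
]
folds = list(leave_one_site_out(records))
```

Because no site contributes to both the training and test lists within a fold, a performance drop on the held-out site points to site-specific effects such as staining or scanner differences.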
Integration with other diagnostic modalities: Fusing AI with various diagnostic tools, including X-ray, CT scans, MRI, and PET scans, provides a more integrated perspective on patient health. This multimodal approach enhances the accuracy of diagnoses and the customization of treatment protocols. AI's ability to handle and analyze extensive datasets unveils patterns and insights that might elude manual detection, elevating both diagnostic accuracy and efficiency across the entire clinical workflow.
Adoption in clinical decision-making: Machine learning and deep learning technologies show promise in forecasting and categorizing diagnoses, offering recommendations, and enhancing practitioner efficacy across diverse medical domains including cancer risk assessment. The authors highlight that AI adoption can improve the quality, efficiency, and effectiveness of healthcare services, ultimately producing better patient outcomes and higher user satisfaction. Clinicians' perceptions, expectations, and perceived risk shape their willingness to use AI-driven decision support systems, so understanding these factors is essential for adoption.
Impact on patient outcomes: AI-driven predictive analytics hold the potential to significantly enhance the precision, efficiency, and cost-effectiveness of disease diagnosis and clinical laboratory testing. By continuously analyzing variables such as population demographics and disease prevalence, healthcare systems can identify individuals at elevated risk of specific conditions and facilitate proactive prevention. Technologies such as natural language processing (NLP) can furnish practitioners with real-time, accurate information, streamlining tasks and alleviating physician stress. The authors envision AI as a transformative force that will reshape CRC care from screening through survivorship.