Application of Artificial Intelligence in Diagnosis and Treatment of Colorectal Cancer: A Novel Prospect

Frontiers in Medicine, 2023

Plain-English Explanations
Page 1
Why CRC Is a Global Health Priority and Where AI Enters

Colorectal cancer (CRC) is the third most commonly diagnosed cancer worldwide. According to GLOBOCAN 2020, there were an estimated 1.93 million new CRC cases globally that year. Its incidence is particularly alarming in countries undergoing rapid social and economic transition. In China alone, approximately 560,000 newly diagnosed CRC cases were reported in 2020, placing it second only to lung cancer. Projections based on World Health Organization population data estimated this number would rise to 590,000 cases in China by 2022, the highest of any country in the world.

Risk factors for CRC extend beyond lifestyle elements such as smoking, obesity, and unhealthy diet to include gender, genetic predisposition, and family history. Current diagnostic methods rely on laboratory tests, endoscopy, imaging (CT, MRI), and histopathological examination, while treatment involves surgery, radiotherapy, and post-metastasis therapies. Despite these established tools, the continued rise in CRC incidence and mortality underscores the need for improved approaches.

Artificial intelligence (AI) has emerged as a transformative technology in this space. Machine learning (ML) is a core branch of AI, encompassing methods such as deep learning (DL), support vector machines (SVM), random forests (RF), and convolutional neural networks (CNN). Among these, DL and CNNs have been the most successful algorithms applied in medicine, playing roles in data management, diagnosis prediction, and drug delivery. On the hardware side, robotic systems such as the da Vinci surgical system (FDA-approved in 1999, now in its fourth generation) have become widely used in CRC surgery.

TL;DR: CRC affects 1.93 million people annually worldwide and is rising in developing nations. AI methods including CNN, deep learning, SVM, and robotic surgical systems like da Vinci are being applied across the entire CRC care pathway from screening through treatment.
Pages 1-3
CADe Systems for Polyp and Adenoma Detection During Colonoscopy

Colonoscopy is the gold standard for diagnosing colorectal diseases and is strongly recommended for early screening by national associations. However, owing to high operator variability, challenging bowel preparation, and other factors, polyp and adenoma detection rates vary substantially between endoscopists. The adenoma detection rate (ADR) is a key quality indicator: a higher ADR correlates directly with lower CRC incidence and mortality. The adenoma miss rate (AMR) is the proportion of adenomas missed on an initial colonoscopy but detected on a second, back-to-back (tandem) examination.
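
To make the two quality metrics concrete, here is a minimal sketch of how ADR and AMR are computed. The counts are hypothetical, chosen only for illustration; they are not taken from any of the trials summarized here.

```python
def adenoma_detection_rate(exams_with_adenoma: int, total_exams: int) -> float:
    """ADR: fraction of screening colonoscopies in which at least
    one adenoma was found."""
    return exams_with_adenoma / total_exams

def adenoma_miss_rate(missed: int, total_found: int) -> float:
    """AMR from a tandem study: adenomas missed on the first pass but
    found on the second, divided by all adenomas found across both passes."""
    return missed / total_found

# Hypothetical counts for illustration only:
adr = adenoma_detection_rate(exams_with_adenoma=152, total_exams=522)
amr = adenoma_miss_rate(missed=18, total_found=130)
print(f"ADR = {adr:.1%}, AMR = {amr:.1%}")
```

Trials comparing CADe-assisted and standard colonoscopy report exactly these two ratios for each study arm.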

Computer-aided detection (CADe) systems powered primarily by deep learning algorithms have been developed to improve detection speed and accuracy. A YOLOv3-based algorithm was used for real-time polyp detection via video, achieving fast and economical results suitable for large-scale deployment in underdeveloped areas. In a randomized controlled study (522 patients), a real-time automatic detection system raised the ADR to 29.1% and polyp detection rate (PDR) to 64.93%. Another study using a CNN-based CADe system across 1,434 patients achieved a PDR of 40.8% and ADR of 20.1%.

Kamba et al. established a CNN-based CADe system to reduce the adenoma miss rate. Their multicenter randomized controlled trial (358 patients) demonstrated an AMR of 13.8% in the CADe-assisted group, which was significantly lower than the standard colonoscopy group, with only a 10.9% difference in ADR between groups. A separate prospective tandem study (386 patients) using deep learning reported an AMR of 13.89%. Another trial (230 subjects) reported an AMR of 15.5% with CNN assistance.

Beyond simple detection, AI has been combined with advanced endoscopic tools such as narrow-band imaging (NBI), magnifying endoscopy, confocal laser endomicroscopy, and chromoendoscopy. Researchers using probe-based confocal laser endomicroscopy developed CADe algorithms to distinguish tumor from non-tumor polyps, achieving sensitivity, specificity, and accuracy all exceeding 90%. These optical biopsy techniques can capture real-time images that, combined with AI, help reduce misdiagnosis and overtreatment.

TL;DR: CADe systems using CNN and YOLO architectures significantly improve colonoscopy outcomes. Across multiple trials, AI reduced the adenoma miss rate to 13.8-15.5% and boosted ADR to 29.1% (522 patients). Confocal laser AI achieved over 90% sensitivity and specificity for polyp classification.
Pages 3-4
AI-Enhanced Blood Tests, Biomarkers, and Genetic Screening

Non-invasive CRC screening involves detecting tumor markers from blood, feces, and other samples. Traditional methods such as the fecal occult blood test (FOBT) and carcinoembryonic antigen (CEA) suffer from low sensitivity and specificity. Machine learning offers a path to substantially improve existing biomarker performance. Li et al. extracted common markers from laboratory blood tests and tested five ML models (SVM, logistic regression, RF, k-nearest neighbors, and naive Bayes) across 1,164 electronic medical records, finding that the logistic regression model greatly improved CEA sensitivity and specificity for CRC identification.
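
The kind of head-to-head comparison Li et al. describe can be sketched as follows. This uses synthetic data in place of the authors' 1,164 medical records (their actual features and cohort are not reproduced here) and simply trains the same five model families, ranking them by AUC.

```python
# Sketch of a five-model comparison on routine blood markers.
# Data are synthetic stand-ins, not the authors' cohort.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic "blood panel": 1,164 records, 10 numeric markers.
X, y = make_classification(n_samples=1164, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "svm": SVC(probability=True),
    "random_forest": RandomForestClassifier(random_state=0),
    "knn": KNeighborsClassifier(),
    "naive_bayes": GaussianNB(),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name}: AUC = {auc:.3f}")
```

On real laboratory data, the winning model depends on the cohort; the paper's point is that even simple models like logistic regression can substantially sharpen an existing marker such as CEA.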

For genetic mutation detection, Zhang et al. used near-infrared spectroscopy combined with a counter-propagation artificial neural network (CP-ANN) to detect the BRAF V600E mutation in 312 tissue samples, achieving 100% sensitivity, 87.5% specificity, and 93.8% overall accuracy. DNA methylation biomarkers represent another promising avenue: Kel et al. analyzed data from 300 CRC patients using ML and bioinformatics methods, selecting six DNA methylation epigenetic biomarkers with optimal cancer detection potential.

Machine learning applied to whole-genome sequencing of plasma cell-free DNA from 817 plasma samples achieved an AUC of 0.849 for CRC detection, while a separate ML-based circulating DNA analysis across 289 healthy individuals and 983 patients reached a specificity of 0.89 and sensitivity of 0.72. Bioinformatics analysis of gene expression microarray data identified 105 differentially expressed genes and 10 hub genes, all with AUC values exceeding 0.92, confirming their potential as CRC biomarkers.

Further genomic research has explored non-coding RNAs (ncRNAs) including microRNAs (miRNAs) and long non-coding RNAs (lncRNAs) as biomarkers. Bioinformatics analysis confirmed that miR-31 was significantly increased in plasma and tissue of CRC patients with lymph node metastasis, predicting TNS1 as a targeted protein with prognostic value. Additionally, a LASSO-based model built from 480 CRC tissue samples and 41 normal samples predicted survival with 3- and 5-year AUC values of 0.69-0.73.
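
The LASSO technique used in that survival model can be illustrated in a few lines: its L1 penalty drives most coefficients to exactly zero, leaving a small signature of informative features. The data below are synthetic (not the 480/41 tissue cohort), with only the first five "genes" actually carrying signal.

```python
# Sketch of LASSO-based feature selection on synthetic expression data.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n_samples, n_genes = 200, 50
X = rng.normal(size=(n_samples, n_genes))

# Outcome depends on only 5 "genes"; the other 45 are pure noise.
true_coef = np.zeros(n_genes)
true_coef[:5] = [2.0, -1.5, 1.0, 0.8, -0.6]
y = X @ true_coef + rng.normal(scale=0.5, size=n_samples)

model = Lasso(alpha=0.1).fit(X, y)
selected = np.flatnonzero(model.coef_)   # indices with nonzero coefficients
print(f"{len(selected)} of {n_genes} features kept:", selected)
```

In biomarker studies, the surviving nonzero coefficients become the gene signature, whose combined score is then evaluated with survival-analysis AUCs as in the study above.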

TL;DR: AI-enhanced blood tests and biomarker screening outperform traditional methods. CP-ANN detected the BRAF V600E mutation with 100% sensitivity and 87.5% specificity (312 samples). ML models identified hub genes with AUC above 0.92, and plasma cell-free DNA analysis achieved an AUC of 0.849 for CRC detection.
Pages 4-5
AI-Assisted Pathology: CNN-Based Image Classification and Survival Prediction

Pathology remains the gold standard for tumor diagnosis, identifying cell types, staging tumors, guiding treatment, and predicting prognosis. With the rise of digital pathology (DP), AI-powered computer-aided diagnostic systems can perform image retrieval, pattern recognition, and automatic identification of regions of interest (ROI). In a landmark study, researchers trained a CNN with transfer learning (TL) on over 100,000 H&E image patches from 86 CRC tissue slides, achieving a nine-class accuracy exceeding 94% on 7,180 independent image patches from 25 CRC patients, confirming the ability of CNNs to separate histological images and predict survival.

Wang et al. proposed a CNN-based method to classify large volumes of histopathological images, reaching an AUC of 0.988 for distinguishing CRC from benign tissues. Another study using CNN combined with recurrent neural networks (RNN) on 4,036 whole-slide images (WSIs) achieved an AUC of 0.96 for adenocarcinoma and 0.99 for adenoma classification. A deep learning screening algorithm applied to 294 WSIs reported an AUC of 0.917 with 97.4% sensitivity. The HCCANet model, a CNN using multichannel fusion attention, achieved 87.3% overall accuracy and an average AUC of 0.9 for image grading across 630 images.

For histopathologic segmentation, a CNN with transfer learning applied to 25 WSIs achieved a Dice Similarity Index of 82.74% and an accuracy of 87.07%. An ANN/SVM approach on 5,000 histopathology image tiles reached 95.3% classification accuracy. To address the heavy data-labeling burden of supervised learning, researchers proposed semi-supervised learning (SSL) based on the mean-teacher architecture, demonstrating that SSL significantly reduces the amount of labeled data needed while achieving comparable performance.
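
The core of the mean-teacher idea fits in a few lines. This is a toy sketch, not the authors' model: the teacher's weights are an exponential moving average (EMA) of the student's weights, giving unlabeled images a smoother, more stable target to be scored against.

```python
# Toy illustration of the mean-teacher EMA update (not the paper's network).
import numpy as np

def ema_update(teacher_w, student_w, alpha=0.99):
    """One mean-teacher step: teacher <- alpha*teacher + (1-alpha)*student."""
    return alpha * teacher_w + (1 - alpha) * student_w

teacher = np.zeros(3)
student = np.ones(3)      # pretend the student just took a gradient step
for _ in range(100):      # the teacher slowly tracks the student
    teacher = ema_update(teacher, student)
print(teacher)            # approaches [1, 1, 1] as training continues
```

In the full SSL setup, the student is trained on the few labeled images plus a consistency loss that pushes its predictions on unlabeled images toward the teacher's, which is how the labeling requirement shrinks.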

A combined nomogram model integrating ML-pathomics, immunoscore, radiomics, and clinical factors was shown to effectively predict postoperative prognosis for CRC patients with lung metastasis, indicating that AI-assisted pathology is moving beyond primary screening toward treatment guidance and prognosis prediction.

TL;DR: CNN-based pathology models achieve remarkable results: AUC of 0.988 for tissue classification, 94%+ accuracy for nine-class histological separation, and AUC of 0.96-0.99 for WSI-based adenocarcinoma/adenoma classification. Semi-supervised learning reduces labeling burdens while maintaining performance.
Pages 5-6
Radiomics for Tumor Segmentation, Staging, and Metastasis Prediction

Radiomics converts medical images into high-dimensional quantitative data for cancer diagnosis and prognosis. While conventional imaging evaluation (MRI, CT, ultrasonography) has limitations such as low tumor staging accuracy and excessive reliance on individual physician expertise, AI-enhanced radiomics extracts information from imaging data for tumor segmentation, feature extraction, and quantitative evaluation.

Tumor segmentation: Liu et al. proposed a label assignment generative adversarial network (LAGAN) for accurate ROI segmentation in CT-based CRC diagnosis, achieving DSC values of 90.82% (FCN32) and 91.54% (U-Net). Hamabe et al. developed a U-Net deep neural network for MRI-based rectal cancer segmentation across 201 preoperative patients, obtaining DSC values of 0.930 (rectum), 0.917 (mesorectum), and 0.727 (tumor) compared to manual segmentation. For diffusion-weighted MRI from 300 locally advanced rectal cancer (LARC) patients, U-Net achieved a mean DSC of 0.675 and median DSC of 0.702.
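
The Dice Similarity Coefficient quoted throughout these segmentation results is a simple overlap measure between a predicted mask and a manually drawn one. The masks below are tiny hypothetical examples, not real imaging data.

```python
# How DSC compares a predicted segmentation against ground truth.
import numpy as np

def dice(pred: np.ndarray, truth: np.ndarray) -> float:
    """DSC = 2|A ∩ B| / (|A| + |B|) for binary masks."""
    intersection = np.logical_and(pred, truth).sum()
    return 2.0 * intersection / (pred.sum() + truth.sum())

# Hypothetical 8x8 masks: prediction partially overlaps the ground truth.
truth = np.zeros((8, 8), dtype=bool); truth[2:6, 2:6] = True  # 16 px
pred  = np.zeros((8, 8), dtype=bool); pred[3:7, 3:7]  = True  # 16 px
print(f"DSC = {dice(pred, truth):.3f}")  # overlap 3x3=9 -> 2*9/32 = 0.562
```

A DSC of 1.0 means the AI mask matches the manual contour exactly; the 0.727-0.930 values above show organs (rectum, mesorectum) are easier to delineate than tumors themselves.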

Detection and metastasis prediction: A deep learning-based lesion detection algorithm (DLLD) applied to 4,386 CT images from 502 patients achieved 81.82% sensitivity for detecting hepatic metastases. A ResNet-based model predicted colorectal liver metastasis (CRLM) response to chemotherapy with an AUC of 0.903 (192 patients). For circumferential resection margin evaluation, a Faster R-CNN model analyzing 12,258 T2-weighted images achieved 93.2% accuracy, 83.8% sensitivity, and 95.6% specificity. Formal methods combined with CT monitoring for liver metastasis reached 100% precision and 93.3% overall accuracy (30 patients).

Treatment response prediction: Deep learning applied to T2W MRI predicted neoadjuvant chemoradiotherapy response with an AUC of 0.99 (383 participants). MRI-based radiomics for preoperative assessment of rectal cancer pathological features achieved AUC of 0.809 (MLP) and 0.746 (RF) with sensitivities of 76.2% and 79.3% respectively (152 patients). Optical coherence tomography (OCT) differentiation using ResNet reached an AUC of 0.975, and real-time OCT diagnosis using deep learning across 26,000 images achieved an AUC of 0.998.

TL;DR: AI-powered radiomics achieves strong results across CRC imaging tasks: U-Net tumor segmentation (DSC 0.727-0.930), Faster R-CNN margin evaluation (93.2% accuracy), ResNet liver metastasis prediction (AUC 0.903), and deep learning treatment response prediction (AUC 0.99 on MRI).
Pages 6-7
Robotic Surgery and AI-Assisted Surgical Workflow Recognition

AI applications in CRC surgery encompass both computer vision (CV) for surgical video analysis and robotic surgical systems. Japanese researchers collected and analyzed 300 laparoscopic colorectal surgery videos and trained a CNN that achieved 81.0% accuracy for surgical phase recognition and 83.2% for action recognition. Another CNN-based deep learning approach applied to 71 laparoscopic sigmoidectomy videos achieved 91.9% accuracy for real-time automatic surgical phase recognition.

The da Vinci robotic system, now in its fourth generation, is the most widely used surgical robot. A longitudinal prospective cohort study of 206 robot-assisted colorectal surgery (RACRS) patients reported radical margin rates of 99.3% (colon) and 89.6% (rectal), with average lymph node yields of 16 nodes and locoregional recurrence rates of 3.8% (colon) and 9.5% (rectal). The fourth-generation system offers an improved cantilever design, finer visualization, and more precise operation, while the single-port da Vinci SP system has been validated for transanal total mesorectal excision (taTME).

The Senhance digital laparoscopy system was evaluated across 55 patients, demonstrating feasibility for colorectal procedures including ileocecal resection (32.7%), high anterior resection (20%), and D3 dissection (74.5%). Igaki et al. developed a flat image navigation system to help surgeons identify anatomical tissue during TME, achieving a Dice coefficient of 0.84 across 600 images from 32 videos. South Korean researchers analyzed 10,000 indocyanine green (ICG) curves from 50 patients, classifying them into 25 curve patterns to create an AI-based real-time microcirculation analysis system for laparoscopic surgery.

Compared with open and laparoscopic surgery, robotic surgery offers advantages including shorter hospital stays, less perioperative bleeding, fewer complications, and improved postoperative quality of life. Long-term recurrence and mortality rates are comparable to laparoscopic surgery. However, high cost remains the primary barrier to wider adoption, requiring government financial support and market standardization to reduce expenses.

TL;DR: AI achieves 91.9% accuracy in real-time surgical phase recognition. The da Vinci system delivers 99.3% radical margin rates in colon surgery (206 patients). Robotic surgery offers shorter stays and fewer complications, though high cost remains the main barrier to adoption.
Pages 7-8
AI for Neoadjuvant Therapy Decisions and Treatment Response Prediction

Neoadjuvant chemoradiotherapy (NCRT) is critically important for CRC treatment, particularly for rectal cancer patients. AI-based clinical decision support systems (CDSSs) have been developed to improve treatment decisions and efficacy evaluation. South Korean researchers created the first CDSS in the country reflecting real chemotherapy data, achieving satisfactory accuracy (AUC above 0.95). Kleppe et al. developed the DoMore-v1-CRC marker, a DL-based system that creates a new risk classification for post-colectomy patients, exempting low-risk patients from unnecessary NCRT and significantly improving their survival rates.

The RAPIDS (Radiopathomics Integrated Prediction System) studied 933 patients and predicted pathological complete response (pCR) after NCRT with an AUC of 0.812, sensitivity of 0.888, specificity of 0.740, negative predictive value (NPV) of 0.929, and positive predictive value (PPV) of 0.512. Ferrari et al. built an AI model based on MRI texture features to evaluate pCR in 55 LARC patients after NCRT, achieving an AUC of 0.86. A deep neural network (DNN) study of 95 patients predicted NCRT effect with 80% accuracy.
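
The four RAPIDS-style metrics all derive from one confusion matrix. Here is a self-contained sketch using hypothetical counts (tp, fp, tn, fn), not the RAPIDS cohort data, to show how each number is computed.

```python
# Diagnostic metrics from a confusion matrix (hypothetical counts).
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    return {
        "sensitivity": tp / (tp + fn),  # true-positive rate: responders caught
        "specificity": tn / (tn + fp),  # true-negative rate: non-responders cleared
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }

# Illustrative counts only:
m = diagnostic_metrics(tp=80, fp=76, tn=216, fn=10)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

The combination reported for RAPIDS, high NPV with modest PPV, is clinically meaningful: a negative prediction reliably rules out complete response, while a positive one still needs confirmation.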

For response prediction using imaging, a logistic regression model applied to 136 rectal cancer patients showed progressive improvement: pre-NCRT AUC of 0.751 (66% sensitivity, 87.2% specificity), early post-treatment AUC of 0.831 (71% sensitivity, 86.1% specificity), and combined analysis AUC of 0.873 (75% sensitivity, 91.7% specificity). Feedforward neural networks (FFN), logistic regression, and SVM models applied to 226 LARC patients achieved accuracy of 0.67-0.75 and AUC of 0.76-0.83. Farrando et al. predicted LARC patient response by evaluating lncRNA expression, reaching an AUC of 0.93.

A multi-scale convolutional neural network (MSCNN) assessed 150 WSIs and achieved AUC of 0.9337 (Camelyon dataset) and 0.9091 (MSKCC dataset) for predicting NCRT effect. These AI-driven approaches demonstrate that combining imaging, pathology, and genomic data can guide more personalized treatment decisions, sparing patients from unnecessary therapy while ensuring those who need treatment receive optimal care.

TL;DR: AI predicts NCRT response with high accuracy: RAPIDS achieved AUC of 0.812 across 933 patients, combined imaging models reached AUC of 0.873, and lncRNA-based classifiers achieved AUC of 0.93. The DoMore-v1-CRC marker spares low-risk patients from unnecessary chemotherapy.
Pages 8-9
AI in Genetic Mutation Prediction and Precision Targeted Treatment

Targeted therapy is a key treatment approach for CRC, with the epidermal growth factor receptor (EGFR) serving as a vital drug target. Because KRAS mutation status largely determines whether a tumor will respond to EGFR-targeted therapy, non-invasive prediction of KRAS status is critically important for treatment selection. Researchers used a deep learning method based on a residual neural network (ResNet) to predict KRAS mutation status, achieving a high AUC of 0.90 on the test set. This capability supports more precise targeted treatment decisions for CRC patients.

The BRAF gene mutation rate in CRC can reach as high as 10%. Beal et al. used a random forest (RF) model to predict the V600E mutation in BRAF, demonstrating that simpler, cost-effective models can deliver reliable genetic mutation detection. Russo et al. applied an AI-based prediction model to analyze patients who may develop drug resistance before and after treatment, achieving an average AUC of 0.90 for classification and targeted precision treatment. These findings confirm that AI-based genetic mutation detection is a reliable and potentially affordable approach.

At the genomic scale, Hu et al. studied competitive endogenous RNA (ceRNA) networks involving lncRNA and identified 144 core genes as potential drug targets for CRC treatment. Meanwhile, the molecular complex detection (MCODE) algorithm was used to extract gene expression profiles from databases, identifying 8,931 differentially expressed genes (DEGs) in CRC patients and discovering therapy targets. A computer-aided drug design approach screened 1,443 approved drugs for candidates targeting p53, and a machine learning phenomics (MLP) model studying CRC cell gene expression showed a mean accuracy improvement of 9.48% over single-track approaches.

Abnormal gene and chromosome mutations can cause drug resistance, creating major obstacles to CRC treatment. AI-driven drug delivery platforms and precision targeting of resistance mechanisms offer promising solutions. A cascaded atrous convolution with spatial pyramid pooling (CAC-SPP) model evaluated tumor target segmentation with DSC values of 0.78 and 0.85, supporting precision radiotherapy planning. Together, these approaches demonstrate that AI at the genetic scale provides both theoretical support and practical tools for precision medicine in CRC.

TL;DR: AI predicts KRAS mutation status with AUC of 0.90 using ResNet, while RF models detect BRAF V600E mutations reliably. Genomic AI identified 144 potential drug target genes and 8,931 DEGs in CRC patients, supporting precision targeted therapy and drug resistance prediction.
Pages 9-10
Current Limitations and the Path Forward for AI in CRC

Data challenges: The development of AI requires three critical elements: big data, computing power, and algorithm models. In many countries, medical big data infrastructure is still in its early stages. More high-quality data is needed, and stronger interaction between data centers is essential. Without large volumes of high-quality data, even the most advanced algorithms will fall short. The authors emphasize the urgent need to standardize medical big data and increase interoperability among multiple centers.

Reproducibility concerns: Most AI models in CRC are based on retrospective data with strict inclusion and exclusion criteria. Imaging standards differ across centers, raising questions about model reproducibility and real-world effectiveness. The results obtained through deep learning still lack interpretability, creating the well-known "black box" problem. The algorithm cannot clearly explain its reasoning process or the internal information driving its decisions, which poses fundamental challenges for clinical trust and regulatory approval.

Interpretability efforts: Researchers have begun studying how DL models make decisions based on images and additional information, analyzing causal relationships within the black box. Shao et al. used a DNN model to predict one-year survival rates for over 20,000 patients after major cardiovascular surgery, defining an innovative "impact score" to explain model predictions. Although the field of interpretable deep learning is growing, transparency and interpretability of algorithms must remain core principles to ensure that medical staff, patients, and other stakeholders can fully understand clinical decisions.

The authors conclude that AI has provided broad prospects for CRC diagnosis and treatment in the era of precision medicine. However, AI is not meant to replace clinicians. Instead, stronger cooperation between clinicians and computer experts is needed to overcome transformation barriers. Continuous optimization of AI systems, combined with multiple medical imaging and molecular data, will improve early CRC detection rates, enable systematic patient evaluation, enhance adjuvant therapy effects, and strengthen prognosis monitoring. Evaluating clinicians' acceptance of different AI systems and minimizing interference in clinical workflows remain important considerations for successful adoption.

TL;DR: Key challenges include limited high-quality data, lack of multi-center standardization, retrospective study biases, and the "black box" interpretability problem. The path forward requires clinician-AI collaboration, standardized medical data, and interpretable models rather than full automation.
Citation: Yin Z, Yao C, Zhang L, Qi S. Application of artificial intelligence in diagnosis and treatment of colorectal cancer: a novel prospect. Frontiers in Medicine, 2023. Open access. Available at PMC10030915. DOI: 10.3389/fmed.2023.1128084. License: CC BY.