“In recent artificial intelligence (AI)-based [computer-aided detection], the AI algorithm abstracts mammographic features as a descriptor. The difference between human-designed and self-learned descriptors is the main success factor of current deep learning algorithms. It has already been reported that AI can achieve similar performance to experts in medical image analysis,” the researchers noted.
The AI algorithm was trained on 170,230 mammography examinations obtained from five institutions in the United States, the UK, and South Korea: 36,468 cancer-positive examinations confirmed by biopsy; 59,544 benign examinations confirmed by biopsy (8,827 mammograms) or follow-up imaging (50,717 mammograms); and 74,218 normal examinations. Two of the institutions provided 320 mammograms (160 cancer-positive, 64 benign, 96 normal) for a reader study. Fourteen participating radiologists made the following assessments for each mammogram, first without and then with the AI's assistance: likelihood of malignancy (LOM), location of malignancy, and whether the patient should be recalled. The AI and the radiologists were compared on LOM-based area under the receiver operating characteristic curve (AUROC) and on recall-based sensitivity and specificity.
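To make the two kinds of metrics concrete: AUROC is computed from the continuous LOM scores (threshold-free), while sensitivity and specificity follow from the binary recall decisions. The sketch below illustrates both on simulated toy data; the variable names, score scale, and recall threshold are illustrative assumptions, not details from the study.

```python
# Illustrative sketch (not the study's code): the two metric families used
# to compare the AI and the radiologists, computed on simulated toy data.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Ground truth: 1 = cancer-positive, 0 = benign/normal.
y_true = rng.integers(0, 2, size=200)

# Continuous likelihood-of-malignancy (LOM) scores; simulated here so that
# cancer-positive cases tend to score higher (scale is an assumption).
lom = rng.normal(loc=40, scale=15, size=200) + 25 * y_true

# Binary recall decision, here modeled as thresholding the LOM score
# (the threshold of 50 is hypothetical).
recall = (lom >= 50).astype(int)

# LOM-based AUROC: threshold-free ranking performance.
auroc = roc_auc_score(y_true, lom)

# Recall-based sensitivity and specificity: threshold-dependent performance.
tp = np.sum((recall == 1) & (y_true == 1))
fn = np.sum((recall == 0) & (y_true == 1))
tn = np.sum((recall == 0) & (y_true == 0))
fp = np.sum((recall == 1) & (y_true == 0))
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)

print(f"AUROC={auroc:.3f} sensitivity={sensitivity:.3f} specificity={specificity:.3f}")
```

The two families answer different questions: AUROC summarizes how well the scores rank cancers above non-cancers across all possible thresholds, whereas sensitivity and specificity describe performance at the one operating point actually used for recall decisions.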
AI Algorithm Outperforms, Assists Radiologists
The AI algorithm's overall standalone performance was an AUROC of 0.959 (95% confidence interval [CI], 0.952–0.966). In separate analyses of each country's dataset, the AUROC was 0.970 (95% CI, 0.963–0.978) for the South Korean dataset, 0.953 (95% CI, 0.938–0.968) for the U.S. dataset, and 0.938 (95% CI, 0.918–0.958) for the UK dataset.
In the reader study, the AI algorithm (AUROC 0.940; 95% CI, 0.915–0.965) significantly outperformed the radiologists reading without AI assistance (0.810; 95% CI, 0.770–0.850; P<0.0001), and the radiologists' performance improved significantly with the addition of AI assistance (0.881; 95% CI, 0.850–0.911; P<0.0001). Compared with the radiologists, the AI had higher sensitivity for cancers presenting as a mass (53 [90%] vs. 46 [78%] of 59 cancers detected; P=0.044) or as distortion/asymmetry (18 [90%] vs. 10 [50%] of 20; P=0.023); the AI also detected more T1 cancers (73 [91%] vs. 59 [74%] of 80; P=0.0039) and node-negative cancers (104 [87%] vs. 88 [74%] of 119; P=0.0025).
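These sensitivity comparisons are paired: the AI and the radiologists read the same cancers, so each case falls into agree/disagree cells. A standard way to test such a paired difference in proportions is an exact McNemar test on the discordant pairs; the sketch below shows the idea. Note the discordant counts used here are hypothetical (chosen only to match the reported 73 vs. 59 of 80 T1 detections), and the study itself may have used a different test.

```python
# Hedged sketch: exact McNemar-style test for a paired sensitivity difference,
# e.g. AI vs. radiologists on the same 80 T1 cancers. The discordant counts
# below are HYPOTHETICAL illustrations, not figures from the study.
from scipy.stats import binomtest

# Discordant cells drive the test; the split shown reproduces the reported
# net difference (73 - 59 = 14 more detections by AI) but is otherwise assumed.
ai_only = 18        # hypothetical: detected by AI, missed by radiologists
readers_only = 4    # hypothetical: detected by radiologists, missed by AI

# Under H0 (equal sensitivity), a discordant case is equally likely to land
# in either cell, i.e. ai_only ~ Binomial(ai_only + readers_only, 0.5).
result = binomtest(ai_only, ai_only + readers_only, p=0.5)
print(f"two-sided exact p = {result.pvalue:.4f}")
```

The concordant cases (detected by both, or missed by both) carry no information about which reader is more sensitive, which is why only the discordant counts enter the test.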
The study authors, whose work was published in The Lancet Digital Health, concluded that “the AI algorithm we developed with large-scale high-quality data showed better diagnostic performance than radiologists in breast cancer detection from mammograms. More importantly, the diagnostic performance of radiologists was significantly improved with the assistance of AI. This result shows that AI can be used as an effective diagnostic support tool for breast cancer detection, which is worth evaluating in prospective clinical trials.”