Adversarial training for prostate cancer classification using magnetic resonance imaging

Quant Imaging Med Surg. 2022 Jun;12(6):3276-3287. doi: 10.21037/qims-21-1089.


BACKGROUND: To use adversarial training to increase the generalizability and diagnostic accuracy of deep learning models for prostate cancer diagnosis.

METHODS: This multicenter study retrospectively included 396 prostate cancer patients who underwent magnetic resonance imaging (development set, 297 patients from Shanghai Jiao Tong University Affiliated Sixth People’s Hospital and Eighth People’s Hospital; test set, 99 patients from Renmin Hospital of Wuhan University). Two binary classification deep learning models for clinically significant prostate cancer classification [PM1, pretraining Visual Geometry Group network (VGGNet)-16-based model 1; PM2, pretraining residual network (ResNet)-50-based model 2] and two multiclass classification deep learning models for prostate cancer grading (PM3, pretraining VGGNet-16-based model 3; PM4, pretraining ResNet-50-based model 4) were built using apparent diffusion coefficient and T2-weighted images. These models were then retrained with adversarial examples, starting from the initial random model parameters (AM1, adversarial training VGGNet-16 model 1; AM2, adversarial training ResNet-50 model 2; AM3, adversarial training VGGNet-16 model 3; AM4, adversarial training ResNet-50 model 4, respectively). To verify whether adversarial training improves the effectiveness of the diagnostic models, we compared the diagnostic performance of each deep learning model before and after adversarial training. Receiver operating characteristic curve analysis was performed to evaluate the clinically significant prostate cancer classification models, and differences in areas under the curve (AUCs) were compared using DeLong’s test. The quadratic weighted kappa (κ) score was used to evaluate the prostate cancer grading models.
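
As an illustration of the adversarial retraining step described above, the following is a minimal sketch only. The abstract does not state which attack was used to generate the adversarial examples, so the fast gradient sign method (FGSM) is assumed here, and a toy logistic regression stands in for the VGGNet-16 and ResNet-50 networks. All function names, the zero initialization, and the values of `eps`, `lr`, and `epochs` are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, w, b):
    # Probability of the positive class under a logistic model.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def sign(v):
    return (v > 0) - (v < 0)

def fgsm_example(x, y, w, b, eps):
    # For binary cross-entropy, d(loss)/dx = (p - y) * w.
    # FGSM (assumed attack) moves x by eps along the sign of that gradient.
    p = predict(x, w, b)
    return [xi + eps * sign((p - y) * wi) for xi, wi in zip(x, w)]

def adversarial_train(data, n_features, eps=0.1, lr=0.5, epochs=200):
    # Toy stand-in: weights start at zero rather than at random values.
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        for x, y in data:
            # Craft the adversarial example against the current weights,
            # then take one gradient step on the clean and perturbed inputs.
            x_adv = fgsm_example(x, y, w, b, eps)
            for sample in (x, x_adv):
                err = predict(sample, w, b) - y
                w = [wi - lr * err * si for wi, si in zip(w, sample)]
                b -= lr * err
    return w, b
```

In this sketch each training sample contributes two updates per epoch, one clean and one perturbed, which is one common way to mix adversarial examples into training; the study's actual mixing scheme is not given in the abstract.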

RESULTS: AM1 and AM2 had significantly higher AUCs than PM1 and PM2 in the internal validation dataset (0.89 vs. 0.84 and 0.87 vs. 0.83, respectively) and the test dataset (0.86 vs. 0.73 and 0.82 vs. 0.72, respectively). AM3 and AM4 showed higher κ values than PM3 and PM4 in the internal validation dataset {0.292 [95% confidence interval (CI): 0.178-0.405] vs. 0.266 (95% CI: 0.152-0.379) and 0.279 (95% CI: 0.163-0.396) vs. 0.254 (95% CI: 0.159-0.390), respectively} and the test dataset [0.268 (95% CI: 0.109-0.427) vs. 0.196 (95% CI: 0.029-0.362) and 0.228 (95% CI: 0.068-0.389) vs. 0.183 (95% CI: 0.015-0.351), respectively].
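
The κ values above are quadratic weighted kappa scores, which penalize a grading error by the squared distance between the predicted and true grade. A minimal sketch of that computation follows. It assumes grades are encoded as integers from 0 to `n_classes - 1`; the function name and the encoding are hypothetical, and the number of grade classes used in the study is not stated in the abstract.

```python
def quadratic_weighted_kappa(y_true, y_pred, n_classes):
    n = len(y_true)
    # Observed confusion matrix: rows are true grades, columns are predictions.
    observed = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1
    row = [sum(r) for r in observed]
    col = [sum(observed[i][j] for i in range(n_classes))
           for j in range(n_classes)]
    num = 0.0
    den = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            # Quadratic penalty grows with the squared grade distance.
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            # Count expected by chance if raters were independent.
            expected = row[i] * col[j] / n
            num += weight * observed[i][j]
            den += weight * expected
    # Undefined (division by zero) if every label falls in a single class.
    return 1.0 - num / den
```

A score of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate systematic disagreement.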

CONCLUSIONS: Using adversarial examples to train prostate cancer classification deep learning models can improve their generalizability and classification abilities.

PMID:35655831 | PMC:PMC9131330 | DOI:10.21037/qims-21-1089