A combined approach of crowd innovation and artificial intelligence (AI) algorithms can rapidly produce tumor targeting results for radiation oncologists, according to a study published in JAMA Oncology.
To complete this study, researchers conducted a 10-week, prize-based, online, three-phase challenge. A total of 564 contestants registered for the challenge, 34 of whom submitted 45 algorithms. The algorithms were scored on an independent dataset that the research team withheld from contestants, using quantitative metrics that assessed the overlap between each algorithm's automated segmentations and an expert's segmentations. A clinical care expert generated a carefully selected dataset for the contest, including computed tomography (CT) scans and lung tumor segmentations from 461 patients (median, 157 images per scan; 77,942 images in total; 8,144 images with tumor present).
To develop their algorithms, contestants were supplied with a training set of 229 CT scans with accompanying expert outlines and received feedback on their performance from the study's expert clinician throughout the contest.
The contest took place on Topcoder.com, a commercial platform that hosts thousands of online algorithm challenges in which programmers compete for prizes. Participants learned about the underlying medical problem through online video and written materials in which the clinician demonstrated manual lung tumor segmentation.
During each of the three study phases, contestants ran their algorithms on the validation set to produce segmentations and received real-time evaluation of their algorithms' performance based on segmentation scores (S scores), which were posted on a public leaderboard. At the end of each phase, prizes were awarded to the contestants whose algorithms achieved the highest S scores.
Algorithms Prove Accurate
According to the study results, the automated segmentations produced by the top five AI algorithms, when combined in an ensemble model, showed an accuracy (Dice coefficient = 0.79) within the benchmark of mean interobserver variation measured among six human experts. In the first phase, the top seven algorithms had mean custom S scores on the holdout dataset ranging from 0.15 to 0.38, with suboptimal performance on relative measures of error.
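The Dice coefficient cited above is a standard measure of overlap between two segmentations. As a minimal sketch (the masks and values here are toy data, not from the study), it can be computed for binary masks like this:

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks.

    Returns 1.0 for identical masks, 0.0 for no overlap.
    """
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    total = pred.sum() + truth.sum()
    if total == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(pred, truth).sum() / total

# Toy 4x4 masks standing in for a CT slice (hypothetical data)
pred = np.array([[0, 1, 1, 0],
                 [0, 1, 1, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]])
truth = np.array([[0, 1, 1, 0],
                  [0, 1, 0, 0],
                  [0, 0, 0, 0],
                  [0, 0, 0, 0]])
print(round(dice_coefficient(pred, truth), 3))  # → 0.857
```

A score of 0.79 against expert outlines, as the ensemble achieved, therefore indicates substantial but not perfect voxel-level agreement.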
In phase two, mean S scores increased to 0.53 to 0.57, with similar improvements in other performance metrics. In the third phase, performance of the top algorithm increased by an additional 9%. Combining the top five algorithms from phases two and three in an ensemble model yielded a further 9% to 12% improvement, with a final S score of 0.68.
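The article does not describe how the ensemble combined the top algorithms' outputs; one common approach for binary segmentation masks, shown here purely as an illustrative sketch with hypothetical data, is a per-voxel majority vote:

```python
import numpy as np

def majority_vote_ensemble(masks: list) -> np.ndarray:
    """Combine binary segmentation masks by per-voxel strict majority vote."""
    stacked = np.stack([np.asarray(m).astype(bool) for m in masks])
    votes = stacked.sum(axis=0)          # how many models marked each voxel
    return (votes * 2 > len(masks)).astype(np.uint8)

# Three hypothetical 1-D "masks" standing in for flattened CT voxels
m1 = np.array([1, 1, 0, 0, 1])
m2 = np.array([1, 0, 0, 1, 1])
m3 = np.array([1, 1, 0, 0, 0])
print(majority_vote_ensemble([m1, m2, m3]))  # → [1 1 0 0 1]
```

Averaging several independently trained models in this way tends to cancel out individual models' idiosyncratic errors, which is consistent with the ensemble outperforming any single algorithm in the study.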
“Using crowd innovation to generate clinically relevant AI algorithms will allow sharing of scarce human expertise to under-resourced healthcare settings to improve the quality of cancer care globally,” the researchers wrote about the findings.
“These AI algorithms could improve cancer care globally by transferring the skills of expert clinicians to under-resourced healthcare settings,” the authors concluded.