Establishment of prediction models for COVID-19 patients in different age groups based on random forest algorithm

QJM. 2021 Oct 20:hcab268. doi: 10.1093/qjmed/hcab268. Online ahead of print.


BACKGROUND: Coronavirus disease 2019 (COVID-19) has rapidly become a global pandemic. Age is an independent factor in death from the disease, and predictive models to stratify patients according to their mortality risk are needed.

AIM: To compare the laboratory parameters of younger (≤70 years) and elderly (>70 years) patients, and to develop mortality prediction models for the two groups according to age stratification.

DESIGN: A retrospective, single-center observational study.

METHODS: This study included 437 hospitalized patients with laboratory-confirmed COVID-19 from Tongji Hospital in Wuhan, China, in 2020. Epidemiological information, laboratory data, and outcomes were extracted from electronic medical records and compared between elderly and younger patients. First, recursive feature elimination (RFE) was used to select the optimal feature subset. Then, two random forest (RF) models were built to predict the prognoses of COVID-19 patients and identify the optimal diagnostic predictors for patients’ clinical prognoses.
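The two-step procedure in the Methods (RFE followed by an RF model, fitted separately for each age group) can be sketched roughly as below. This is only an illustration under stated assumptions, not the authors' implementation: it uses scikit-learn, synthetic data in place of the patient records, a hypothetical helper named `fit_rfe_rf`, a 70/30 train/test split, and a target of 11 features borrowed from the Results. The abstract does not say which software, split, or hyperparameters were used.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split


def fit_rfe_rf(X, y, n_features=11, seed=0):
    """Hypothetical helper: select features with RFE, then fit an RF model.

    Returns the boolean mask of selected features and the hold-out AUC.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )
    # RFE repeatedly drops the least important feature (step=1), ranked by
    # the random forest's feature importances, until n_features remain.
    selector = RFE(
        RandomForestClassifier(n_estimators=50, random_state=seed),
        n_features_to_select=n_features,
        step=1,
    )
    selector.fit(X_train, y_train)

    # Final RF model trained only on the selected subset.
    model = RandomForestClassifier(n_estimators=200, random_state=seed)
    model.fit(selector.transform(X_train), y_train)
    prob = model.predict_proba(selector.transform(X_test))[:, 1]
    return selector.support_, roc_auc_score(y_test, prob)


# Synthetic stand-in for laboratory data (assumption: 20 candidate variables).
X, y = make_classification(
    n_samples=400, n_features=20, n_informative=6, random_state=0
)
age = np.random.default_rng(0).integers(30, 91, size=400)  # synthetic ages

# One model per age stratum, mirroring the <=70 / >70 split in the abstract.
results = {}
for name, in_group in {"younger (<=70)": age <= 70, "elderly (>70)": age > 70}.items():
    results[name] = fit_rfe_rf(X[in_group], y[in_group])

for name, (mask, auc) in results.items():
    print(name, "selected features:", int(mask.sum()), "AUC:", round(auc, 3))
```

Fitting the selector on the training portion only keeps the hold-out AUC from being inflated by feature selection that has already seen the test data.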

RESULTS: Comparison of the laboratory data revealed that many laboratory indicators differed between the two age groups. Recursive feature elimination (RFE) was used to select the optimal feature subset for analysis, from which 11 variables were selected for the two groups. RF models were built on the best subset to predict the prognoses of COVID-19 patients, and the areas under the ROC curve (AUCs) for the two groups were 0.874 (95% CI: 0.833-0.915) and 0.842 (95% CI: 0.765-0.920).
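The abstract reports each AUC with a 95% confidence interval but does not state how the interval was computed. One common option, shown here purely as an assumption, is a percentile bootstrap over the evaluated patients. The helper name `bootstrap_auc_ci` and the synthetic labels and scores are hypothetical.

```python
import numpy as np
from sklearn.metrics import roc_auc_score


def bootstrap_auc_ci(y_true, y_score, n_boot=1000, alpha=0.05, seed=0):
    """Hypothetical helper: point AUC plus a percentile-bootstrap CI."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    n = len(y_true)
    aucs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample patients with replacement
        if np.unique(y_true[idx]).size < 2:
            continue  # AUC is undefined when a resample has only one class
        aucs.append(roc_auc_score(y_true[idx], y_score[idx]))
    lower, upper = np.percentile(aucs, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return roc_auc_score(y_true, y_score), lower, upper


# Synthetic outcomes and predicted risks (not patient data).
rng = np.random.default_rng(1)
y_true = rng.integers(0, 2, size=200)
y_score = 0.5 * y_true + rng.normal(0.0, 0.5, size=200)

auc, lower, upper = bootstrap_auc_ci(y_true, y_score)
print(f"AUC {auc:.3f} (95% CI: {lower:.3f}-{upper:.3f})")
```

Other interval methods, such as DeLong's, would give somewhat different bounds, so the reported intervals should not be expected to match a bootstrap exactly.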

CONCLUSION: Two prediction models based on the random forest algorithm were developed for patients with COVID-19, providing a simple tool for the early prediction of COVID-19 mortality.

PMID:34668535 | DOI:10.1093/qjmed/hcab268