Evaluating the Accuracy of Risk Assessment Models for Venous Thromboembolism

A study found that risk assessment models (RAMs) have weak predictive accuracy for venous thromboembolism (VTE). The results were published in BMJ Open.

RAMs are developed and used to stratify the risk of VTE among hospitalized patients. They use clinical information from a patient’s health and exam history to identify those at the highest risk of developing VTE, and those who might benefit the most from preventative therapy. The researchers wrote: “While RAMs could improve the ratio of benefit to risk and benefit to cost, it is unclear which VTE RAM should be applied to guide decision-making for prophylaxis in clinical practice and thereby optimize patient care.”

Previous reviews have identified various RAMs used to stratify VTE risk in hospitalized patients, but these reviews did not yield evidence as to which RAM was superior.

In a systematic search of five electronic databases (including MEDLINE, EMBASE and the Cochrane Library) from inception to February 2021, the researchers identified a total of 6,355 records, of which 51 studies were included, covering 24 unique validated RAMs. Studies were deemed eligible if they examined the accuracy of a multivariable RAM (or scoring system) for predicting the risk of developing VTE in hospitalized inpatients. Most of the studies included hospital inpatients who required medical care (21 studies), were undergoing surgery (15 studies) or were receiving care for trauma (4 studies).

Findings, Strengths, and Limitations

According to the results, the most widely evaluated RAMs were the Caprini RAM (22 studies), the Padua prediction score (16 studies), the IMPROVE models (8 studies), the Geneva risk score (4 studies) and the Kucher score (4 studies). The researchers observed that no single RAM notably outperformed the others. Across all models, C-statistics (a measure of how well a model discriminates between patients who develop VTE and those who do not) were often weak (<0.7), sometimes good (0.7–0.8) and in a few cases excellent (>0.8). Similarly, the researchers noted, estimates for sensitivity and specificity were highly variable. Sensitivity estimates ranged from 12.0% to 100% and specificity estimates ranged from 7.2% to 100%.
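For readers unfamiliar with these measures, the following is a minimal Python sketch of how a C-statistic, sensitivity and specificity can be computed from a set of risk scores and observed outcomes. It is an illustration only, not the method used in the review: the function names, the example scores, the outcomes and the cutoff of 4 are all hypothetical and do not come from the study or from any particular RAM.

```python
def c_statistic(scores, outcomes):
    """Share of (VTE, no-VTE) patient pairs in which the VTE patient
    has the higher risk score; tied scores count as half."""
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    non_events = [s for s, y in zip(scores, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


def sensitivity_specificity(scores, outcomes, cutoff):
    """Classify a patient as high risk when score >= cutoff, then
    compare that classification with the observed outcome."""
    tp = sum(1 for s, y in zip(scores, outcomes) if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, outcomes) if y == 1 and s < cutoff)
    tn = sum(1 for s, y in zip(scores, outcomes) if y == 0 and s < cutoff)
    fp = sum(1 for s, y in zip(scores, outcomes) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical data: one risk score and one outcome (1 = VTE) per patient.
example_scores = [1, 2, 2, 3, 4, 5, 6, 7]
example_outcomes = [0, 0, 0, 1, 0, 0, 1, 1]

print(c_statistic(example_scores, example_outcomes))
print(sensitivity_specificity(example_scores, example_outcomes, cutoff=4))
```

A C-statistic of 0.5 corresponds to chance-level discrimination and 1.0 to perfect discrimination. Sensitivity and specificity depend on the chosen cutoff, which is one reason estimates for the same model can differ widely between studies.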

This systematic review has a number of strengths. It was conducted with robust methodology in accordance with the PRISMA statement, and the protocol was registered with the PROSPERO register. Clinical experts were involved throughout, both as checkers and to assess the validity and applicability of research during the review. The researchers reported descriptive statistics to provide insight into the limited evidence base applicable to the subject matter, and into the scientific concerns regarding the validity of the data.

Despite these strengths, the study had limitations. For instance, decisions on study relevance, information gathering and validity were unblinded, and therefore open to potential bias. Also, the included studies assessing risk prediction were a combination of prospective cohorts and retrospective health database registries, two methodologies that carry significant limitations, the researchers further noted. They wrote: “Retrospective studies of health database registries may have large numbers but may be limited by poor data quality and failure to accurately ascertain outcomes. Prospective cohorts may have better quality data but with smaller numbers lack statistical power. The included studies demonstrated high levels of heterogeneity so we were unable to undertake any meta-analysis.”


“We identified a number of validated RAMs for potential risk stratification of hospitalized inpatients. The available evidence is insufficient to recommend one over another,” the researchers concluded.