Three machine learning models demonstrated similar performance in identifying patients at high risk for gastrointestinal bleeding (GIB) who are prescribed antithrombotic agents, and 2 models were modestly better than the commonly used HAS-BLED score, according to a study in JAMA Network Open.1
Investigators compared the performance of 3 machine learning approaches — regularized Cox regression (RegCox), random survival forests (RSF), and extreme gradient boosting (XGBoost) — with the hypertension, abnormal kidney and liver function, stroke, bleeding, labile international normalized ratio, older age, and drug or alcohol use (HAS-BLED) risk score. They used medical and pharmacy claims data from the OptumLabs Data Warehouse to identify patients aged 18 years or older who were prescribed antithrombotic drugs from January 1, 2016, to December 31, 2019.
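For readers unfamiliar with the comparator, HAS-BLED is a simple additive score. The sketch below assumes the commonly cited scoring: 1 point per risk factor, kidney and liver function counted separately, drug use and alcohol use counted separately, and older age taken as over 65 years, for a maximum of 9. These scoring details are not stated in this article. The `Patient` class and `has_bled_score` function are hypothetical names for illustration, not code from the study.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    """Hypothetical record holding the HAS-BLED inputs for one patient."""
    hypertension: bool
    abnormal_kidney_function: bool
    abnormal_liver_function: bool
    prior_stroke: bool
    prior_bleeding: bool
    labile_inr: bool
    age_years: int
    drug_use: bool
    alcohol_use: bool


def has_bled_score(p: Patient) -> int:
    """Return the additive HAS-BLED score (0 to 9).

    Assumption: 1 point per factor and an age cutoff of > 65 years,
    as in the commonly cited form of the score.
    """
    factors = [
        p.hypertension,
        p.abnormal_kidney_function,
        p.abnormal_liver_function,
        p.prior_stroke,
        p.prior_bleeding,
        p.labile_inr,
        p.age_years > 65,
        p.drug_use,
        p.alcohol_use,
    ]
    # Each True counts as 1 point.
    return sum(factors)
```

Because the score is a sum of a few binary factors, it takes only a handful of distinct values, which limits how finely it can rank patients compared with a model that outputs a continuous predicted probability.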
The overall study population of 306,463 patients was divided into a development cohort (105,837 patients) and a validation cohort (200,626 patients) based on whether the index prescription date fell before (development) or after (validation) January 1, 2019. The researchers computed the sensitivity and specificity for all possible classification thresholds and constructed receiver operating characteristic (ROC) curves using the predicted probability of GIB at 6 and 12 months from each model.
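The threshold sweep described above can be sketched in a few lines. This is a minimal illustration, not the investigators' code: it assumes each patient has a known binary outcome at the chosen horizon (it ignores censoring, which the survival models in the study must handle), and the names `roc_points` and `auc_from_points` are hypothetical.

```python
from typing import List, Sequence, Tuple


def roc_points(
    y_true: Sequence[int], y_score: Sequence[float]
) -> List[Tuple[float, float, float]]:
    """Return (threshold, sensitivity, specificity) for every distinct score.

    A patient is classified as high risk when the score is >= threshold.
    """
    n_pos = sum(1 for y in y_true if y)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one event and one non-event")

    # Sort patients from highest to lowest predicted risk.
    ranked = sorted(zip(y_score, y_true), key=lambda pair: pair[0], reverse=True)

    # Start with a threshold above every score: nobody is flagged.
    points = [(float("inf"), 0.0, 1.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Process tied scores together so they share one threshold.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((threshold, tp / n_pos, 1.0 - fp / n_neg))
    return points


def auc_from_points(points: List[Tuple[float, float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule.

    The x-axis is the false-positive rate (1 - specificity) and the
    y-axis is sensitivity.
    """
    area = 0.0
    for (_, sens0, spec0), (_, sens1, spec1) in zip(points, points[1:]):
        width = (1.0 - spec1) - (1.0 - spec0)
        area += width * (sens0 + sens1) / 2.0
    return area
```

An AUC of 0.5 corresponds to ranking patients no better than chance and 1.0 to perfect ranking, which is the scale on which the AUC values reported in this article should be read.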
In the overall cohort, 12,322 (4.0%) patients had a GIB event during a median follow-up of 133 days (interquartile range, 49-320 days). Participants’ mean age was 69.0 (SD, 12.6) years, 166,177 (54.2%) were men, and 193,648 (63.2%) were White.
The HAS-BLED score had an area under the ROC curve (AUC) of 0.61 (95% CI, 0.59-0.62) for predicting 6-month GIB risk, and an AUC of 0.60 (95% CI, 0.59-0.61) for predicting 12-month GIB risk. The 3 machine learning models had similar AUCs. RegCox had an AUC of 0.68 (95% CI, 0.66-0.70) and 0.67 (95% CI, 0.65-0.69) for predicting GIB at 6 and 12 months, respectively.
In the validation cohort, the RegCox and XGBoost models also demonstrated good performance, according to the investigators (RegCox: 6-month AUC, 0.67; 12-month AUC, 0.66; XGBoost: 6-month AUC, 0.67; 12-month AUC, 0.66), compared with the HAS-BLED score (6-month AUC, 0.60; 12-month AUC, 0.59). Findings were similar for accuracy, sensitivity, specificity, and positive predictive value.
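Unlike the AUC, these 4 measures depend on choosing a single cutoff for the predicted risk. The sketch below shows how each follows from the counts of true and false positives and negatives at one threshold. The function name `classification_metrics` and the cutoff passed to it are hypothetical, and the article does not report which threshold the investigators used.

```python
from typing import Dict, Sequence


def classification_metrics(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float
) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity, and PPV at one risk cutoff.

    A patient is flagged as high risk when the score is >= threshold.
    """
    tp = fp = tn = fn = 0
    for outcome, score in zip(y_true, y_score):
        flagged = score >= threshold
        if flagged and outcome:
            tp += 1
        elif flagged and not outcome:
            fp += 1
        elif not flagged and outcome:
            fn += 1
        else:
            tn += 1

    def ratio(numerator: int, denominator: int) -> float:
        # Return NaN rather than raising when a denominator is empty.
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),  # events correctly flagged
        "specificity": ratio(tn, tn + fp),  # non-events correctly cleared
        "ppv": ratio(tp, tp + fp),          # flagged patients who had an event
    }
```

Positive predictive value also depends on how common the outcome is, so for a relatively rare event it can be low even when sensitivity and specificity look reasonable.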
The variables with the highest importance scores in the RegCox model were previous gastrointestinal bleeds (0.72), the combination of atrial fibrillation, ischemic heart disease, and venous thromboembolism (0.38), and use of gastroprotective agents (0.32).
The study authors noted as a limitation that the OptumLabs Data Warehouse does not include patients who are uninsured or are insured by Medicare.
“A prospective evaluation of the best-performing model may improve understanding of the clinical impact of using machine learning to predict the risk of GIB in patients who use antithrombotic drugs,” the researchers commented.1
In an accompanying editorial, Fei Wang, PhD, wrote: “The study by Herrin et al demonstrated on a large real-world patient claims data set that [machine learning] (ML) models can perform better than clinically used risk predictor tools on GIB, which implies the great potential of ML on predicting rare clinical outcomes.”2
Disclosures: One of the study authors declared an affiliation with an artificial intelligence company and a medical device company. Dr Wang declared an affiliation with a technology company, pharmaceutical company, and chemical industry company. Please see the original reference for a full list of disclosures.
References
1. Herrin J, Abraham NS, Yao X, et al. Comparative effectiveness of machine learning approaches for predicting gastrointestinal bleeds in patients receiving antithrombotic treatment. JAMA Netw Open. 2021;4(5):e2110703. doi: 10.1001/jamanetworkopen.2021.10703
2. Wang F. Machine learning for predicting rare clinical outcomes—finding needles in a haystack. JAMA Netw Open. 2021;4(5):e2110738. doi: 10.1001/jamanetworkopen.2021.10738